Bugs item #990577, was opened at 2004-07-14 02:14
Message generated for change (Settings changed) made by broeker
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=104664&aid=990577&group_id=4664
Category: inverted index handling
Group: None
>Status: Closed
>Resolution: Out of Date
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Hans-Bernhard Broeker (broeker)
Summary: Segfault in copyinverted
Initial Comment:
Submitter: stuartf@...
Building a cscope database from a lightly patched 2.4.21-198 kernel
(SLES8, SP3, updates) and nfs-utils-1.0.6 sources, cscope 15.5
consistently segfaulted in copyinverted. I think the stack trace is
misleading (see values for cp and blockp given below) as the last
output from ltrace was
_IO_putc('\n', 0x0829fa38) = [EOF]
which might mean the failure was in build.c:647
dbputc('\n'); /* copy the newline */
Having said that, neither *(cp +1) nor dbputc('\n') look like they should
generate a segmentation fault.
Invoked as:
cscope -bq -k -fMxS -ifiles.16780
> wc files.16780
16277 16277 878945 files.16780
> rpm -q glibc
glibc-2.2.5-213
Cscope built from source rpm on this machine, using ./configure with
no explicit parameters. 1GB RAM, 2GB swap, ulimit unlimited.
(gdb) bt
#0 0x08049709 in unlink ()
#1 0x08059a83 in myexit (sig=1) at main.c:835
#2 <signal handler called>
#3 0x0804edd9 in copyinverted () at build.c:650
#4 0x0804e3c5 in build () at build.c:398
#5 0x080587cd in main (argc=0, argv=0xbffff7b4) at main.c:536
#6 0x4007b4c2 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) frame 4
#4 0x0804e3c5 in build () at build.c:398
398 copyinverted();
(gdb) print file
$1 = 0x8111420 "/usr/src/linux-2.4.21-198/arch/x86_64/kernel/irq.c"
This is file 2906 out of 16277 in the file list.
(gdb) frame 3
#3 0x0804edd9 in copyinverted () at build.c:650
650 if (*(cp + 1) == '\0') {
(gdb) print cp
$2 = 0x80a7042 "bufn,\n\n1084 \035\023\ncoun,
\035\023*\nt\n)\n\n1086 \035\005\nhexnum\n [\nHEX_DIGITS\n];
\n\n1087 \035\023\nvue\n;\n\n1088 \022\ni\n;\n\n1090 i\ncoun)
\n\n1091 \025 -\nEINVAL\n;\n\n1092 incoun > \nHEX_DIGITS\n)
\n\n1093 \ncoun = \nHEX_DIGITS\n;\n\n1094 i..
(gdb) print c
$3 = 0 '\0'
(gdb) print blockp
$4 = 0x80a7038 "\n (cڡ \005*\nbufn,\n\n1084 \035\023\ncoun,
\035\023*\nt\n)\n\n1086 \035\005\nhexnum\n [\nHEX_DIGITS\n];
\n\n1087 \035\023\nvue\n;\n\n1088 \022\ni\n;\n\n1090 i\ncoun)
\n\n1091 \025 -\nEINVAL\n;\n\n1092 incoun > \nHEX_DIGITS\n)
\n\n1093 \ncoun = \nHEX_DIGITS\n"...
dbputc('\n'); /* copy the newline */
/* get the next character */
650: if (*(cp + 1) == '\0') {
cp = readblock();
}
#define dbputc(c) (++dboffset, (void) putc(c, newrefs))
> ls -l nMxS
-rw-r--r-- 1 stuartf users 20545536 2004-07-13 14:55 nMxS
(gdb) print dboffset
$1 = 20548235
2699 bytes of buffered output lost...but we have the buffer
(gdb) print newrefs
$3 = (struct _IO_FILE *) 0x829fa38
(gdb) print newrefs->_IO_write_ptr[-1]
$6 = 10 '\n'
(gdb) print *newrefs
$4 = {_flags = -72536956,
_IO_read_ptr = 0x40015000 "\n->\nlock\n);\n\n893 \017\nd\226ay\n
= \njiff\233s\n + \nHZ\n/10; \n\t`time_a24\n(delay, jiffies); )\n\n894
\n\t`synchrze_\234q\n();\n\n899 \nv\n = 0;\n\n900 \017\ni\n = 0; i <
\nNR_IRQS\n; i++) {\n\n901 \n\234q_desc_t\n *\ndesc\n =
\n\234q_desc\n + \ni\n"...,
_IO_read_end = 0x40015000 "\n->\nlock\n);\n\n893
\017\nd\226ay\n = \njiff\233s\n + \nHZ\n/10; \n\t`time_a24\n(delay,
jiffies); )\n\n894 \n\t`synchrze_\234q\n();\n\n899 \nv\n = 0;\n\n900
\017\ni\n = 0; i < \nNR_IRQS\n; i++) {\n\n901 \n\234q_desc_t\n
*\ndesc\n = \n\234q_desc\n + \ni\n"...,
_IO_read_base = 0x40015000 "\n->\nlock\n);\n\n893
\017\nd\226ay\n = \njiff\233s\n + \nHZ\n/10; \n\t`time_a24\n(delay,
jiffies); )\n\n894 \n\t`synchrze_\234q\n();\n\n899 \nv\n = 0;\n\n900
\017\ni\n = 0; i < \nNR_IRQS\n; i++) {\n\n901 \n\234q_desc_t\n
*\ndesc\n = \n\234q_desc\n + \ni\n"...,
_IO_write_base = 0x40015000 "\n->\nlock\n);\n\n893
\017\nd\226ay\n = \njiff\233s\n + \nHZ\n/10; \n\t`time_a24\n(delay,
jiffies); )\n\n894 \n\t`synchrze_\234q\n();\n\n899 \nv\n = 0;\n\n900
\017\ni\n = 0; i < \nNR_IRQS\n; i++) {\n\n901 \n\234q_desc_t\n
*\ndesc\n = \n\234q_desc\n + \ni\n"...,
_IO_write_ptr = 0x40015a8b ";\n\n795
\n\t`\232_lock_\234qve\n(&\ndesc\n->\nlock\n,\ns\n);\n\n796
\np\n = &\ndesc\n->\na;\n\n798 \031\n\234qa * \na =
*\np\n;\n\n799 ina) {\n\n800 \031\n\234qa **\n\n =
\np\n;\n\n801 \np\n = &\na->\nt\n;\n\n802 ina->\n"...,
_IO_write_end = 0x40016000 "\nIRQ_AUTODETECT !8|jN 8|
d*\nIRQ_WAITING !8|jN 8|d*\ndesc !8|k< 8|d*\nhandler !8|k< 8|
d*\nstartup !8|k<` 8|d*\nirq !8|k< 8|d*\nspin_unlock_irqrestore !8|kd`
8|d*\ndesc !8|kd 8|d*\nlock !8|kd 8|d*\nflags "...,
_IO_buf_base = 0x40015000 "\n->\nlock\n);\n\n893 \017\nd\226ay\n
= \njiff\233s\n + \nHZ\n/10; \n\t`time_a24\n(delay, jiffies); )\n\n894
\n\t`synchrze_\234q\n();\n\n899 \nv\n = 0;\n\n900 \017\ni\n = 0; i <
\nNR_IRQS\n; i++) {\n\n901 \n\234q_desc_t\n *\ndesc\n =
\n\234q_desc\n + \ni\n"...,
_IO_buf_end = 0x40016000 "\nIRQ_AUTODETECT !8|jN 8|
d*\nIRQ_WAITING !8|jN 8|d*\ndesc !8|k< 8|d*\nhandler !8|k< 8|
d*\nstartup !8|k<` 8|d*\nirq !8|k< 8|d*\nspin_unlock_irqrestore !8|kd`
8|d*\ndesc !8|kd 8|d*\nlock !8|kd 8|d*\nflags "..., _IO_save_base =
0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0,
_markers = 0x0, _chain = 0x80ac988, _fileno = 5, _blksize =
136100160,
_old_offset = 136100320, _cur_column = 0, _vtable_offset = 0 '\0',
_shortbuf = "\b", _lock = 0x829fad0, _offset = -1, __pad1 =
0x81cbd00,
__pad2 = 0x829fae8, _mode = -1,
_unused2 = "034\b\200\034\b
034\b\034\bP034\b34\b\200034\b\020034\b
034\b0034\b\034\bp034\b\020034\b"}
----------------------------------------------------------------------
Comment By: Hans-Bernhard Broeker (broeker)
Date: 2004-07-25 17:35
Message:
Logged In: YES
user_id=27517
Identical invocations behaving differently points to a read of
uninitialized memory. checker-gcc would help with that, but I don't think
it (or anything like it) is available for current installations.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2004-07-17 04:44
Message:
Logged In: NO
Not going to be much help, I can tell...
> 1) reproduce it with a smaller test case
Smallest test case I found still had 14,000 files to process.
I may have to retract slightly the original report of "consistent".
When running under gdb, several identical invocations can
have mixed success (some sigsegv, while others don't). Very
frustrating.
> 2) Tell gdb to break this *before* the signal handler
Unfortunately, not much help:
Program received signal SIGSEGV, Segmentation fault.
0x00007028 in ?? ()
(gdb) bt
#0 0x00007028 in ?? ()
Cannot access memory at address 0x3042302
> 3) See if it still happens if you leave out the -f option
In at least one case, this caused the problem to go away. However,
I found the problem to be very sensitive to the file list, phase of
moon, whatever, so I need to create some different cases that
trigger the problem and try with/without -f.
See frustrated response to #1.
> 4) See if it still happens if you use the -u flag in addition
In the same one case, this caused the problem to go away.
Same caution applies.
----------------------------------------------------------------------
Comment By: Hans-Bernhard Broeker (broeker)
Date: 2004-07-14 03:32
Message:
Logged In: YES
user_id=27517
Hm... this is a pretty weird one. By its sheer size alone,
it's almost guaranteed to be impossible for me to reproduce
here. Could you try to
1) reproduce it with a smaller test case
2) Tell gdb to break this *before* the signal handler is
called (i.e. have gdb catch the SIGSEGV), and see if the
stack trace makes more sense then?
3) See if it still happens if you leave out the -f option
4) See if it still happens if you use the -u flag in addition?
----------------------------------------------------------------------