#88 segfault @ find.c:195

closed-invalid
None
5
2004-01-30
2002-06-11
john stultz
No

Trying to search on a bitkeeper linux tree, I get the following segfault:

Program received signal SIGSEGV, Segmentation fault.
0x08054d65 in findsymbol (pattern=0x27004965 <Address 0x27004965 out of bounds>) at find.c:195
195         else if (strequal(pattern, s)) {

Very repeatable, happens every time on this tree. I've deleted and
rebuilt the cscope.out file and the problem still occurs.

Cscope version: cscope-15.3-1
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)
built w/o optimization

Discussion

  • Jason Duell

    Jason Duell - 2002-06-16

    Logged In: YES
    user_id=125727

    Strange--I've used cscope to search the Linux kernel for a
    long time w/o problems. Can you tell me which symbol you
    searched for (and what kind of search)? Also, if you have
    the time to test your symptom on a regular released version
    of the Linux kernel, that would also make it easier for me
    to test the problem on my machine with the same data. If
    you don't find the problem with a regular version of the
    kernel, I guess I'll have to finally learn Bitkeeper, and
    hopefully you can tell me how to sync up to the version of
    the code where you're seeing the problem.

    Cheers,

    Jason Duell

     
  • Matthew Emmerton

    Logged In: YES
    user_id=61514

    The problem is the definition of the strequal() macro. It
    doesn't work properly if either of its arguments is NULL.

    #define strequal(s1, s2) (*(s1) == *(s2) && strcmp(s1, s2) == 0)

    Obviously, if s1 or s2 is NULL, then *s1 or *s2 will segfault.

    The question is -- is it pattern or s that is NULL?
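
To make the NULL concern concrete, here is a minimal sketch of a NULL-tolerant comparison next to the macro as quoted. The function name strequal_safe and the self-check are hypothetical and do not come from the cscope sources. A NULL check of this kind cannot catch a non-NULL garbage pointer such as the 0x27004965 in the backtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The macro as quoted in the comment above. */
#define strequal(s1, s2) (*(s1) == *(s2) && strcmp(s1, s2) == 0)

/* Hypothetical NULL-tolerant variant; not taken from the cscope sources. */
static int strequal_safe(const char *s1, const char *s2)
{
    if (s1 == NULL || s2 == NULL)
        return s1 == s2;            /* equal only when both are NULL */
    return *s1 == *s2 && strcmp(s1, s2) == 0;
}

/* Small self-check showing the intended behaviour. */
static void strequal_safe_selfcheck(void)
{
    assert(strequal("main", "main"));        /* macro is fine for valid strings */
    assert(strequal_safe("main", "main"));
    assert(!strequal_safe("main", "mail"));
    assert(!strequal_safe(NULL, "main"));    /* no dereference of NULL */
}
```

The first-character comparison is kept only to mirror the macro; strcmp() alone would give the same result.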

     
  • Hans-Bernhard Broeker

    Logged In: YES
    user_id=27517

    Note that the "pattern" pointer is scrambled, as far as gdb can tell ---
    that's why it complains "Address 0x27004965 out of bounds".

    This means the real problem is not that strequal() doesn't check for
    NULL, here, but rather that pattern has been overwritten with garbage,
    for some reason. Without being able to reproduce the problem (and no
    feedback from the original bug reporter), it's quite impossible to hunt
    this one down.

     
  • Hans-Bernhard Broeker

    • status: open --> closed
     
  • Hans-Bernhard Broeker

    Logged In: YES
    user_id=27517

    Impossible to go on with this one without feedback from OP
    --- closing this.

     
  • Hans-Bernhard Broeker

    • assigned_to: nobody --> broeker
    • status: closed --> closed-invalid
     
  • Darryl

    Darryl - 2004-01-30

    Logged In: YES
    user_id=27401

    However, note that 0x27004965 looks suspiciously like a
    write-past-end-of-local-variable-array bug, as "0x004965"
    can be the string "eI" (for example, the end of the string
    "WriteI"). Are there any fixed-length char[] arrays nearby
    in the code?
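
The byte-level reading of 0x27004965 can be checked with a short sketch. It assumes a little-endian machine such as x86, where the least significant byte is stored at the lowest address; the helper names le_bytes and print_le_bytes are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Split a 32-bit value into the order its bytes have in
 * little-endian (x86) memory: lowest address first. */
static void le_bytes(uint32_t value, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)((value >> (8 * i)) & 0xffu);
}

/* Print each byte, with its ASCII character when printable. */
static void print_le_bytes(uint32_t value)
{
    unsigned char b[4];

    le_bytes(value, b);
    for (int i = 0; i < 4; i++) {
        if (b[i] >= 0x20 && b[i] < 0x7f)
            printf("byte %d: 0x%02x '%c'\n", i, b[i], b[i]);
        else
            printf("byte %d: 0x%02x\n", i, b[i]);
    }
}
```

For 0x27004965 the first three bytes in memory are 0x65 ('e'), 0x49 ('I') and 0x00, which is how the tail of a NUL-terminated string ending in "eI" would look.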

     
  • Hans-Bernhard Broeker

    Logged In: YES
    user_id=27517

    The 'pattern' pointer that was stomped upon is on the stack.
    It's the address of the global char pattern[PATLEN+1] in
    command.c, actually. I.e. it's the call stack that was messed
    up, not some pointer variable stored in a recognizable place.
    It could be almost anything.

    In a case like this, it really is impossible to go on without
    getting further details about it --- either the person reporting
    this would have to give us a reproducible example case, or
    he'll have to step through the relevant code in a debugger
    and show us what exactly happened.

     
  • Darryl

    Darryl - 2004-02-01

    Logged In: YES
    user_id=27401

    Perhaps I'm misunderstanding you, or not writing very
    clearly, but ...

    Yes, the stomped pointer is on the stack, but it was stomped
    with data that appears to be ASCII text ("eI", followed by
    \0). This often happens if code writes past the end of a
    local variable (a "char []").
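
The mechanism can be simulated safely without actually corrupting a stack. The sketch below models a frame as a byte array holding a small buffer followed by a 4-byte pointer slot. FILE_LEN, the layout, the sample path and the starting pointer value are all hypothetical, chosen only so the arithmetic is easy to follow; the real frame layout of findsymbol() is not known from this report.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FILE_LEN 8   /* hypothetical, tiny stand-in for the real buffer size */
#define SLOT_LEN 4   /* size of a 32-bit pointer slot */

/* Copy `path` into a simulated frame without checking it against
 * FILE_LEN, then return what the adjacent pointer slot now holds. */
static uint32_t simulate_overrun(const char *path, uint32_t pointer_value)
{
    unsigned char frame[FILE_LEN + SLOT_LEN];
    size_t n = strlen(path) + 1;          /* include the terminating NUL */
    uint32_t result = 0;

    memset(frame, 0, sizeof frame);

    /* Store the pointer little-endian, as x86 would. */
    for (int i = 0; i < SLOT_LEN; i++)
        frame[FILE_LEN + i] = (unsigned char)(pointer_value >> (8 * i));

    /* The "unchecked" copy: bounded by the whole simulated frame so the
     * demo itself stays in bounds, but not by FILE_LEN. */
    if (n > sizeof frame)
        n = sizeof frame;
    memcpy(frame, path, n);

    /* Read the pointer slot back. */
    for (int i = 0; i < SLOT_LEN; i++)
        result |= (uint32_t)frame[FILE_LEN + i] << (8 * i);
    return result;
}
```

With the made-up inputs simulate_overrun("dir/WriteI", 0x27a1b2c3), the characters 'e', 'I' and the NUL spill into the slot and only the top byte survives, giving 0x27004965. A path that fits in FILE_LEN leaves the pointer untouched.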

    [ Note that "pattern" is the passed-in parameter, "pattern",
    of findsymbol(), and is NOT the global variable, "pattern",
    in command.c. ]

    Looking at the source code, my guess is that something wrote
    past the end of the local variable, "file", in findsymbol()
    (about line 89 in find.c).

    Hmm. Oh, YUK. PATHLEN is #define'd as 250 in constants.h
    -- that's the bug: the user probably has a file pathname
    longer than 250 (quite easily, as Linux's PATH_MAX is 4096
    (on x86, at least)). Anyway, PATHLEN should be increased,
    and be set via PATH_MAX for those platforms that #define it
    (via "#include <limits.h>", usually).
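
A minimal sketch of that suggestion, assuming PATHLEN is an ordinary preprocessor constant: take the value from PATH_MAX where <limits.h> provides it, fall back to the 250 quoted above otherwise, and refuse over-long paths instead of copying them blindly. The helper copy_path is hypothetical and is not how cscope itself handles paths.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Use the platform limit when <limits.h> defines one. */
#ifdef PATH_MAX
#define PATHLEN PATH_MAX
#else
#define PATHLEN 250      /* fallback: the value quoted in this comment */
#endif

/* Hypothetical helper: copy `src` into a fixed-size buffer, or return -1
 * without writing anything if it (plus the NUL) does not fit. */
static int copy_path(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);

    if (len >= dstsize)
        return -1;               /* would overflow: reject instead */
    memcpy(dst, src, len + 1);   /* copies the terminating NUL too */
    return 0;
}
```

A caller would declare something like char file[PATHLEN + 1] and check the return value of copy_path(file, sizeof file, name). Raising the limit alone only moves the overflow further out, so the length check matters more than the constant.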

     
  • Nobody/Anonymous

    Logged In: NO

    Looks like this has been around for a while, and still
    hasn't been fixed.

    Not sure why it is closed?

    I typically use cscope-indexer to build the database. If I
    modify cscope-indexer to exclude SCCS, as it does RCS and
    CVS, then I no longer get the core.

    Perhaps, if nobody's gonna fix the core, we could at least
    update cscope-indexer to include SCCS in the set of patterns
    it excludes?

    Thanks,
    An Otherwise Very Satisfied cscope User

     
  • Neil Horman

    Neil Horman - 2006-06-30

    Logged In: YES
    user_id=827328

    It's probably the putstring overflow. There is a patch here:
    https://sourceforge.net/tracker/index.php?func=detail&aid=1511540&group_id=4664&atid=104664
    It's got a bit of noise in it, and Hans is being a bit
    obstinate about taking it, but it should apply cleanly, and
    it should fix your problem.

     
