Bugs item #1203250, was opened at 2005-05-17 03:20
Message generated for change (Settings changed) made by broeker
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=104664&aid=1203250&group_id=4664
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: inverted index handling
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Darlene Wong (darlenew)
>Assigned to: Hans-Bernhard Broeker (broeker)
Summary: Find files #including "../foo.h" in -q mode
Initial Comment:
"Find files #including this file", in -q mode, will
find inclusions of "foo.h", but not "../foo.h". Is
this a known issue?
This refers to version 15.5, running on Solaris 6.
----------------------------------------------------------------------
Comment By: Neil Horman (nhorman)
Date: 2005-08-24 14:28
Message:
Logged In: YES
user_id=827328
Well, as per my previous comments I think that if you want
to search for include files using regexes, that's what the
egrep search is for, and a change in the behavior of a
search when -q is or is not used really doesn't seem like an
intentional behavior to me. And we're the developers, we
can make a change if it makes sense today.
But I'm not sufficiently motivated to force this issue one
way or the other. Since the search can be modified with
regexes to specify exactly what you're looking for today, I
don't see that this is particularly critical to resolve. I
think a release note specifying the behavior would be fine
with me for the time being, just to avoid confusion for
future users.
----------------------------------------------------------------------
Comment By: Hans-Bernhard Broeker (broeker)
Date: 2005-08-24 14:06
Message:
Logged In: YES
user_id=27517
I still hold my position: while the behaviour of 'find files
including' is different between normal and inverted-index
mode of cscope, neither of the behaviours is obviously more
correct than the other, which makes it hard or impossible to
decide which one to change.
Both behaviours have advantages and disadvantages.
Searching with the regex /foo\.h/ finds exactly that, but
not <sys/foo.h> or <somelibrary/foo.h>, which may very well
be what some users (e.g. Darlene Wong's) want. Searching
for /.*foo\.h.*/, as non-q mode does it, will find all kinds
of possibly unwanted things, including <../foo.h>,
<sys/foo.h> and even "bullfoo.h". I can say that especially
the latter has annoyed me occasionally: why does it list
<sys/djtypes.h> when I was searching for #includes of "types"?
Who are we to declare which of these is "correct"? What we
would need here is a statement from whoever actually made
this change, sometime between version 12.9's behaviour (as
reported by Darlene Wong) and the first opened source
package, about *why* this was changed. Failing that, I vote
for keeping things as they are, and adding a note to the docs
that if you want a search for "any include name ending with
foo.h" in -q mode, you have to type ".*foo\.h" into that
input field.
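
The two behaviours compared above can be reproduced outside cscope with
plain POSIX regex calls. The sketch below is not cscope code:
include_matches() and its whole_name parameter are hypothetical names
used only for illustration. It assumes that -q mode behaves like a
pattern anchored to the whole #include name, and that non-q mode behaves
like an unanchored (substring) pattern.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper, not part of cscope.
 * Returns 1 if `name` matches `pattern`, 0 if it does not,
 * and -1 if the pattern is too long or does not compile.
 * whole_name != 0: anchor the pattern to the whole name
 *                  (assumed model of the -q behaviour).
 * whole_name == 0: let the pattern match anywhere in the name
 *                  (assumed model of the non-q behaviour). */
int include_matches(const char *pattern, const char *name, int whole_name)
{
    char buf[256];
    regex_t re;
    int rc;

    if (whole_name)
        rc = snprintf(buf, sizeof buf, "^(%s)$", pattern);
    else
        rc = snprintf(buf, sizeof buf, "%s", pattern);
    if (rc < 0 || (size_t)rc >= sizeof buf)
        return -1;

    if (regcomp(&re, buf, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Under these assumptions, the anchored pattern foo\.h matches only
"foo.h", the anchored pattern .*foo\.h also matches "../foo.h", and the
unanchored pattern types matches "sys/djtypes.h". Note that .*foo\.h
still matches "bullfoo.h" even when anchored.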
----------------------------------------------------------------------
Comment By: Neil Horman (nhorman)
Date: 2005-08-22 22:28
Message:
Logged In: YES
user_id=827328
Hans, do we have any consensus on this bug?
----------------------------------------------------------------------
Comment By: Neil Horman (nhorman)
Date: 2005-05-23 16:38
Message:
Logged In: YES
user_id=827328
"The assumption that the search is *meant* to be ignoring
directory prefixes, is, I suspect, not as true as you
believe it to be"
What makes you say that? Do you have some evidence to
suggest this?
From my research, it appears as though, if findinclude()
does not use an inverted index, it then decides in match()
if it should treat the search pattern as a regex search.
findinit() sets isregexp_valid to yes, because a search for
"foo.h" compiles to a valid regex search in regcomp from
findinit. Note that in findinit the compilation of the
regex search is performed with the REG_NOSUB flag, which I
think indicates implicit substring matches should be
returned. So it seems to me that searches for any string
that passed through findinit() were intended to be substring
matches, ostensibly to discard any leading front matter from
the string. The path findinclude follows when using an
inverted index doesn't use the remainder of the match()
infrastructure, and so ignores the use of the compiled
regex search. Given that, my patch may not be the most
correct or efficient solution (as I think I mentioned
previously), but unless you have an alternate argument, I
think performing a substring match is the right and intended
thing to do.
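
For reference when reading the REG_NOSUB remark above: in POSIX,
REG_NOSUB only tells regcomp() that match positions need not be
reported. The substring behaviour comes from regexec() itself, which
looks for a match anywhere in the string unless the pattern is anchored.
The sketch below illustrates this with two hypothetical helpers
(matches_anywhere() and match_start() are not cscope functions).

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `pattern` matches somewhere in
 * `text`, 0 if not, -1 if the pattern does not compile.
 * `extra_flags` is OR-ed into the regcomp() flags, so the caller can
 * compare the result with and without REG_NOSUB. */
int matches_anywhere(const char *pattern, const char *text, int extra_flags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | extra_flags) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}

/* Hypothetical helper: returns the offset at which the match starts,
 * or -1 if there is no match or the pattern does not compile.
 * Compiled without REG_NOSUB, because offsets are needed here. */
long match_start(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    long start = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, m, 0) == 0)
        start = (long)m[0].rm_so;
    regfree(&re);
    return start;
}
```

With or without REG_NOSUB, the pattern foo\.h matches inside
"../foo.h"; the flag only changes whether the offset of that match is
available afterwards.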
----------------------------------------------------------------------
Comment By: Hans-Bernhard Broeker (broeker)
Date: 2005-05-22 17:10
Message:
Logged In: YES
user_id=27517
The assumption that the search is *meant* to be ignoring
directory prefixes in the name given to #include is, I
suspect, not as true as you believe it to be. Anyway: if a
person wants to search for "#include of any file named
{something}foo.h", cscope's usual approach is to type in
".*foo\.h", i.e. a regexp expressing exactly that.
That's why I think the error is in findinclude(), not in the
inverted index search.
----------------------------------------------------------------------
Comment By: Neil Horman (nhorman)
Date: 2005-05-18 03:55
Message:
Logged In: YES
user_id=827328
Hans, you're right, this is a discrepancy in how findterm()
searches the database, and how findinclude() searches its
database when the inverted index isn't present.
findinclude(), when not using inverted indices, trims
the front matter by using a regex search, while findterm
does not, opting instead to perform an exact match search.
Given that you're searching for files being #included,
without regard to directory prefixes, I think the right
thing to do is what my proposed patch does, perhaps in
findterm, rather than in findinclude, but somewhere.
----------------------------------------------------------------------
Comment By: Darlene Wong (darlenew)
Date: 2005-05-18 01:42
Message:
Logged In: YES
user_id=199252
If you search on "foo.h", but it was #included as "../foo.h"
or "dir/foo.h" it will not be found. So using the -q mode
you would have to search for "../foo.h" in order to find the
file.
This problem does not occur with the 12.9 version. I did
not try it with 13, but it does also happen with 15.3.
----------------------------------------------------------------------
Comment By: Hans-Bernhard Broeker (broeker)
Date: 2005-05-18 01:14
Message:
Logged In: YES
user_id=27517
Darlene, you didn't actually say: it was not found when
running _what_ search pattern?
I'm not 100% convinced that the bug in this case is in the
-q mode. Yes, there's a difference in behaviour between the
inverted index and the normal access method. But I'm not
aware of *either* of the two different behaviours being
documented. I.e. the bug may just as well be said to be in
the documentation, or in the behaviour of non-q mode.
-q mode has to search for an exact match, or it'd lose
essentially all its applicability. Running an inverted
index search with a pattern that starts with an (implied)
".*" regex means the search has to go through *all* postings.
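
The cost argument can be illustrated with a toy model. The sketch below
is an assumption-laden stand-in, not cscope's actual index format: a
small sorted array of #include names plays the role of the inverted
index, and find_exact() and count_suffix_matches() are hypothetical
names. An exact lookup can use binary search, while a lookup with an
implied leading ".*" gets no help from the sort order and has to visit
every entry.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an inverted index: terms sorted in strcmp() order. */
static const char *const terms[] = {
    "../foo.h", "bar.h", "dir/foo.h", "foo.h", "stdio.h", "sys/types.h"
};
enum { NTERMS = sizeof terms / sizeof terms[0] };

static int cmp_term(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Exact lookup: binary search, O(log n) comparisons.
 * Returns the index of the term, or -1 if it is not present. */
int find_exact(const char *name)
{
    const char *const *hit =
        bsearch(name, terms, NTERMS, sizeof terms[0], cmp_term);
    return hit ? (int)(hit - terms) : -1;
}

/* Suffix lookup, i.e. a pattern with an implied leading ".*".
 * Every term has to be examined. Returns the number of matching
 * terms and stores the number of terms examined in *visited. */
int count_suffix_matches(const char *suffix, int *visited)
{
    size_t slen = strlen(suffix);
    int i, hits = 0;

    *visited = 0;
    for (i = 0; i < NTERMS; i++) {
        size_t tlen = strlen(terms[i]);
        (*visited)++;
        if (tlen >= slen && strcmp(terms[i] + tlen - slen, suffix) == 0)
            hits++;
    }
    return hits;
}
```

In this model an exact search for "foo.h" finds one entry directly,
whereas a suffix search for "foo.h" finds three entries but only after
examining all six.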
----------------------------------------------------------------------
Comment By: Neil Horman (nhorman)
Date: 2005-05-17 16:50
Message:
Logged In: YES
user_id=827328
I've got a straw man patch listed in request 1203632 on the
patches page which fixes this problem for me.
----------------------------------------------------------------------
Comment By: Neil Horman (nhorman)
Date: 2005-05-17 12:47
Message:
Logged In: YES
user_id=827328
I hadn't heard of this previously, but I just confirmed it.
I'll see if I can fix it.
----------------------------------------------------------------------