Hello,
I'm seeing the "Internal error" which is also mentioned in bug #274. Here is how I'm getting the error:
Note that the error only happens for header.h -- the lookup works successfully for all of the other headers in my project.
I've found a couple ways to work around this:
With either of these methods, I don't see the error for header.h (or any other header file). The first method seems like the easier one, and the slowdown without the inverse DB isn't too painful.
I'm wondering if perhaps the inverse DB is deprecated or not often used?
I've tried running both sides (DB creation and DB lookup) through Purify, and didn't see much. When creating the DB, Purify reports an uninitialized memory read (UMR) in invmake (when calling fflush() after "write out junk to fill log blk"), so for fun I tried commenting out that code. Purify then no longer complains, but I still get the Internal error when doing a lookup for header.h. So this UMR is probably just a red herring.
Any other thoughts would be welcome. I notice in bug #274 there's a reference to a "patch mentioned in bug report #3528987", but I'm not sure where to find that patch.
Thanks!
-Jason
That part alone already makes your report practically impossible to work on: I don't have those 2000 files, and since it appears I need more than half of them, there's just no sensible way I can investigate it.
Does a search for all headers at once (i.e. -L8 '.*') work, too?
No to both.
Patch numbers were lost in SourceForge's site revamp. The whole bug tracker is new.
Last edit: Hans-Bernhard Broeker 2013-06-26
(I'm too dumb to properly format the above. That should add to the fun when determining whether my analysis is correct.)
Unfortunately, it appears this recipe is still crucially incomplete for reproducing the problem. I tweaked some source code to fulfill conditions 0) through 2), but still couldn't get it to trigger the error message.
0) I noticed this on an (inverted) cscope DB for a recent Linux kernel (i.e., the three files involved weigh over one GB). So I really hope I can describe this in enough detail for you to reproduce it.
1) What seems to be involved in my case is the following:
- the line number referenced for the first[*] entry for a term in the postings file starts at an 8192-byte block boundary, as I reported before;
- the function referenced in that posting entry is zero, so it's a global term (otherwise the previous block, which would contain that function, would still be available in memory?);
- the "mark" for that posting is not a '$' or a '~', so the posting doesn't reference a function definition or an include. (The problematic marks were ' ', '#', 'e', 'g', 'm', 's', or 't'.)
If the above is true, the error triggers with
cscope -d -L -0$term
2) Does this make sense? Can you reproduce now?
0) I've been trying to pinpoint things further. Things seem to boil down to this:
1) The non-failing terms (which also have their first line start at a block boundary in my tests!) pass through these function calls:
find_symbol_or_assignment()
putpostingref()
fetch_string_from_dbase()  /* because either p->fcnoffset == 0 && p->type == FCNDEF
                              or p->fcnoffset != lastfcnoffset */
putref()
putsource()
The call to fetch_string_from_dbase() contains setmark('\n'), which sets blockmark to '\n'. So when putsource() is entered, blockmark will be '\n'. And that will, by what looks like pure chance, make the test for two newlines preceding the line number work. (This test should check the last two characters of the preceding block, but it actually also looks at the blockmark. Why is that?)
The failing terms (remember that their first line starts at a block boundary) lack the calls to fetch_string_from_dbase(), because they're not a FCNDEF (the mark character isn't '$') and they're global (fcnoffset == 0). So when putsource() is entered, blockmark is still NUL.
2) Tested this under gdb with a watch on blockmark and a breakpoint on putsource(). Setting blockmark to '\n' when one hits the putsource() breakpoint makes the error disappear. QED.
3) Agree?
Yes, I understand. If I could reproduce this with a smaller set of files I would, but it seems to be something that's triggered by a large number of input files.
Just tried this, it works -- it finds 5 instances of header.h, and no errors.
I wouldn't think it's a problem parsing my source files, since that step succeeds (using the -q flag) when the file list is broken up into two smaller lists. And since the all-headers query above was successful, this seems to imply that the DB (and the .in and .po files) are correct, is that right? So perhaps it's a problem with the query, but why would only header.h cause it?
I'm debugging the query right now, but haven't got very far. The stack is:
=>[1] putsource()
[2] putref()
[3] putpostingref()
[4] findinclude()
[5] search()
[6] main()
In putsource(), it goes into the "read the previous block" code when running for header.h. At this point, I have blocknumber = 4441. Going up to putpostingref(), I see:
*p = {
    lineoffset = 4547584
    fcnoffset  = 0
    fileindex  = 912
    type       = 126
}
So it looks like the line offset falls exactly on the boundary of block 4441 (4441 * 1024 = 4547584). Could this be related?
Thanks for the quick response,
-Jason
I think I have tracked down the problem, based on Paul's analysis above. The base issue is that when doing searches in the inverted database, the "blockmark" global variable was not being set via the setmark macro before the search. This leads read_block to read in a block and put two '\0' bytes at the end instead of only one (normally the block ends in the value of blockmark followed by a '\0'). That in turn makes the getrefchar macro skip the last character of the block, since its lookahead *(++blockp + 1) == '\0' already succeeds at the next-to-last character in the buffer.
The good news is that the fix is easy: just add a setmark('\n'); in find_symbol_or_assignment() before doing the search in the inverted database.
I created a C file that can be used to show the problem, but there is one catch. Since the cscope.out file stores the path of the directory where it was built, you have to edit the source file until the bug line starts at byte offset 8192. I did this by trial and error. I'll attach the file, and describe the process to re-create the error:
cscope -bcqk cstest.c
In cscope.out, find where:
118 int
gbug_line
;
starts. I did this by removing the "118 int" line and everything after it, and running wc on the result. If it was off by a few bytes, I edited the source until wc read 8192. After that, I created the database again, and did a
cscope -d -L0bug_line
and got the failure.
Oops, forgot the cstest.c file