While attempting to index a source tree, I'm getting the following error:
Can't use an undefined value as a symbol reference at C:/Perl/site/lib/File/MMagic.pm line 587.
This always happens on the same file. However, when I change the sourceroot to start further down in the tree, it happily goes right past this file. So I think the issue is not with the file itself, but perhaps with free resources. I'm running this on a Windows Server 2003 box with MySQL 5.5.5 in support. Line 587 in MMagic is a binmode call on a file handle. It appears as if, for some reason, this file handle is undefined at this point. Any suggestions on what could be causing this type of behavior?
Can you be more specific? In my instance of MMagic.pm, line 587 is not a "binmode($fh)". The nearest is at line 576 (in sub checktype_filehandle) with the comment "for MS Win32 architecture". Line 342 (after all licence statements) says $VERSION="1.27". Is it the same as yours?
When does this error happen? During the genxref step, I guess. Please describe your source tree architecture: simple flat directory or tree of directories (how deep?), which change makes the error disappear, type of offending file, …
I suppose you're using swish-e as a search engine since MMagic is used only with swish-e.
The offending filehandle is computed immediately before testing with checktype_filehandle($fh). Supposing your source tree is made up of plain files (not a CVS repository), I had a look at getfilehandle in Plain.pm. You may get an undefined filehandle if you have 'ignoredirs' directories in lxr.conf. Normally, uses of the filehandle are protected by tests, but in this case I am not sure that pre-testing the filesize on an undef handle works. I'll check later.
Give me more information. Regards,
I checked how Perl evaluates "undef > 0". It returns false (as expected), so the call to "checktype_filehandle()" is properly protected. I've got to look somewhere else.
I forgot to ask for the LXR version.
Thanks for the quick reply. The version of MMagic.pm I have is 1.29 and yes, you are correct, I'm encountering this while running genxref.
I had actually typed up a long response to this before I realized something that I had overlooked before, which is more than likely causing this. I am executing genxref against a dynamic ClearCase view. This particular sub-directory has multiple elements, two of which have the same name but differ in case: one is a directory and the other is a binary file. For instance, here is something similar:
In the example above, I have a sub-directory named 'c' and a binary file named 'C'. While *nix-based systems shouldn't have an issue with this, I expect that Windows does. Interestingly enough, though, when looking at this in a DOS window, Windows does identify the elements correctly, one as a file, the other as a directory. Another interesting note is that running genxref in a Cygwin terminal (where you'd expect this NOT to be an issue) produces the same error.
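To find out whether other directories in the tree have the same kind of clash, a small script can list names that differ only by case. This is a minimal sketch in Python, not part of LXR; the function names `case_collisions` and `find_case_collisions` are made up for illustration, and it assumes the file system (or view) actually reports both names when the directory is listed.

```python
import os
from collections import defaultdict

def case_collisions(names):
    """Group the names that are identical once case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups where two or more distinct spellings collide.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)

def find_case_collisions(root):
    """Walk a tree and return {directory: [colliding name groups]}."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        clashes = case_collisions(dirnames + filenames)
        if clashes:
            result[dirpath] = clashes
    return result

if __name__ == "__main__":
    # "." is a placeholder; point it at the sourceroot being indexed.
    for directory, clashes in find_case_collisions(".").items():
        print(directory, clashes)
```

Directories and files are checked together because the clash described here is between a directory and a file.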
But the thing that really confuses me is why this error doesn't always show up. If I run genxref from lower in the tree (but still above this offending area), it runs through all of this directory's contents fine. So perhaps it doesn't have to do with the issue I noted above. Hmmm.
I may try to create a snapshot view today and test against that…just in case this is some sort of ClearCase weirdness. Other than that, I'm stumped.
To answer some of your other questions: the source tree is a tree of directories, with levels deeper than the offending one. Other than adding the offending sub-directory to the ignore list, or starting genxref from lower in the tree, I cannot make this issue disappear. I did track the undef file handle back to the "new FileHandle…" call in Plain.pm (line 72, I believe). This is returning undef. When printing $!, I get the error "Permission Denied." That doesn't tell me much since, again, I can run it from lower in the tree and this area parses correctly.
I think I'll add the directory to the ignore list and see if it completes the entire source tree. Then I'll try the snapshot view…unless of course you have replied with something else before then.
Thanks again for the help!
LXR version is 0.11.1 against MySQL 5.5.5.
Hmmm…now I'm getting indeterminate (from my perspective) behavior. I added the sub-directory in question to the ignore list and let the parsing proceed. It got MUCH further this time, but got into an infinite loop printing out "fragment without terminator." This comes from SimpleParse.pm, but I'm not sure why it would get into an infinite loop printing it out. So I stopped it and removed the offending directory (the initial problem) from the ignore list. I then re-ran genxref and this time it did NOT have a problem with the sub-directory: it parsed it successfully and continued on. Unfortunately, though, it also eventually ended (in the exact same place) with the infinite loop.
So to recap:
1) Added offending sub-dir to ignore list
2) Ran genxref
3) Eventually received infinite loop
4) Removed offending sub-dir from ignore list
5) Ran genxref
6) Had no problem parsing that directory this time, but failed with same infinite loop
I think I'm going to test this with a snapshot view, as it's something different that I haven't tried yet.
The genxref line I'm running, in case this helps, is:
perl genxref -url=http://localhost/lxr -reindexall -allversions
Note that there is only 1 version though.
OK, so on the "fragment without terminator" issue, I think it's my Fortran settings in filetype.conf and generic.conf. I just uncommented what was there that pertained to Fortran…perhaps that was not a good thing to do. When I put those back, with the Fortran parts commented out, it worked (although with no links). I'll look for examples of valid Fortran settings.
OK, so I got my Fortran settings figured out and it fixed the "fragment without terminator" issue. I had to use (part of) the patch posted here: https://sourceforge.net/tracker/?func=detail&aid=1832710&group_id=27350&atid=390119. Just uncommenting the Fortran settings produced the fragment error.
Can't try the snapshot view today (to fix the first problem)…will have to wait till tomorrow. But if you have anything that you want me to try, please let me know.
SimpleParse.pm is a nightmare. It tries to parse languages with pattern-matching instead of a good ol' deterministic finite state automaton. I changed it a bit when I took over LXR to remove some (parsing) insecurities, but it is not bullet-proof. I even recently discovered more bugs in it and patched them in preparation for the 1.0 release, but I'm seriously considering rewriting this part as a DFSA (for correctness) in C (for speed: think of the time needed to collect references in the kernel!).
The Fortran description is really skeletal (you'll notice there is no 'spec' => description), so I'm not surprised at an error (such as your endless loop). It is on my to-do list, but the Fortran case is difficult since there are no reserved keywords. Symbols like GOTO, CALL, RETURN, … are considered statement names only by context. You can use them as variable names, though this is bad practice. The worst statement is DO: you need look-ahead to decide whether it is a variable assignment or a loop statement (the decision criterion is a comma outside parentheses). I'm not sure the language is parseable with (stateless) regexps.
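To make the DO look-ahead concrete, here is a minimal sketch in Python, not taken from LXR. It assumes fixed-form source, where blanks are not significant, and it only handles the counted DO form; character literals and continuation lines are ignored. The function name `is_do_loop` is made up for illustration.

```python
def is_do_loop(statement):
    """Guess whether a statement that starts with DO is a counted loop.

    Criterion from the discussion: a comma outside parentheses
    (here, on the right-hand side of the first '=') means a loop.
    """
    # Fixed-form assumption: blanks carry no meaning, so drop them.
    text = statement.replace(" ", "").upper()
    if not text.startswith("DO") or "=" not in text:
        return False
    depth = 0
    for ch in text[text.index("=") + 1:]:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True   # comma outside parentheses: loop bounds
    return False          # no such comma: plain assignment

print(is_do_loop("DO 10 I = 1, 10"))   # loop statement
print(is_do_loop("DO 10 I = 1.10"))    # assignment to variable DO10I
```

The two calls differ by a single character, yet the first is a loop and the second an assignment, and the decision can only be made after scanning past the '='. That is why a purely pattern-based fragment splitter has trouble here.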
An interim strategy would be to consider the full file as a single "code" fragment and let sub processcode enhance the declared variables and functions, since ctags would have gathered the declarations. Of course, the hyperlinks would reference only these declarations. I don't think SimpleParse.pm and Generic.pm are well suited to Fortran.
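The interim idea can be illustrated roughly as follows. This is only a sketch in Python, not LXR's processcode: the whole text is treated as one code fragment, and every identifier found in a set of declared names (standing in for what ctags would have collected) is marked. The name `mark_declared` and the `[[...]]` marker are invented for the example, and matching ignores case because Fortran identifiers are case-insensitive.

```python
import re

IDENTIFIER = re.compile(r"\b[A-Za-z][A-Za-z0-9_]*\b")

def mark_declared(source, declared):
    """Wrap each occurrence of a declared identifier in [[...]]."""
    known = {name.lower() for name in declared}

    def replace(match):
        word = match.group(0)
        # Only names known from the declarations become "links".
        return "[[" + word + "]]" if word.lower() in known else word

    return IDENTIFIER.sub(replace, source)

line = "      CALL SOLVE(X, N)"
print(mark_declared(line, {"solve"}))
```

Because nothing separates code from comments or strings, a declared name would also be marked inside those, which is the price of skipping the fragment parser.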