Indexing HTML containing <!------->
I am indexing HTML files, some of which contain comments
like this: <!--------->.
The initial symptom is a NullPointerException and a new
'_tmpFileToIndex' file left behind in the directory. The
cause is an org.jdom.IllegalDataException thrown from
somewhere in DOMBuilder.build and caught in
HtmlIndexer.parse. The parse method therefore returns null,
which causes the NullPointerException.
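
A likely explanation for the IllegalDataException: XML 1.0 does not allow "--" inside a comment, nor a comment body that ends in "-", so a comment made only of dashes is legal in lenient HTML but not in XML. The sketch below checks that rule on its own, without JDOM. The class and method names are hypothetical and are not taken from the indexer's source.

```java
// Hypothetical helper, not part of the indexer or of JDOM.
class CommentCheck {

    // Returns the text between "<!--" and "-->".
    // Assumes the argument is a complete comment with both delimiters.
    static String commentBody(String comment) {
        return comment.substring(4, comment.length() - 3);
    }

    // XML 1.0 forbids "--" inside a comment body and a body ending in "-".
    static boolean isLegalCommentBody(String body) {
        return !body.contains("--") && !body.endsWith("-");
    }

    public static void main(String[] args) {
        System.out.println(isLegalCommentBody(commentBody("<!--------->"))); // false
        System.out.println(isLegalCommentBody(commentBody("<!-- ok -->")));  // true
    }
}
```

A check like this could be used to skip or sanitise such comments before the document is handed to the XML-oriented parser, assuming that is where the exception originates.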
Let me know if you need more info...
Paul
Logged In: YES
user_id=784727
Sorry, I submitted it anonymously. My SourceForge user is 'paulramsden'.
Logged In: YES
user_id=784727
The removal of the temporary file ought perhaps to be put in
a finally block to ensure that it is always deleted. If it
is not removed, it is picked up by the next index run, which
compounds the problem.
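
A minimal sketch of the finally-block suggestion, assuming the indexer writes the temporary file, parses it, and then deletes it. Only the file name '_tmpFileToIndex' comes from this report. The class, the methods, and the simulated failure are hypothetical stand-ins for the real indexing code.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical demo class, not the indexer's actual code.
class TempFileCleanup {

    // Stand-in for the real parse step; throws on demand to simulate
    // the failure described in this report.
    static void parse(File tmp, boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated parse failure");
        }
    }

    // Writes the temporary file, parses it, and deletes it in a finally
    // block so it is removed whether or not parsing throws.
    static File indexOnce(File dir, boolean fail) throws IOException {
        File tmp = new File(dir, "_tmpFileToIndex");
        Files.write(tmp.toPath(), "<html></html>".getBytes());
        try {
            parse(tmp, fail);
        } catch (IllegalStateException e) {
            System.err.println("skipping document: " + e.getMessage());
        } finally {
            if (!tmp.delete()) {
                tmp.deleteOnExit(); // best-effort fallback if delete fails
            }
        }
        return tmp;
    }

    // Returns true if the temporary file survives one indexing attempt.
    static boolean tempFileSurvives(boolean fail) {
        try {
            File dir = Files.createTempDirectory("index-demo").toFile();
            dir.deleteOnExit();
            return indexOnce(dir, fail).exists();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tempFileSurvives(true));  // parse fails
        System.out.println(tempFileSurvives(false)); // parse succeeds
    }
}
```

With the delete in the finally block, both calls should report that the file is gone, so a failed parse no longer leaves '_tmpFileToIndex' behind for the next run.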