On 2011-01-30 21:45, Wolfgang Meier wrote:
> I fixed various issues with different analyzers being used in lucene index
> and field definitions. The fixes have also been ported back to 1.4.x. New
> tests are available as well. If you find problems, feel free to add a test
> to one of the xml test files.
Hello Wolfgang, this is really great news! It now looks like everything
works as advertised in the documentation and on the list. One problem
remains for me, and I have tried to boil it down to a report that is as
concise as possible. It's not in the form of a unit test, but please
read my verbal account:
Sometimes the kwic highlight is off by one or more characters, i.e. the
highlight starts too early. Sometimes the query does not return at all
and fails with the error "start
offset out of bounds [at line 216, column 5, source:
jar:file:/opt/eXist/exist.jar!/org/exist/xquery/lib/kwic.xql]". I did
some experiments and came to this diagnosis:
# If the XML document contains empty elements, e.g. <el/>, the kwic
highlight is shifted to the left by as many characters as there are
empty elements before the search hit.
# If the (now too early) hit location would start before the beginning
of an element, the above error is raised.
In a more formal way:
source document: <a><b/><c>foo bar</c></a>
  query for "bar" will highlight "ba"
  query for "foo" will fail
unproblematic document: <a><b>bla</b><c>foo bar</c></a>
  query for "bar" will highlight "bar"
  query for "foo" will highlight "foo"
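
To make the diagnosis easier to check, here is a small toy model in Python. It is not the code from kwic.xql or the Lucene module. It only assumes, as guessed above, that every empty element seen before the hit pulls the start offset one character to the left. The function name `highlight` and the two constants are made up for this illustration, and the model only looks at the text directly inside each element.

```python
import xml.etree.ElementTree as ET

BROKEN = "<a><b/><c>foo bar</c></a>"
WORKING = "<a><b>bla</b><c>foo bar</c></a>"

def highlight(doc: str, term: str) -> str:
    """Return the span a (hypothetical) buggy highlighter would mark."""
    empties_seen = 0
    for el in ET.fromstring(doc).iter():  # document order
        text = el.text or ""
        if len(el) == 0 and not text:
            # Empty element such as <b/>: assumed to cost one offset.
            empties_seen += 1
            continue
        pos = text.find(term)
        if pos >= 0:
            start = pos - empties_seen  # assumed off-by-N shift
            if start < 0:
                raise IndexError("start offset out of bounds")
            return text[start:start + len(term)]
    raise LookupError("term not found")

print(repr(highlight(WORKING, "bar")))
print(repr(highlight(BROKEN, "bar")))
```

Under this assumption the model marks " ba" (a space followed by "ba") for the query "bar" on the first document, and it raises the out-of-bounds error for "foo", which matches what I see. On the second document both queries come out right.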