From: Nick S. <nsi...@nu...> - 2021-09-26 21:56:21
I brought this up on the last community call, but I wanted to send an
e-mail to see if the wider community might have some suggestions.
My eXist-db v4.5 crashed a number of weeks ago. I restarted the
system, it ran through the data integrity check, and everything seemed to
be OK. Later, however, I noticed that one of the files in my system was
not indexed correctly. I could tell because my file-processing script
could find the file in the system, but it returned an error that did not
make any sense.
My next step was to reindex the entire database
(xmldb:reindex('/db')). I ran my script again and still got the same
error. Next, I located the file in the system, opened it, and made a
minor change to the XML. (I added a whitespace character to the end of
the file, but any change would have done.) Then I saved the file. After
that, I ran my script and everything worked as expected.
What I believe I understand about this issue is that saving a file
triggers a re-index of that file, and that this re-index differs from
what xmldb:reindex() does. I have also experimented with downloading a
collection of documents, deleting the collection in the database, and
then uploading the collection again using WebDAV. This technique does
not solve the issue either. There is something about opening the file,
changing it, and then saving it that resolves this issue. I am posting
this because I see this issue every 6 months or so. Whenever I am
completely stumped by a script not running properly, I have learned to
locate the file, modify it in a minor way, and resave it, and very
frequently that resolves my issue.
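For anyone who wants to try reproducing this, the two operations being compared could be scripted roughly as in the sketch below. The collection path and file name are hypothetical placeholders, and it is only an assumption that xmldb:store() follows the same indexing path as saving from an editor; since the WebDAV re-upload did not help, it may well behave differently.

```xquery
xquery version "3.1";

(: Hypothetical placeholders: substitute the real collection and file name. :)
let $collection := "/db/my-data"
let $file-name  := "problem-file.xml"
let $path       := $collection || "/" || $file-name

(: Step 1: the full reindex described above (requires DBA rights). :)
let $reindexed := xmldb:reindex("/db")

(: Step 2: read the document into a string, then store it again under the
   same name. Assumption: this approximates the manual open/edit/save. :)
let $xml    := serialize(doc($path))
let $stored := xmldb:store($collection, $file-name, $xml)

return
    <result reindexed="{$reindexed}" stored="{$stored}"/>
```

Running the failing script after step 1 alone, and again after step 2, would show whether the re-store on its own is enough to clear the error.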
When I brought this up on the community call, I was given two suggestions.
First, they thought maybe the XML files had a BOM character in them.
This is not the case for me because we are not working with any Windows
servers. I have seen this BOM character issue before, so I can
confidently say this is not related to a BOM character.
The second suggestion was that maybe our index was indexing an attribute
instead of a tag. I checked my index and I don't think this is the case.
My index file is below.
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index>
        <!-- Disable the old full text index -->
        <fulltext default="none" attributes="false"/>
        <!-- New full text index based on Lucene indexes for full names
             and song titles -->
        <lucene>
            <analyzer
                class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
            <analyzer id="ws"
                class="org.apache.lucene.analysis.WhitespaceAnalyzer"/>
        </lucene>
        <!-- Range indexes for fast ID lookups -->
        <range>
            <create qname="next-activity" type="xs:string"/>
            <create qname="data-exchange-id" type="xs:string"/>
            <create qname="data-origin-party-id" type="xs:string"/>
            <create qname="data-destination-party-id" type="xs:string"/>
            <create qname="release-id" type="xs:string"/>
            <create qname="search" type="xs:string"/>
            <create qname="delivery-type" type="xs:string"/>
            <create qname="queue-date-time" type="xs:string"/>
            <create qname="delivery-date-time" type="xs:string"/>
            <create qname="workflow-status" type="xs:string"/>
            <create qname="event-timestamp" type="xs:dateTime"/>
        </range>
    </index>
</collection>
Any thoughts or suggestions are welcome!
Nick
--
Nick Sincaglia
President/Founder
NueMeta, LLC
Digital Media & Technology
Phone: +1-630-303-7035
nsi...@nu...
http://www.nuemeta.com
Skype: nsincaglia