From: Jean-Paul R. <re...@gm...> - 2025-10-31 14:14:52
Dear Lars,

There are two possible issues at work.

The first is subtle, and I only discovered it over time: computed fields in eXist-db use the Lucene index, which in turn requires a text() node on the element that carries the attribute you wish to index. A short example demonstrates this:

    <item corresp="foo"/>          -> cannot compute a field from the value in @corresp
    <item corresp="foo">bar</item> -> can compute a field from the value in @corresp

This may or may not affect your indexing situation.

Compounding this is the second issue, the one you suspect. While a given document (say "X") is being indexed, the indexer cannot access the rest of the contents of the immediate collection that contains "X". Computed fields therefore fail to populate when they refer to documents in that same collection.

My own work-around is to create separate collections for documents that may be needed for indexing. For example, if I have a corpus of TEI letters in collection /data/corpus, I keep the separate XML files containing, say, my listPerson or listPlace in /data/listPerson and /data/listPlace respectively. This makes them fully available for use in computed fields. They cannot be collections within collections.

Note as well that if one triggers an index on all of /data/ with a collection.xconf in /data/, one would face the same problem. Therefore each collection needs its own collection.xconf.

If your situation is like mine was the first time, it means rebuilding the collections, and refactoring and rewiring some code (I now keep all my XQuery paths in "global-like" variables, so I only touch them in one place if a change is needed).

I hope this helps.

Best,
JPR

On Fri, Oct 31, 2025 at 12:21 PM Lars Scheideler <sch...@sa...> wrote:

> Hello,
>
> I want to index a collection and need a field that contains specific
> computed values from the same collection.
> I have various TEI XML files with different biblStruct types. Some of
> these files refer to other biblStructs via ref elements. I need the IDs of
> the biblStructs that point to the current, indexed biblStruct.
> However, when I reindex the collection, the fields almost exclusively
> contain “empty” values.
>
> Here is the code:
>
> <collection xmlns="http://exist-db.org/collection-config/1.0">
>     <index xmlns:tei="http://www.tei-c.org/ns/1.0"
>            xmlns:xs="http://www.w3.org/2001/XMLSchema">
>         <lucene>
>             <module uri="http://application.de/xquery/facet-utils" prefix="fu"
>                     at="xmldb:exist:///db/apps/application/modules/index-facet-utils.xqm"/>
>             <text qname="tei:biblStruct">
>                 <field name="workgroupid" expression="fu:get-workgroup-id(.)"/>
>             </text>
>         </lucene>
>     </index>
> </collection>
>
> declare function fu:get-workgroup-id($biblStruct as element()) as xs:string* {
>     let $corresp as xs:string := $biblStruct/@corresp/string()
>     let $workgroupsIds as xs:string* :=
>         collection("/db/projects/application/data/bibls")//tei:biblStruct[@type eq 'dataset']
>             [.//tei:ref[@type eq 'relDocRef']/@target/string() = $corresp]/@xml:id/string()
>     let $workgroupsIdsJoined := string-join($workgroupsIds, ';')
>     return
>         if ($workgroupsIdsJoined ne "") then $workgroupsIdsJoined else "empty"
> };
>
> I assume that I do not have access to the same collection that is actually
> indexed or to specific attributes/elements (this case is not
> described/addressed/mentioned in either the documentation or the book), but
> the function takes far too long for live productive use.
>
> Thanks for the help.
>
> Best regards
> _______________________________________________
> Exist-open mailing list
> Exi...@li...
> https://lists.sourceforge.net/lists/listinfo/exist-open
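
As a rough illustration of the work-around described in the reply (one collection.xconf per collection, with lookup documents kept in a separate sibling collection), the configuration for the indexed collection might look something like the sketch below. The sibling collection /db/projects/application/data/datasets is a hypothetical name chosen only for this example; it does not appear in the thread. The sketch also assumes that fu:get-workgroup-id() has been changed to query that sibling collection instead of /db/projects/application/data/bibls.

```xml
<!-- Sketch only, under the assumptions stated above.
     Stored under /db/system/config, mirroring the path of the data collection:
       /db/system/config/db/projects/application/data/bibls/collection.xconf
     The hypothetical lookup collection /db/projects/application/data/datasets
     would get its own file at:
       /db/system/config/db/projects/application/data/datasets/collection.xconf
     No collection.xconf is placed on the shared parent collection. -->
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:tei="http://www.tei-c.org/ns/1.0"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <lucene>
            <module uri="http://application.de/xquery/facet-utils" prefix="fu"
                    at="xmldb:exist:///db/apps/application/modules/index-facet-utils.xqm"/>
            <text qname="tei:biblStruct">
                <!-- Assumption: fu:get-workgroup-id(.) now reads from the
                     hypothetical sibling collection, not from the collection
                     that is currently being indexed. -->
                <field name="workgroupid" expression="fu:get-workgroup-id(.)"/>
            </text>
        </lucene>
    </index>
</collection>
```

Whether splitting the 'dataset' biblStructs into their own collection is practical depends on how the rest of the application addresses them, so this is one possible layout rather than a recommended one.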