From: Jens Ø. P. <oe...@gm...> - 2012-03-21 04:08:07
Hi Chris,

The search returns all element nodes of the w:work nodes that have "Buddhist monk" as the text value of any indexed descendant element node. eXist does not seem to process the search correctly - what trips it up is the "wildcard" - any element() - you feed to the full-text search. You could use "*" instead of "element()".

Since the search does not return every element of every w:work element, a full-text search _is_ performed, here with a hit on w:title, and, like you, I don't see any reason why it should behave differently from contains().

If you formulate your query in the following way, directing your search to nodes which have been indexed,

/w:work//(w:title[ft:query(., '"Buddhist monk"')]
        | w:creator[ft:query(., '"Buddhist monk"')]
        | w:catalogInfo[ft:query(., '"Buddhist monk"')])

(the element names could be generated on the fly) you will get the results you expect, but - again - what you are trying to do should work, I think.

Best,
Jens

On Mar 16, 2012, at 2:34 PM, Chris Tomlinson wrote:

> Hello,
>
> The configuration information is at the end.
>
> I have a situation in which it seems that something is going rather wrong with a path expression selecting elements using ft:query.
>
> The query should return a single <result/> for each match but instead is returning a result for every element in each document that contains a matching element selected by ft:query.
>
> For this example the search string "Buddhist monk" occurs in exactly one element of one Work document in the DB.
> The query is:
>
>> xquery version "1.0";
>>
>> declare namespace w="http://www.tbrc.org/models/work#";
>>
>> <results type="work">
>> {
>>     for $node in collection("/db/tbrc/tbrc-works")/w:work//element()[ft:query(., '"Buddhist monk"')]
>>     let $doc := $node/ancestor::w:work
>>     return
>>         <result rid='{$doc/@RID}'>
>>         { $node }
>>         </result>
>> }
>> </results>
>
> The results from running the above query however have a <result/> for every element in the document W21021 - the one element that should be returned is the title with the highlight below:
>
>> <results type="work" pattern="">
>>   <result rid="W21021">
>>     <w:title xmlns:w="http://www.tbrc.org/models/work#" lang="tibetan" encoding="extendedWylie" type="bibliographicalTitle">'dul ba'i cho ga ma nor don gsal</w:title>
>>   </result>
>>   <result rid="W21021">
>>     <w:title xmlns:w="http://www.tbrc.org/models/work#" lang="tibetan" encoding="extendedWylie" type="titlePageTitle">rab tu byung ba dang bsnyen par rdzogs pa'i cho ga ma nor don gsal</w:title>
>>   </result>
>>   <result rid="W21021">
>>     <w:title xmlns:w="http://www.tbrc.org/models/work#" lang="tibetan" encoding="extendedWylie" type="subtitle">the liturgical method for conferring the monastic vows the buddhist monk according to the sa skya pa tradition</w:title>
>>   </result>
>>   <result rid="W21021">
>>     <w:title xmlns:w="http://www.tbrc.org/models/work#" lang="tibetan" encoding="extendedWylie" type="coverTitle">rab byung dang bsnyen rdzogs kyi cho ga</w:title>
>>   </result>
>>   <result rid="W21021">
>>     <w:info xmlns:w="http://www.tbrc.org/models/work#" nodeType="publishedWork"/>
>>   </result>
>>   <result rid="W21021">
>>     <w:creator xmlns:w="http://www.tbrc.org/models/work#" type="hasMainAuthor" person="P7079">bsam gtan rgya mtsho</w:creator>
>>   </result>
>>   <result rid="W21021">
>>     <w:subject xmlns:w="http://www.tbrc.org/models/work#" type="isAboutUncontrolled" class="T868">'dul ba'i cho ga</w:subject>
>>   </result>
>>   <result rid="W21021">
>>     <w:hasPubInfo xmlns:w="http://www.tbrc.org/models/work#" info="MW21021"/>
>>   </result>
>>   <result rid="W21021">
>>     <w:note xmlns:w="http://www.tbrc.org/models/work#">this work was written in an earth hare year at sa skya it is based on previous works by bu ston, gnyag phu bsod bzang, rje dngos grub dpal 'bar, and thar rtse nam mkha' dpal bzang there seems to be an allusion to rdo rje rin chen, which would make the work 19th century</w:note>
>>   </result>
>> </results>
>
> The document is as follows:
>
>> <w:work xmlns:w="http://www.tbrc.org/models/work#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" RID="W21021" status="released" xsi:schemaLocation="http://www.tbrc.org/models/work# http://www.tbrc-dlms.org/tbrc-defs/1/0/work.xsd">
>>   <w:title lang="tibetan" encoding="extendedWylie" type="bibliographicalTitle">'dul ba'i cho ga ma nor don gsal</w:title>
>>   <w:title lang="tibetan" encoding="extendedWylie" type="titlePageTitle">
>>     rab tu byung ba dang bsnyen par rdzogs pa'i cho ga ma nor don gsal
>>   </w:title>
>>   <w:title lang="tibetan" encoding="extendedWylie" type="subtitle">
>>     the liturgical method for conferring the monastic vows the buddhist monk according to the sa skya pa tradition
>>   </w:title>
>>   <w:title lang="tibetan" encoding="extendedWylie" type="coverTitle">rab byung dang bsnyen rdzogs kyi cho ga</w:title>
>>   <w:info nodeType="publishedWork"/>
>>   <w:creator type="hasMainAuthor" person="P7079">bsam gtan rgya mtsho</w:creator>
>>   <w:subject type="isAboutUncontrolled" class="T868">'dul ba'i cho ga</w:subject>
>>   <w:hasPubInfo info="MW21021"/>
>>   <w:note>
>>     this work was written in an earth hare year at sa skya it is based on previous works by bu ston, gnyag phu bsod bzang, rje dngos grub dpal 'bar, and thar rtse nam mkha' dpal bzang there seems to be an allusion to rdo rje rin chen, which would make the work 19th century
>>   </w:note>
>> </w:work>
>
> Here is a simplified example of the above query pattern that works as expected.
> I've substituted contains for ft:query with the query "in" and used a simple inline document since I wanted to ensure that the query pattern was basically sound:
>
>> xquery version "1.0";
>>
>> declare variable $x :=
>>     <w>
>>         <t>Hello to you</t>
>>         <t>Stepping out</t>
>>         <n>Looking up</n>
>>         <i>This is the info</i>
>>         <c>Creator joe</c>
>>     </w>;
>>
>> <results type="w">
>> {
>>     for $node in $x//element()[contains(., "in")]
>>     return
>>         <result>
>>         { $node }
>>         </result>
>> }
>> </results>
>
> This produces the following results:
>
>> <results type="w">
>>   <result>
>>     <t>Stepping out</t>
>>   </result>
>>   <result>
>>     <n>Looking up</n>
>>   </result>
>>   <result>
>>     <i>This is the info</i>
>>   </result>
>> </results>
>
> which is what I was expecting from the ft:query against the production DB. Lest there be a question about how the indexes are defined for Works in the DB, I've attached the collection.xconf.
>
> I'm seeing the same behavior on old 1.5 trunk rev 15568 and on eXist 2.0:
>
>> System Status
>> General
>> Uptime: P1DT9H35M15.893S
>> eXist Version: 2.0-tech-preview
>> eXist Build: 20120304
>> eXist Home: /usr/local/indium/2.0exist/tomcat/webapps/exist/WEB-INF
>> SVN Revision: 16109
>> Operating System: Mac OS X 10.7.3 x86_64
>> File encoding: UTF-8
>> Java
>> Vendor: Apple Inc.
>> Version: 1.6.0_29
>> Implementation: Java HotSpot(TM) 64-Bit Server VM
>> Installation: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
>> Temp file path: /usr/local/indium/2.0exist/tomcat/temp
>> Memory Usage
>> Max. Memory: 2093888K
>> Current Total: 1837036K
>> Free: 1499070K
>
> Thank you,
> Chris
>
> <collection.xconf>
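
The remark in the reply that the element names could be generated on the fly might be sketched roughly as follows. This is a minimal, untested sketch under several assumptions: the Lucene-indexed elements are taken to be exactly the three named in the reply (the attached collection.xconf is not reproduced in this message), the variable names $indexed, $phrase, $steps and $path are made up for illustration, and the path is built as a string and handed to eXist's util:eval, which evaluates it in the current context so that the w prefix and the global $phrase stay visible.

```xquery
xquery version "1.0";

declare namespace w="http://www.tbrc.org/models/work#";

(: Assumption: the element names that collection.xconf indexes with Lucene. :)
declare variable $indexed := ("w:title", "w:creator", "w:catalogInfo");

(: Lucene phrase query; the inner double quotes mark the phrase. :)
declare variable $phrase := '"Buddhist monk"';

(: One step per indexed element name, e.g. w:title[ft:query(., $phrase)] :)
let $steps :=
    for $name in $indexed
    return concat($name, "[ft:query(., $phrase)]")

(: Join the steps into a union and prefix the collection path. :)
let $path := concat(
    'collection("/db/tbrc/tbrc-works")/w:work//(',
    string-join($steps, " | "),
    ")")
return
    <results type="work">
    {
        (: util:eval is eXist-specific, not standard XQuery. :)
        for $node in util:eval($path)
        let $doc := $node/ancestor::w:work
        return
            <result rid="{$doc/@RID}">{ $node }</result>
    }
    </results>
```

Building the path as a string keeps each ft:query attached to a named element, which is the form the reply reports as working; filtering //* by name() instead would go back through the wildcard step that appears to cause the problem.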