Re: [Exist-open] Indexing node content referenced by class attribute

On 07/10/2010 18:21, Joe Wicentowski wrote:
> Hi Greg,
>
>> True, but experience shows that the reverse performs much better, eg.
>> //h1[ft:query(., $query)][@class eq 'subject']
> Great point.  In principle, though, Wolfgang's Performance Tuning
> article states that filters should be in order of greatest selectivity
> (http://exist-db.org/tuning.html#N1028C).  So if only 10% of your h1
> elements have a @class attribute with the matching value, while 90%
> have a hit on the query, the class filter would be best first.  In
> deciding on the order of your filters, you need to "guess" about the
> selectivity of each one, based on your knowledge of the data and the
> likely queries.
>
> Of course, this assumes the Lucene and range indexes have identical
> efficiency, and I don't know if that's the case.
>
> (Roy - either way, you'll want a Lucene index on h1, and a range index
> on @class.)
>
> Cheers,
> Joe
>
Thanks for the feedback. To expand on my case: I can't create an index
just on h1, because class="subject" can appear on elements with
different tag names (my source is a bit messy!). So it could be:

    <content>
    <h1 class="subject">Better flood protection for Bala</h1>
    </content>
    <content>
    <div id="item-title">
    <p class="subject">Weeding out of water invaders</p>
    </div>
    </content>

So my solution is to index content and query it thus:

    collection("/db/coll")//content[ft:query(., '"flood protection"')]
        /descendant::*[@class='subject']

This isn't quite right, though: the query matches the phrase anywhere
in content and then returns every "subject" element inside it, whether
or not the phrase occurs in the subject itself. What I really want is
to run ft:query only within those elements where class="subject", but
it doesn't sound like that's possible in a single pass. It looks like I
would be better off extracting the "subject" elements, normalizing them
elsewhere in each document, and creating an index around that.
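
In the meantime, a two-pass version of the query above might look
something like the sketch below. It is untested and rests on some
assumptions: the Lucene index is defined on content as described, and
the second pass is a plain case-insensitive substring check rather than
a Lucene-analyzed match, so it only approximates what ft:query would
do on the subject element itself.

```xquery
xquery version "1.0";

(: Pass 1: use the Lucene index on content to find candidate blocks. :)
let $hits :=
    collection("/db/coll")//content[ft:query(., '"flood protection"')]

(: Pass 2: keep only the subject elements whose own text contains the
   phrase. This is a plain string comparison, not a full-text match,
   so stemming, stop words, etc. are NOT applied here (assumption). :)
for $subject in $hits//*[@class eq 'subject']
where contains(lower-case($subject), 'flood protection')
return $subject
```

The first pass keeps the expensive work on the index. The second pass
only touches the handful of content blocks that already matched.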

-- Roy
