From: Adam H. <Ada...@be...> - 2010-07-11 11:39:58
Hi,

I have a rather big (several hundred MB) XML document of the following structure stored in the db:

<krm:catalogue>
  <krm:j id="1637192" pID="1637191" teryt="3716704" powiat="2183789" wojewodztwo="2183783" kategoria="1">
    <krm:k id="1637194">
      <krm:spec id="2205172"/>
      <krm:spec id="2205241"/>
      <krm:rk id="2205699"/>
      <krm:rk id="2205588"/>
      <krm:rk id="2205589"/>
      <krm:rk id="2205358"/>
      <krm:dsc id="2205171"/>
      <krm:dsc id="2205239"/>
    </krm:k>
    <krm:k id="1637193">
      <krm:spec id="2205213"/>
      <krm:rc id="2206812"/>
      <krm:rk id="2205564"/>
      <krm:dsc id="2205211"/>
      <krm:sc id="2206811"/>
      <krm:sc id="2206808"/>
      <krm:sc id="2206804"/>
      <krm:sc id="2206807"/>
      <krm:sc id="2206806"/>
      <krm:sc id="2206805"/>
      <krm:sc id="2206810"/>
      <krm:sc id="2206795"/>
      <krm:sc id="2206798"/>
      <krm:sc id="2206794"/>
      <krm:sc id="2206791"/>
      <krm:sc id="2206793"/>
      <krm:sc id="2206799"/>
      <krm:sc id="2206797"/>
      <krm:sc id="2206802"/>
      <krm:sc id="2206801"/>
      <krm:sc id="2206800"/>
      <krm:sc id="2206792"/>
      <krm:sc id="2206796"/>
      <krm:sc id="2206809"/>
    </krm:k>
  </krm:j>
</krm:catalogue>

where there are many krm:j nodes, each of which has many krm:k nodes. In a rather elaborate process, I filter those nodes and receive a certain subtree, which therefore still has the same organization. Now I need to count the occurrences of every @id in every given node type (i.e. I need to retrieve the information that node krm:sc with id 2206800 occurs 5 times, and krm:sc with id 2206801 occurs 20 times).

I found out that just performing a count() on a node set (the result of an XPath expression) is surprisingly slow, and can take my XQuery from below 30 seconds to well over 10 minutes. I have indices set up, which sped things up a lot in other contexts, but still, counting occurrences is a big pain in the lower back. I wonder: is there any clever eXist-only function which could help me retrieve the counts?

--
Best regards
Adam Hepner

tel. 509 093 095
e-mail: ada...@be...
http://AdamHepner.pl
From: James F. <jam...@ex...> - 2010-07-11 11:45:17
Adam,

can you re-post this question on the open list? The eXist development list is for development issues, tasks, etc.

Can I also ask that you include a representative XPath expression and XML (if the XML you provided is not representative), and I will try to give you some help.

James Fuller
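To make the counting task from the question concrete, here is a minimal Python sketch of the intended result: one pass over the subtree, tallying each (element name, @id) pair among the children of every krm:k. It runs outside eXist and does not address the count() performance problem itself. The namespace URI, the shortened sample document, and the helper name count_ids are assumptions made for illustration; the original post does not show what the krm: prefix is bound to.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical namespace URI: the original post never shows the krm: binding.
KRM_NS = "http://example.org/krm"

# Shortened, made-up sample in the same shape as the posted document.
SAMPLE = f"""<krm:catalogue xmlns:krm="{KRM_NS}">
  <krm:j id="1637192">
    <krm:k id="1637194">
      <krm:spec id="2205172"/>
      <krm:sc id="2206800"/>
      <krm:sc id="2206801"/>
    </krm:k>
    <krm:k id="1637193">
      <krm:sc id="2206800"/>
      <krm:rk id="2205564"/>
    </krm:k>
  </krm:j>
</krm:catalogue>"""


def count_ids(root):
    """Count (local element name, @id) pairs among the children of every krm:k."""
    counts = Counter()
    for k in root.iter(f"{{{KRM_NS}}}k"):
        for child in k:
            # ElementTree tags look like "{namespace-uri}local"; keep the local part.
            local_name = child.tag.split("}", 1)[-1]
            counts[(local_name, child.get("id"))] += 1
    return counts


counts = count_ids(ET.fromstring(SAMPLE))
for (name, node_id), n in sorted(counts.items()):
    print(f"krm:{name} id={node_id} occurs {n} time(s)")
```

The point of the sketch is the shape of the computation: every node is visited once and the tally is kept in a map, instead of issuing a separate count() for each distinct id.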