From: Marcus A. <ma...@br...> - 2004-02-15 01:22:43
|
Hi again I have run to into some additional problems regarding performance after some more experimenting. I have a document in the database with data as the following: <companies> <company someattrs...> <addresses> <address> elts for street and so on <zipcode>555 66</zipcode> <city>A city name</city> ... </address> ... more addresses for the company </addresses> ... more company data (not that much though) </company> ... about 6000 more companies... </companies> I want to select all city names specified by parts of zipcode, then select all distinct values of the city names and after that count all companies with addresses in that specific city. I have created the following query to make it happen: ... declarations... declare function fn:getCities($zip as xs:string) as element()* { let $comps := doc('/db/iso-sales.xml')//company let $cityNames := ( for $cityName in distinct-values($comps[starts-with(.//zipcode, $zip)]//city) order by $cityName return $cityName ) for $cityName in $cityNames return <city> { $cityName } : ({ count($comps[.//city = $cityName]) })</city> }; <cities> { fn:getCities(request:request-parameter("zip", "5")) } </cities> The query works but is terribly slow (~90-110 seconds on a 1GHz/512Mb (default settings with Tomcat 5.16 and JRE1.4.2_03 though so the heap could be set bigger I guess but it doesn't crash...) machine and depending on how many companies there are in the selected range). If I change $zip, to for example '5', where I select the cityNames, the time goes down to around 30-40 seconds (but it's still very slow from my point of view). Is there some way to make the query faster? Should I structure the data differently (and in that case I guess other use cases should be considered)? When I made some earlier experimenting, as mentioned in a previous mail, I succeeded in making parts of the query faster (the first part where I select the city names based on the zipcode) by building a string where I inserted the zipcode as a static value and then called util:eval on the string. But I really don't know how to take a similar approach when I want to do the counting as well. I guess another approach would be to select all companies based on the zipcode and then pass that result into an XSLT but: 1) It doesn't feel like I'm using the power I assume XQuery have 2) It feels like I make it a lot harder than it should be 3) It just doesn't feel right The feeling I have so far is that "static" queries perform fast and "dynamic" queries, where one or more values of the query is based on a variable (excluding variables consisting of nodes from a previous query?), have issues regarding performance. Is it a correct feeling? I understand that "dynamic" queries should/must be slower but there are several magnitudes of difference so I just have to do something wrong. If there is no way to make it faster I would be glad to know that as well since I will then know that eXist isn't for me and my needs... :( Thanks in advance for any answers. I also apologize if this issue has been covered in earlier mails. /Marcus |