From: Michael W. <wes...@ja...> - 2025-05-19 13:23:55
Hi Alberto,

For me, splitting them makes them more manageable when I am going through a given collection with a WebDAV editor.

For example, I have a database of baseball players. The XML file for a given player is named in the format "surname-givenname.xml". I sort them under the persons collection as:

[image: image.png]

Each first letter is divided into two- or three-letter sub-collections. I try to keep each to around 100 names, but as the database grows, some have grown as large as 300 names. That usually means that I want to divide it up some more. (The _ collection is for names in Kanji -- Japanese characters.)

I break them up because WebDAV is really slow when there are a lot of files in a single collection. If I only processed the XML files, it wouldn't be an issue. But I often go in and manually edit files, so the hierarchy helps.

A quick count of the number of players I have:

    xquery version "3.0";

    let $start-time := current-dateTime()
    let $players := collection('/db/uni/persons')/*:person
    let $count := count($players)
    let $end-time := current-dateTime()
    return
        <result start-time="{$start-time}" end-time="{$end-time}" count="{$count}"/>

    <result start-time="2025-05-19T22:20:24.288+09:00" end-time="2025-05-19T22:20:24.288+09:00" count="43434"></result>

Going by the timestamps, it looks pretty much instantaneous to get 43,434 players. In reality, it took a couple of seconds to display the result.

On Mon, May 19, 2025 at 20:12, Alberto Simões <has...@gm...> wrote:

> Hello, Michael
>
> I cannot split them so that I can specify different collection names.
> In that case, splitting does not bring any additional value?
>
> Thanks
>
> On Mon, May 19, 2025 at 10:25 AM Michael Westbay <wes...@ja...> wrote:
>
>> Hi Alberto,
>>
>> collection("/db/records")/record will match all <record>...</record>
>> documents under /db/records and sub-folders (sub-collections?).
>>
>> If you can organize them by date (year sub-folders), including that in
>> the collection parameter will mean fewer records to search. And all
>> sub-folders under that collection will still be included in the XPath
>> search.
>>
>> On Mon, May 19, 2025 at 17:23, Alberto Simões <has...@gm...> wrote:
>>
>>> Hello
>>>
>>> Are there differences in terms of performance between having a large
>>> collection (150k docs) with or without a folder structure?
>>>
>>> I want to treat them as a single collection, but I don't know if it
>>> helps to have sub-collections to organise them, or if that is
>>> irrelevant to eXist.
>>>
>>> I appreciate any help you can provide.
>>> Alberto
>>>
>>> --
>>> Alberto Simões
>>> _______________________________________________
>>> Exist-open mailing list
>>> Exi...@li...
>>> https://lists.sourceforge.net/lists/listinfo/exist-open
>>
>> --
>> Michael Westbay
>> Writer/System Administrator
>> http://www.japanesebaseball.com/
>
> --
> Alberto Simões

--
Michael Westbay
Writer/System Administrator
http://www.japanesebaseball.com/
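
A note on the timing in the count query in the message above: in XQuery, current-dateTime() returns the same value for the whole duration of a query, so start-time and end-time come out identical no matter how long the count takes. The following is a minimal sketch of one way to get a rough elapsed time instead. It assumes eXist-db's util:system-dateTime() extension function, which reports the clock time at each call, and it assumes the let clauses are evaluated in the order written. The elapsed attribute is added here for illustration and was not in the original query.

```xquery
xquery version "3.0";

(: eXist-db utility module; util:system-dateTime() is an eXist extension,
   not part of standard XQuery. :)
import module namespace util = "http://exist-db.org/xquery/util";

(: Unlike current-dateTime(), this is read from the system clock on each call. :)
let $start-time := util:system-dateTime()
(: Same collection path and wildcard-namespace step as in the original query. :)
let $count := count(collection('/db/uni/persons')/*:person)
let $end-time := util:system-dateTime()
return
    <result start-time="{$start-time}" end-time="{$end-time}"
            elapsed="{$end-time - $start-time}" count="{$count}"/>
```

Subtracting two xs:dateTime values yields an xs:dayTimeDuration, so elapsed is rendered as a duration string. The figure covers only query evaluation, not the time a client takes to fetch and display the result.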