From: Adam R. <ad...@ex...> - 2010-03-31 20:43:30
Andrzej,

A chap on the mailing list has considerable experience of scaling eXist into the hundreds-of-gigabytes range. If you email him, he may be able to share some of his experiences with you as well:

José María Fernández González
jmfernandez <at> cnb.uam.es

On 29 March 2010 15:59, Andrzej Jan Taramina <an...@ch...> wrote:
> Looking to get some guidance on how big you can scale an eXist database.
>
> Right now, our instances are about 15-25K documents where each document is in the 25K-2M range, probably averaging
> around 150-200K. This results in a dom.dbx = 3.5G, structure.dbx = 1.8G, collections.dbx = 4.2M and values.dbx = 155M,
> which is not all that large compared to some relational databases.
>
> What if we scale up 10x to nearly a quarter of a million documents? The file sizes still shouldn't be all that big for
> modern hardware, but will the performance scale linearly or close to it, assuming a powerful enough server (say a
> dual-cpu, 6-Core machine (12 cores, 24 native threads) with gobs of memory)?
>
> OK.....if that works how about two orders of magnitude (100x current size)? That would give us 2.5M documents, 250GB
> dom.dbx and a structure.dbx in the 180GB range. Bit too big to be practical to cache the whole structure.dbx in memory,
> regardless of the size of the memory in the server.
>
> At what point do I start looking at alternative storage mechanisms, (RDBMS, Hadoop, memcached, etc.) or co-operating
> distributed eXist instances?
>
> Thanks for any insights from those that have pushed big databases in eXist...
>
> --
> Andrzej Taramina
> Chaeron Corporation: Enterprise System Solutions
> http://www.chaeron.com
>
> _______________________________________________
> Exist-development mailing list
> Exi...@li...
> https://lists.sourceforge.net/lists/listinfo/exist-development

--
Adam Retter

eXist Developer
{ United Kingdom }
ad...@ex...
irc://irc.freenode.net/existdb
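
The size estimates in the quoted message can be sanity-checked with a quick linear extrapolation. The sketch below is only that: it assumes the .dbx files grow in direct proportion to the document count, which nothing in this thread confirms, and the names `current_sizes_gb` and `extrapolate` are made up for illustration. Under that assumption the 100x case comes out at roughly 350 GB for dom.dbx and 180 GB for structure.dbx; the message quotes 250GB for dom.dbx, so that figure may rest on a different assumption.

```python
# Linear extrapolation of the file sizes quoted in the message.
# Assumption (not confirmed anywhere in the thread): each .dbx file
# grows in direct proportion to the number of stored documents.

# Sizes reported for the current instance, converted to gigabytes.
current_sizes_gb = {
    "dom.dbx": 3.5,
    "structure.dbx": 1.8,
    "collections.dbx": 0.0042,  # 4.2M
    "values.dbx": 0.155,        # 155M
}


def extrapolate(sizes_gb, factor):
    """Return a new dict with every size multiplied by the scale factor."""
    return {name: size * factor for name, size in sizes_gb.items()}


if __name__ == "__main__":
    for factor in (10, 100):
        print(f"{factor}x current size:")
        for name, size in extrapolate(current_sizes_gb, factor).items():
            print(f"  {name}: {size:.2f} GB")
```

Real growth may well be non-linear (index structures, page fill, fragmentation), so treat the output as a rough starting point for capacity planning rather than a prediction.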