From: Andrzej J. T. <an...@ch...> - 2010-03-29 15:05:48
I'm looking for guidance on how far an eXist database can scale. Right now our instances hold about 15-25K documents, each in the 25K-2M size range and probably averaging around 150-200K. This results in dom.dbx = 3.5G, structure.dbx = 1.8G, collections.dbx = 4.2M and values.dbx = 155M, which is not all that large compared to some relational databases.

What if we scale up 10x, to nearly a quarter of a million documents? The file sizes still shouldn't be all that big for modern hardware, but will performance scale linearly, or close to it, assuming a powerful enough server (say a dual-CPU, 6-core machine, i.e. 12 cores and 24 native threads, with gobs of memory)?

OK, if that works, how about two orders of magnitude (100x the current size)? That would give us 2.5M documents, a dom.dbx of about 350GB (100 x 3.5G) and a structure.dbx in the 180GB range. That is a bit too big for caching the whole structure.dbx in memory to be practical, regardless of how much memory the server has.

At what point do I start looking at alternative storage mechanisms (RDBMS, Hadoop, memcached, etc.) or co-operating distributed eXist instances?

Thanks for any insights from those who have pushed big databases in eXist...

--
Andrzej Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com
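
For reference, the 10x and 100x projections in the message can be reproduced with a short back-of-the-envelope sketch. It assumes the .dbx files grow linearly with the number of documents, which is only an assumption for illustration and says nothing about how query performance scales. The names `CURRENT_SIZES_MB`, `CURRENT_DOCS` and `project` are hypothetical and not part of eXist.

```python
# Current on-disk sizes reported in the message, converted to megabytes.
CURRENT_SIZES_MB = {
    "dom.dbx": 3.5 * 1024,
    "structure.dbx": 1.8 * 1024,
    "collections.dbx": 4.2,
    "values.dbx": 155.0,
}

# Current document count range reported in the message (15-25K).
CURRENT_DOCS = (15_000, 25_000)


def project(factor):
    """Scale document counts and file sizes by `factor`.

    Assumes purely linear growth of file size with document count,
    which is an illustrative assumption, not a measured property.
    """
    docs = tuple(n * factor for n in CURRENT_DOCS)
    sizes_gb = {
        name: mb * factor / 1024 for name, mb in CURRENT_SIZES_MB.items()
    }
    return docs, sizes_gb


if __name__ == "__main__":
    for factor in (10, 100):
        docs, sizes_gb = project(factor)
        print(f"{factor}x: {docs[0]:,}-{docs[1]:,} documents")
        for name, gb in sizes_gb.items():
            print(f"  {name}: {gb:,.1f} GB")
```

Comparing the projected structure.dbx size with the memory available on the target server gives a quick indication of whether keeping that file fully cached is even an option at a given scale.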