[Bigdata-developers] two changes, with a big difference

Well, I made two changes today which have resulted in a whopping difference in the data load throughput for scale-out. The first was the Node#getChild(int) refactor. The second was a configuration change to turn off NIO for JERI. I can't say which one is responsible since I am testing with both changes in place, but throughput on the 16-node test cluster shot up to nearly 400k tps before falling back and leveling off at about 310k tps, and climbing slowly.

The performance drop-off appears to be correlated with the onset of increased index segment builds and merges, so we can probably retain that performance by addressing the remaining issues discussed in [1] (basically, better scheduling of index segment builds and merges).

I will have to go back and test with JERI NIO enabled again to see whether that accounts for the difference.

If not, then the difference is entirely due to removing some synchronization around reading child nodes.  If this is true, then getting rid of the synchronization on journal reads and in the cache should really boost throughput.
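
To illustrate the general idea of reading child references without a lock, here is a minimal sketch. It is not the actual Node#getChild(int) code: SketchNode, its loader callback, and the use of AtomicReferenceArray are all assumptions made up for this example. Readers do a volatile read of the child slot, and only a miss falls through to loading the child and publishing it with a compare-and-set.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.IntFunction;

/** Hypothetical stand-in for a B+Tree node; NOT the bigdata Node class. */
final class SketchNode {

    /** Child references, one slot per child index; null until loaded. */
    private final AtomicReferenceArray<SketchNode> children;

    /** Hypothetical callback that materializes a child, e.g. from the store. */
    private final IntFunction<SketchNode> loader;

    final int id;

    SketchNode(int id, int branchingFactor, IntFunction<SketchNode> loader) {
        this.id = id;
        this.children = new AtomicReferenceArray<>(branchingFactor);
        this.loader = loader;
    }

    /** Returns the child at the given index without taking a lock. */
    SketchNode getChild(int index) {
        // Fast path: a volatile read, so concurrent readers never block.
        SketchNode child = children.get(index);
        if (child != null) {
            return child;
        }
        // Slow path: load the child, then publish it atomically.
        // Under a race the loader may run more than once, but only
        // one result is ever installed and returned to all callers.
        SketchNode loaded = loader.apply(index);
        if (children.compareAndSet(index, null, loaded)) {
            return loaded;
        }
        // Another thread won the race; slots are never reset to null here.
        return children.get(index);
    }
}
```

The trade-off in this sketch is that a racing miss can cause a duplicate load, in exchange for readers that never contend on a monitor once the child is present.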

Bryan

[1] https://sourceforge.net/apps/trac/bigdata/ticket/20
