From: Jacob D. <jac...@ci...> - 2021-10-25 18:44:00
Hi everyone,

I'm testing a MooseFS setup and am trying to understand the performance numbers I get and the possible bottlenecks. I've tried quite a few things but am a little stuck.

Status quo of the test setup:

  MFS master server - 10G interface - mfsmaster config is default
  MFS chunk server  - 100G interface - 8x SATA SSD RAID - mfschunkserver config default

To rule out any client or additional network issues I'm testing directly on the chunk server. Testing with fio gives me a stable 500 MB/s read and 700 MB/s write:

  sudo fio --filename=/mnt/mfs_mount/123 --direct=1 --rw=read --bs=64k --ioengine=libaio --iodepth=64 --runtime=30 --numjobs=1 --time_based --group_reporting --name=throughput-test-job --eta-newline=1 --size=50m

The same test directly on the XFS gives me about 8 GB/s read and 5 GB/s write. Utilization of the SSD array is zero during testing, so everything seems to be handled in cache, as fio probably deletes everything instantly.

Tests so far:

Seeing the XFS performance reserve and the idling array, we tried to get rid of the cache with "mfsmount /mnt/mfs_mount/ -H mfsmaster -o mfscachemode=DIRECT", which gave us the same results. Trying to increase the cache instead with -o mfsreadaheadsize=2048 -o mfsreadaheadleng=2048576 also made no significant difference. Changing the nice level of mfsmount to 0 or even 2 didn't change performance either.

Increasing the workers in the chunkserver config didn't change performance:

  WORKERS_MAX = 500
  WORKERS_MAX_IDLE = 80

I also tried reducing CHUNKS_LOOP_MIN_TIME = 150 in the mfsmaster config, but still no change. Throughout the tests I couldn't see any CPU core maxing out. I also tried running over a second network connection (same 100G/10G) without jumbo frames to rule out any issues on that side.

Doing the mfsmount on the master server gave me pretty much exactly the same read performance. Write was strangely doubled to 1.5 GB/s, which is interesting as the master only has a 10G interface.

Guesses:

I'm pretty new to MooseFS and still trying to wrap my head around it, but to me this looks like some cap I'm running into, as it's so steady and reproducible. It shouldn't be the cache, since a RAM speed cap wouldn't make sense. It shouldn't be the SSD array. It shouldn't be the CPU, as the threads are far from maxed out. The increased write speed on the master server suggests it could be the latency between the two servers: read is similar on both machines because they need to communicate either way, while write on the master exceeds its NIC's capabilities, possibly because it is only committing the writes to itself since fio deletes the data before it is ever sent out. The array idling during all tests supports this theory. That said, ping is between 0.2 and 0.3 ms.

Sorry for the long post, I hope it's still readable. It would be great if anyone could point me in the right direction to understand the bottleneck(s) I'm facing and how to overcome them. Could latency be the right path?

Thanks!
Best
Jacob
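
P.S. A very rough back-of-envelope check on the latency theory (this assumes the 64k reads end up mostly serialized between mfsmount and the chunkserver, which I haven't verified):

  64 KiB per request / 0.25 ms round trip ≈ 262 MB/s per in-flight request

On that assumption the observed ~500 MB/s would correspond to only about two requests effectively in flight at a time. If the assumption is wrong the whole calculation is off, so please treat it as a guess, not a measurement.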
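
To check whether it's a per-stream cap rather than a global one, my next step would be to rerun fio with more parallel jobs and see whether the aggregate throughput scales. A sketch of what I have in mind (same flags as the test above, only numjobs and size changed; the exact values are just a first guess):

  sudo fio --filename=/mnt/mfs_mount/123 --direct=1 --rw=read --bs=64k --ioengine=libaio --iodepth=64 --runtime=30 --numjobs=4 --time_based --group_reporting --name=throughput-scaling-test --eta-newline=1 --size=200m

If the total roughly quadruples with four jobs, that would point at per-stream latency rather than a hard limit somewhere else in the chain.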