From: WK <wk...@bn...> - 2014-10-28 00:06:53

On 10/27/2014 2:28 PM, Krzysztof Kielak wrote:
> Dear WK,
>
> Could you please provide some more details? Which version of MooseFS were you testing? On which hypervisor? How many chunkservers? How many spindles, etc.?

1.6.20 at the time. About 4-5 chunkservers (CS) with a goal of 3, and each CS had 2-4 drives. We had a mix of KVM and Oracle's VirtualBox (Vagrant).

> You could control the acceptable replication overhead within MooseFS starting from versions before 1.6, but the current version 2.0 has a lot of improvements for replication configuration and chunkserver rebalancing in case of failure.

Yes, we could slow down the replication speed in later versions, but you would still get stress if it was doing several things at once. For example, you lost a chunkserver and replaced it at the same time you were adding hard drives to other CS, and/or a particular VM got pounded with disk I/O, and/or a tech decided to upload a new VM image without rate limiting the rsync. Again, things generally ran great with reasonable performance, but every once in a while you could hit one of these multiple-incident scenarios, and then you were asking for the VMs to go read-only.

We did play with different elevators and that seemed to help (I can't recall which worked best), and we did tune the OS's disk timeout within the VM so they weren't so quick to go read-only, but if you did that too aggressively, you could end up with a corrupt image due to the aggressive MFS caching.

To be fair, it wasn't the greatest kit either, with older SATA/SCSI drives, since it was a dev bench situation. We were also constantly swapping drives in and out of the CS as kit was freed up from production upgrades. Note that this is what is so cool about MFS: you can play around with the chunkservers and it just adapts and auto-magically does the right thing. Kind of like an open source DROBO.

Other observations: the smaller the VM, the less likely you were to have a problem (i.e. under 2-3 GB almost never, even under load; as you got closer to a 50 GB image, the chances of a read-only remount increased). We simply came to the conclusion that, since the VM image was composed of numerous MFS chunks, EXT3/4 didn't like having the rug pulled out from under it.

You could predict that something was going to go read-only by looking at /var/log/messages. As those retry errors began to quickly scroll up the screen (normally not a big deal), you knew what was coming.

Perhaps if we have time, we may play with the newer versions; I was just reporting our observations. Clearly with newer software, better kit, lower goals and SSDs you could probably mitigate the problem even with the older software, but we were concerned that there WAS a point where you could get in trouble, even though that situation was clearly more remote.
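
For anyone wanting to try the guest timeout tweak mentioned above, something along these lines is the general idea (a rough sketch only, not what we actually ran; the sd* glob and the 180 second value are just examples, and virtio disks don't expose this knob):

#!/usr/bin/env python3
# Rough sketch: raise the SCSI command timeout on guest block devices so a
# transient MooseFS/storage stall is less likely to push ext3/ext4 into a
# read-only remount. Values and device names are examples only.
import glob

TIMEOUT_SECONDS = "180"  # default is typically 30 s

for path in glob.glob("/sys/block/sd*/device/timeout"):
    try:
        with open(path, "w") as f:
            f.write(TIMEOUT_SECONDS)
        print("set %s -> %s s" % (path, TIMEOUT_SECONDS))
    except OSError as err:
        # non-SCSI disks (e.g. virtio vd*) won't have this sysfs attribute
        print("skipped %s: %s" % (path, err))

The flip side, as noted above, is that pushing the timeout too far just trades a quick read-only remount for a long stall, and with aggressive MFS caching you can end up with a corrupt image instead.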