From: WK <wk...@bn...> - 2018-05-21 17:05:19
On 5/21/2018 12:45 AM, Gandalf Corvotempesta wrote:
>> Early on in our MFS history, we *did* have issues with VMs when they
>> were under heavy i/o load AND the MFS cluster was busy doing
>> rebalancing. They would go read-only and/or lose a chunk, requiring an
>> fsck to recover.
> Did you have time to figure this out? Why this happened?
> It was due to a bug in the older version of MooseFS or something else?

No. We assumed that the MFS cluster was so busy that all of the chunks
weren't in place yet and the VM got unhappy. We were unaware of the fsync
option (we actually learned about it this week on the other thread about
4.x). We switched to Gluster for the VMs and have been happy with that.

>
>> At the time we stopped using MFS for VM images and purposed MFS solely
>> for Email, NAS type File Server loads and Archive backups.
> Email will be one of our primary use case.

Works great for Maildirs, just buy lots of disks, as 3 copies plus the
small file penalty really shows up! Maybe the EC will cut that down. If
your storage is a db then that's less of an issue.

You could also create a vm disk and put the maildirs in there, to avoid
the small file issue. We do that for email archive storage, but you take
another performance hit.

>
>> Since 3.x we have begun to resume using MFS for "some" VM images with no
>> failures, but we are still a little skittish and reserve that for
>> 'Cloud-Native' installs where there are other VM copies on other
>> hosts/storage, just in case something bad happens.
> Why ? Did you have any more issues with 3.x ?

No, the VMs on 3.x are actually fine. I recall seeing a note from the devs
that they fixed something about VMs in that tree. We are just paranoid,
having burned our fingers on 1.6.x.

>> We have experienced all sorts of disasters, crashes, bad drives, etc and
>> were always able to recover using a metalogger or other backups with no
>> data loss (except on-the-fly data).
> I'm really interested in this.
> How MFS react to disk failures (or disk still working but with some URE) ?
> Is it safe to use MFS without any RAID, as suggested in the official site ?

We have goal=3, and we run on plain XFS drives (no raid) on the
chunkservers, which are actually really old kit.

Once a drive or chunkserver dies, the clients never see the issue. We
replace the broken component and it heals. If you are really observant,
you may notice a slight speed decrease at the client side during a massive
rebalance, but only people using things like Mutt IMAP (which doesn't
cache) really notice it.

>
>> a) a Tech brought up a chunkserver with the same IP as another
>> chunkserver. Not a good result, as it swiss cheesed the chunkserver data
>> on any file that was active during the period.
> What happens in this case ? MFS will start to send bad data coming from the
> chunkserver with the bad IP ?

And write to it. I don't know if the newer versions fixed that, but it
was ugly. And again, it was OUR fault.

>
>> Note: The imap email storage is a funny use case. It works really well,
>> but it really balloons storage space because of the small files. Plan
>> for as much as 5x-7x needed capacity.
> Why? 64MB chunks should be useless in email hosting.
> If a file is smaller than 64MB, chunk will get the real file size.
> Why we should plan for 5x-7x needed capacity ?

Well, first you have goals, so that is 3x. Then you have a minimum of 64K
at the moose level, so that 2k file is 64K on MFS.

-wk
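
For anyone who wants to plug in their own mailbox sizes, the arithmetic above (goal copies times a 64K minimum) can be sketched roughly as below. This is only a back-of-the-envelope model, not anything taken from MooseFS itself: the function names are made up for illustration, it assumes space is consumed in whole 64K units per copy, and it ignores chunk headers and metadata.

```python
import math

# Assumption from the post: a file occupies at least 64K on a chunkserver.
# This sketch further assumes larger files grow in whole 64K steps.
BLOCK = 64 * 1024


def raw_bytes(file_size, goal=3, block=BLOCK):
    """Estimated raw disk bytes for one file stored with `goal` copies."""
    if file_size <= 0:
        return 0
    blocks = math.ceil(file_size / block)
    return blocks * block * goal


def expansion(sizes, goal=3, block=BLOCK):
    """Ratio of estimated raw bytes to logical bytes for a set of files."""
    logical = sum(sizes)
    raw = sum(raw_bytes(s, goal, block) for s in sizes)
    return raw / logical if logical else 0.0


if __name__ == "__main__":
    # A single 2k message with goal=3: 64K per copy, three copies.
    print(raw_bytes(2 * 1024))
    # Hypothetical mix of message sizes, purely for illustration.
    print(expansion([2 * 1024, 20 * 1024, 300 * 1024]))
```

The overall multiplier depends heavily on the size mix: a mailbox of mostly tiny messages lands far above 3x, while a few large attachments pull the average back toward it.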