From: Laurent W. <lw...@hy...> - 2010-06-14 08:45:52
On Thu, 10 Jun 2010 10:26:46 +0200
Michał Borychowski <mic...@ge...> wrote:

> > > On Thursday 03 June 2010, Laurent Wandrebeck wrote:
> > > On Thu, 3 Jun 2010 09:02:29 +0200
> > > - Do you know of any « big user » relying on mfs ? I've been able to find
> > > several for glusterfs for example, nothing for moosefs. Such entries would
> > > be nice on the website, and reassuring for potential users.
> > Well, I was pretty sure I saw a "Who's using" section on the website but I
> > can't find it. Indeed it would be nice to have one.
> [MB] No, it has not been yet created. We plan to implement it.

Nice.

>
> [MB] At our company (http://www.gemius.com) we have four deployments, the biggest has almost 30 million files distributed over 70 chunk servers having a total space of 570TiB. Chunkserver machines at the same time are used to make other calculations.

Do the chunkservers use the mfs volume they export via a local mount for their calculations ?

>
> [MB] Another big Polish company which uses MooseFS for data storage is Redefine (http://www.redefine.pl/).
>
> > > I've read that you have something like half a PB. We're up to 70TB,
> > > going to 200 in the next months. Are there any known limits, bottlenecks,
> > > loads that push systems/network on their knees ? We are processing satellite
> > > images, so I/O is quite heavy, and I'm worrying a bit about the behaviour
> > > during real processing load.
> [MB] You can have a look at this FAQ entry:
> http://www.moosefs.org/moosefs-faq.html#mtu

Thanks for the link. I've read it before, I was just wondering if there were any other recipes :)

>
> [MB] At our environment we use SATA disks and while making lots of additional calculations on chunkservers we even do not fully use the available bandwidth of the network. If you will use SAS disks it can happen that there would appear some problems we have not yet encountered.

We're 3ware+SATA everywhere here. So I guess it'll work.

>
> > > [ ... snip ... ]
> > > master failover is a bit tricky, which is really annoying for HA.
> >
> > That's probably a point for Gluster as it doesn't have a metadata server, but
> > actually there is a master (sort of) which is the one the clients connect to.
> >
> > If it goes away, there's a delay till another node becomes master, at least in
> > theory as I didn't test that part.
> [MB] You can also refer to this mini how-to:
> http://www.moosefs.org/mini-howtos.html#redundant-master
>
> and see how it is possible to create a fail proof solution using CARP.

Well, the only CARP setting I've done is for pfsense, and it's integrated (as in click, click, done:). MooseFS is especially sweet to configure and deploy. Not so for master failover :) Do you plan to enhance that point in an upcoming version so it becomes quick (and easy) to set up ? It'd be a really nice feature, and could push MooseFS into HA world. (I'll sketch below what I have in mind with ucarp.)

>
> > > [ ... snip ... ]
>
> Bigger files are divided into fragments of 64MB and each of them can be stored on different chunkservers. So there is a quite substantial probability that a big file with goal=1 will be unavailable (or at least its part(s)) if one of the chunks has been stored on the failed chunkserver.
>
> The general rule is to use goal=2 for normal files and goal=3 for files that are especially important to you.

Thanks for the clarification.
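For the record, if I read the manual pages right, applying that rule from any client mount should boil down to something like the following. This is only a sketch on my side (not run here yet), and /mnt/mfs plus the "critical" directory and file name are placeholders for whatever layout one uses:

  # two copies for everything already stored under the mount point
  mfssetgoal -r 2 /mnt/mfs
  # three copies for the data we really can't afford to lose
  mfssetgoal -r 3 /mnt/mfs/critical
  # sanity check: configured goal and actual number of chunk copies
  mfsgetgoal /mnt/mfs/critical
  mfscheckfile /mnt/mfs/critical/some_image.tif

If I understand correctly, new files inherit the goal of the directory they are created in, so this should only be needed once per top-level directory.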
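And to come back to the CARP point above: on Linux I suppose the mini how-to translates to ucarp plus a metalogger on the spare box, with the up-script promoting the spare when the virtual IP moves over. The following is an untested guess on my part; the addresses, vhid, password and script paths are made up, and the exact mfsmetarestore options may need tweaking:

  # run on both master candidates; mounts and chunkservers point at the
  # virtual IP 192.168.1.10 instead of the real master address
  ucarp --interface=eth0 --srcip=192.168.1.12 --vhid=42 --pass=secret \
        --addr=192.168.1.10 \
        --upscript=/etc/mfs/vip-up.sh --downscript=/etc/mfs/vip-down.sh

  # /etc/mfs/vip-up.sh on the spare box (mfsmetalogger already running there):
  #!/bin/sh
  ip addr add 192.168.1.10/24 dev eth0   # take over the virtual IP
  mfsmetarestore -a                      # rebuild metadata.mfs from the metalogger changelogs
  mfsmaster start                        # promote the spare to master

That is still a fair number of moving parts to get right in the middle of an outage, hence my question about something built in.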
Another point, I've « advertised » my rpm repo on the CentOS mailing list. Someone asked if MooseFS was backed by a company or if it was developed in spare time by freelance devs.
I know it was developed by Gemius for their internal needs, but now that it's been freed, does the company still back the software, or do the devs work on it in their spare time ?

Thanks,
-- 
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies
165 Avenue de Bretagne
59000 Lille, France
tel: +33 3 20 08 24 98
http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C