From: Quenten G. <QG...@on...> - 2012-04-21 10:47:58
Very true, ECC memory has dropped in price considerably over the years. I also think the embedded CPUs still aren't up to scratch performance-wise, which is what I forgot to mention before; however, building a system truly brick by brick (modular) may be another way to look at a solution.

Regards,
Quenten Grasso

-----Original Message-----
From: Martins Lazdans [mailto:mar...@vp...]
Sent: Saturday, 21 April 2012 6:55 PM
To: moo...@li...
Subject: Re: [Moosefs-users] Hardware choice

Nowadays ECC registered memory is cheaper than non-ECC desktop memory. You can get 2GB ECC REG modules for about $5-$7 each on eBay, depending on the quantity purchased.

On 2012.04.21. 7:18, Quenten Grasso wrote:
> Hi All,
>
> I've been thinking about this myself quite a bit in the last few months.
>
> MooseFS is very similar to Apache HDFS, which uses 128MB chunks instead of 64MB, and a single metadata server with metaloggers etc.
>
> I mention this because I've been investigating how the likes of Google, Yahoo etc. have been setting up storage and compute clusters.
>
> What I've found is very interesting. For example, Yahoo use 4 x hard disks, 2 x quad-core CPUs and 2 x 1GbE per node.
> They have up to 3500 nodes per cluster, which I think is a very interesting way of truly distributing their workload.
>
> Why 2 x quad-core CPUs? Well, they also use MapReduce (which is basically a distributed compute platform; think of the "SETI" or "Folding@home" projects).
>
> So what I've basically found is that "less is more". MFS/HDFS is always "limited" to the write speed of a single disk per process, which may sound slow to some, but at scale it is a pretty impressive distributed platform if you think about it. So you're limited to around 50-60MB/s write per disk, and reads should scale with your replica level (give or take a bit).
>
> So I've settled on an idea: why not commoditise the storage nodes further and build them as cheap as possible without sacrificing too much in the way of reliability? E.g. still use ECC memory, or maybe we can build enough safeguards into MFS to not even "require" ECC memory in the storage nodes?
>
> I think separating storage from compute has some significant benefits, as does combining the two, so this is always left up to the individual. But for the sake of what I'm trying to do here, I'll separate storage from compute in this example.
>
> Using the new Rack/Zone method you could build cheaper storage nodes with single power supplies, and by using 2 x 1GbE instead of 10GbE or InfiniBand you can save yourself some money without sacrificing reliability or performance. So my idea was to follow Yahoo's example and build 30 nodes with single power supplies, around 4 or 8GB of RAM and 4 hard disks per node.
>
> For example, if you have 20 nodes, 3 x replication and A & B power in your site, you would only need to put 10 in Zone 1 and 10 in Zone 2, set a replica level of 3, and you'll always have access to your data. As long as your metadata servers have dual power supplies & ECC memory you should be perfect.
>
> Using this method we may be able to use something like a low-power Mini-ITX board with ECC memory and an integrated CPU, ideally with built-in software KVM/monitoring access similar to Supermicro's motherboards.
>
> So what do you all think of this? I always welcome any input =)
>
> Regards,
> Quenten Grasso
>
>
> -----Original Message-----
> From: Atom Powers [mailto:ap...@di...]
> Sent: Saturday, 21 April 2012 1:58 AM
> To: moo...@li...
> Subject: Re: [Moosefs-users] Hardware choice
>
> On 04/20/2012 04:09 AM, Chris Picton wrote:
>> I was looking at Supermicro chassis and found the following chassis
>> types which seem to offer the highest density:
>>
>> 2U: 12x 3.5" http://www.supermicro.com/products/chassis/2U/?chs=827
>>
>> Does anyone have feedback on Supermicro/these chassis?
>
> I use a lot of SuperMicro kit here. It performs well, is very reliable,
> and at the right price. (I buy from http://www.siliconmechanics.com/)
>
> I have three of the above chassis and a couple of older 8-bay systems in my
> cluster. Because Moose is so good at dealing with system failure but
> slow to re-balance chunks, I would recommend several "smaller" capacity
> servers over a few very large ones. Even at 10TB per server it takes a
> very long time to re-balance when I add or remove a system from the
> cluster; I would avoid going over about 10TB per server. Less is more in
> this case.

_______________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users
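The throughput and rebalance figures quoted above can be sanity-checked with a quick back-of-envelope sketch. This is only a rough model under assumptions taken from the thread, not measurements: ~55 MB/s sustained write per disk, 2 x 1GbE (~125 MB/s each) per node, 30 nodes, 4 disks per node, replica level 3, and a 10TB node draining over its own NICs during a rebalance.

```python
# Back-of-envelope numbers for the cluster sketched in this thread.
# All constants are assumptions from the discussion, not benchmarks.

DISK_WRITE_MBS = 55        # assumed single-disk sequential write, MB/s
NIC_MBS = 2 * 125          # 2 x 1GbE, ~125 MB/s each
NODES = 30
DISKS_PER_NODE = 4
REPLICAS = 3

# Per-node write ceiling: whichever is lower, the disks or the network.
node_write = min(DISKS_PER_NODE * DISK_WRITE_MBS, NIC_MBS)

# Aggregate client-visible write throughput: every byte written by a
# client is stored REPLICAS times somewhere in the cluster.
cluster_write = NODES * node_write / REPLICAS

# Rough time to drain a 10TB server over its own NICs (10 TB = 10e6 MB),
# ignoring per-chunk scheduling overhead, so this is a lower bound.
rebalance_hours = 10e6 / NIC_MBS / 3600

print(f"per-node write ceiling: {node_write} MB/s")
print(f"aggregate write (at {REPLICAS}x replication): {cluster_write:.0f} MB/s")
print(f"10TB rebalance, lower bound: {rebalance_hours:.1f} h")
```

Even this optimistic lower bound of roughly half a day per 10TB server supports the "several smaller servers" recommendation: real rebalances are chunk-by-chunk and considerably slower than a raw network drain.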