From: Schmurfy <sch...@gm...> - 2012-04-19 14:24:14
Hi,

I did some tests with MooseFS on VMs and "standard" machines (and I really love this project!), but now I need to decide on some rackable hardware to install in a datacenter, and that's where things become annoying since I have no idea what to choose :(

I was thinking of starting with two nodes, one of them being the master as well as a chunkserver, and the other a backup master and a chunkserver too. I suppose both servers will need a decent amount of memory and a not-too-slow CPU, but they don't require a high-end multi-core processor, and of course they will need some disks.

I think we will start with less than 10 TB on the cluster and a replication goal set to 2.

We are currently using some ProLiant DL160 G6 servers, which have 4 cores running at 2.0 GHz, but judging by the theoretical needs of our MooseFS machines I think these servers are way too powerful for this usage.

Can anyone give me some advice on what would be a good start?

Julien Ammous
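A quick back-of-the-envelope sketch of the raw disk that setup implies, assuming goal 2 simply stores every chunk twice and allowing some free-space headroom (the 20% headroom figure is an assumption, not anything MooseFS prescribes):

# Rough raw-capacity estimate for a goal=2 cluster (illustrative numbers only)
usable_tb = 10          # planned data on the cluster
goal = 2                # every chunk stored twice
headroom = 0.20         # assumed free-space margin so rebalancing has room to work

raw_tb = usable_tb * goal / (1 - headroom)
per_server_tb = raw_tb / 2   # split across the two proposed chunkservers

print(f"raw capacity needed: {raw_tb:.1f} TB, about {per_server_tb:.1f} TB per server")
# -> raw capacity needed: 25.0 TB, about 12.5 TB per server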
From: Andreas H. <ah...@it...> - 2012-04-20 10:10:44
Schmurfy <sch...@gm...> writes:

> We are currently using some ProLiant DL160 G6 servers, which have 4 cores running
> at 2.0 GHz, but judging by the theoretical needs of our MooseFS machines I think
> these servers are way too powerful for this usage.

We use the same machines at 3 GHz (Intel Xeon X3470 @ 2.93 GHz, 16 GB RAM). You are right, CPU usage is barely noticeable. To make better use of the resources we also put some virtual machines (web server, Kerberos, AFS DB) on those nodes.

Best regards,
Andreas

--
Andreas Hirczy <ah...@it...>                        http://itp.tugraz.at/~ahi/
Graz University of Technology                        phone: +43/316/873-8190
Institute of Theoretical and Computational Physics   fax: +43/316/873-10 8190
Petersgasse 16, A-8010 Graz                          mobile: +43/664/859 23 57
From: Wang J. <jia...@re...> - 2012-04-20 10:45:28
On 2012/4/19 22:23, Schmurfy wrote:

> Can anyone give me some advice on what would be a good start?

Dell C5220 in a C5000 chassis? The downside is that you can only use 2.5" HDDs/SSDs in this setup, but with 4 HDD/SSD slots per node and 8 nodes it can already reach 32 TB of capacity at a good price/performance ratio, and you can save a lot on hosting.
From: Chris P. <ch...@na...> - 2012-04-20 11:28:06
I am in the same situation as the OP - I have a cluster of 6 machines (5 HDDs each) running in test, and am thinking about what I would need in production.

I was looking at Supermicro chassis and found the following chassis types, which seem to offer the highest density:

1U: 10x 2.5"  http://www.supermicro.com/products/chassis/1U/?chs=116
2U: 24x 2.5"  http://www.supermicro.com/products/chassis/2U/?chs=216
2U: 12x 3.5"  http://www.supermicro.com/products/chassis/2U/?chs=827
4U: 88x 2.5"  http://www.supermicro.com/products/chassis/4U/417/SC417E26-R1400U.cfm

Does anyone have feedback on Supermicro / these chassis?

Chris

On Fri, 2012-04-20 at 18:45 +0800, Wang Jian wrote:

> Dell C5220 in a C5000 chassis? The downside is that you can only use 2.5" HDDs/SSDs
> in this setup, but with 4 HDD/SSD slots per node and 8 nodes it can already reach
> 32 TB of capacity at a good price/performance ratio, and you can save a lot on hosting.

--
CHRIS PICTON
Executive Manager: Systems
Cell: +27 (0)79 721 8521
Email: ch...@na...
Tel: +27 (0)10 590 0031
Fax: +27 (0)87 941 0813
Rosebank Terrace, 23 Sturdee Avenue, Rosebank
www.nashua-ecn.com
"Lowering the cost of doing business"
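For a rough raw-terabytes-per-rack-unit comparison of those four options (a sketch only; the per-drive capacities are assumptions about typical 2012 drives, not specifications of these chassis):

# Raw TB per rack unit for the chassis listed above (drive sizes are assumptions:
# roughly 1 TB for 2.5" and 3 TB for 3.5" drives)
chassis = [
    ("1U, 10x 2.5in", 1, 10, 1.0),
    ("2U, 24x 2.5in", 2, 24, 1.0),
    ("2U, 12x 3.5in", 2, 12, 3.0),
    ("4U, 88x 2.5in", 4, 88, 1.0),
]
for name, rack_units, bays, tb_per_drive in chassis:
    print(f"{name}: {bays * tb_per_drive / rack_units:.0f} TB per U (raw)")
# -> 10, 12, 18 and 22 TB per U respectively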
From: Atom P. <ap...@di...> - 2012-04-20 15:58:15
On 04/20/2012 04:09 AM, Chris Picton wrote:

> I was looking at Supermicro chassis and found the following chassis
> types, which seem to offer the highest density:
>
> 2U: 12x 3.5"  http://www.supermicro.com/products/chassis/2U/?chs=827
>
> Does anyone have feedback on Supermicro / these chassis?

I use a lot of SuperMicro kit here. It performs well, is very reliable, and at the right price. (I buy from http://www.siliconmechanics.com/)

I have three of the above chassis and a couple older 8-bay systems in my cluster. Because Moose is so good at dealing with system failure but slow to re-balance chunks, I would recommend several "smaller" capacity servers over a few very large ones. Even at 10 TB per server it takes a very long time to re-balance when I add or remove a system from the cluster; I would avoid going over about 10 TB per server. Less is more in this case.

--
Perfection is just a word I use occasionally with mustard.
--Atom Powers--
Director of IT
DigiPen Institute of Technology
+1 (425) 895-4443
From: Quenten G. <QG...@on...> - 2012-04-21 04:19:13
Hi All,

I've been thinking about this myself quite a bit over the last few months.

MooseFS is very similar to Apache HDFS, which uses 128 MB chunks instead of 64 MB, and a single metadata server with metaloggers etc. I mention this because I've been investigating how the likes of Google, Yahoo etc. have been setting up storage and compute clusters.

What I've found is very interesting. For example, Yahoo use 4 x hard disks, 2 x quad-core CPUs and 2 x 1 GbE per node, and they have up to 3500 nodes per cluster, which I think is a very interesting way of truly distributing their workload. Why 2 x quad-core CPUs? Well, they also use MapReduce (which is basically a distributed compute platform; think of the "SETI" or "Folding@home" projects).

So what I've basically found is that "less is more". MFS/HDFS is always "limited" to the write speed of a single disk per process, which may sound slow to some, but at scale it is a pretty impressive distributed platform if you think about it: you're limited to around 50-60 MB/s write per disk, and reads should scale with your replica levels (give or take a bit).

So I've settled on an idea: why not commoditise the storage nodes further and build them as cheap as possible without sacrificing too much in the way of reliability, e.g. still use ECC memory - or maybe we can build enough safeguards into MFS to not even "require" ECC memory in the storage nodes?

I think separating storage from compute has some significant benefits, as does combining the two, so this one is always left up to the individual. But for the sake of what I'm trying to do here, I'll separate storage from compute in this example.

Using the new rack/zone method you could build cheaper storage nodes with single power supplies, and by using 2 x 1 GbE instead of 10 GbE or InfiniBand you can save yourself some money without sacrificing reliability or performance. So my idea was to use Yahoo's example and build 30 nodes with single power supplies, around 4 or 8 GB of RAM and 4 hard disks per node.

For example, if you have 20 nodes, 3 x replication and A & B power in your site, you would only need to put 10 in zone 1 and 10 in zone 2, set a replica level of 3, and you'll always have access to your data. As long as your metadata servers have dual power supplies & ECC memory you should be fine.

Using this method we may be able to use something like a low-power Mini-ITX board with ECC memory and an integrated CPU, ideally with built-in software KVM/monitoring access similar to Supermicro's motherboards.

So what do you all think of this? I always welcome any input =)

Regards,
Quenten Grasso
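A rough sketch of the aggregate numbers behind that 30-node idea, using the 50-60 MB/s per-disk figure from above; the 2 TB drive size and the assumption that writes scale linearly with disk count are illustrative guesses:

# Back-of-the-envelope numbers for a 30-node, 4-disk-per-node cluster at goal 3
nodes = 30
disks_per_node = 4
tb_per_disk = 2.0          # assumed drive size
write_mb_s_per_disk = 55   # middle of the 50-60 MB/s range mentioned above
goal = 3                   # three copies of every chunk

raw_tb = nodes * disks_per_node * tb_per_disk
usable_tb = raw_tb / goal
aggregate_write_mb_s = nodes * disks_per_node * write_mb_s_per_disk

print(f"raw: {raw_tb:.0f} TB, usable at goal {goal}: {usable_tb:.0f} TB")
print(f"theoretical aggregate write bandwidth: {aggregate_write_mb_s / 1000:.1f} GB/s")
# -> raw: 240 TB, usable at goal 3: 80 TB
# -> ~6.6 GB/s across the cluster, before replication overhead and the
#    2 x 1 GbE per-node network cap (~250 MB/s per node)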
From: Martins L. <mar...@vp...> - 2012-04-21 09:14:03
Nowadays ECC registered memory is cheaper than non-ECC desktop memory. You can get 2 GB ECC REG modules for about $5-$7 each on eBay, depending on the amount purchased.

On 2012.04.21. 7:18, Quenten Grasso wrote:

> So I've settled on an idea: why not commoditise the storage nodes further and build
> them as cheap as possible without sacrificing too much in the way of reliability,
> e.g. still use ECC memory - or maybe we can build enough safeguards into MFS to not
> even "require" ECC memory in the storage nodes?
From: Quenten G. <QG...@on...> - 2012-04-21 10:47:58
Very true, ECC memory has dropped in price considerably over the years.

Also, I think embedded CPUs still aren't really up to scratch performance-wise, which is what I forgot to mention before; however, I think building a system truly brick by brick (modular) may be another way to look at a solution.

Regards,
Quenten Grasso

-----Original Message-----
From: Martins Lazdans [mailto:mar...@vp...]
Sent: Saturday, 21 April 2012 6:55 PM
To: moo...@li...
Subject: Re: [Moosefs-users] Hardware choice

Nowadays ECC registered memory is cheaper than non-ECC desktop memory. You can get 2 GB ECC REG modules for about $5-$7 each on eBay, depending on the amount purchased.
From: Steve T. <sm...@cb...> - 2012-04-21 11:09:33
On Fri, 20 Apr 2012, Atom Powers wrote:

> Because Moose is so good at dealing with system failure but slow to
> re-balance chunks I would recommend several "smaller" capacity servers
> over a few very large ones. Even at 10 TB per server it takes a very long
> time to re-balance when I add or remove a system from the cluster; I
> would avoid going over about 10 TB per server. Less is more in this case.

I have to agree with this. I have two chunkservers with 25 TB of storage, as well as several smaller chunkservers, and I recently removed 20 TB of disk from one of them (about 70% full). It took a little over 10 days to replicate the removed chunks.

Steve
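Working that experience backwards gives a feel for the effective replication rate (the 70%-full and 10-day figures are from the post above; treating "a little over 10 days" as exactly 10 is a simplification):

# Effective re-replication rate implied by the numbers above
removed_tb = 20
fill_ratio = 0.70                            # the removed disks were about 70% full
days = 10                                    # "a little over 10 days", rounded down

data_tb = removed_tb * fill_ratio            # ~14 TB actually had to move
rate_mb_s = data_tb * 1e6 / (days * 86400)   # 1 TB taken as 1e6 MB

print(f"~{data_tb:.0f} TB re-replicated at roughly {rate_mb_s:.0f} MB/s sustained")
# -> ~14 TB at roughly 16 MB/s, well below the speed of a single disk,
#    which is why smaller chunkservers are quicker to drain or rebuild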
From: Allen, B. S <bs...@la...> - 2012-04-23 19:42:06
I agree with this approach as well for most general-purpose deployments, unless you're trying to optimize for single-stream performance, in which case I'd suggest going back to the large storage node approach, with RAID sets instead of showing MFS each spindle.

For example, I'm using a handful of SuperMicro's 36-drive chassis, 10 GbE, with an Illumos-based OS and ZFS RAIDZ2 sets, ending up with 43 TB per node. I present MFS with a single HD that is comprised of 4 x 8-drive RAIDZ2 per node. It's going to be extremely painful when I need to migrate data off a node, but I get good single-stream performance.

I could likely better balance single-stream performance against time to migrate by going to 16-drive chassis or similar; however, overall cost per TB would increase as you're now buying more CPUs, etc. You could drop down to a single socket, less RAM per node, and so on to offset this extra cost. Like everything, it's a trade-off.

Ben

On Apr 21, 2012, at 5:09 AM, Steve Thompson wrote:

> I have to agree with this. I have two chunkservers with 25 TB of storage, as well
> as several smaller chunkservers, and I recently removed 20 TB of disk from one of
> them (about 70% full). It took a little over 10 days to replicate the removed chunks.
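A rough sketch of how that 43 TB per node plausibly breaks down; the 2 TB per-drive capacity is an assumption, and ZFS metadata reservations are ignored:

# How a 36-bay chassis with 4 x 8-drive RAIDZ2 plausibly lands at ~43 TB usable
# (assumptions: 2 TB drives; 32 bays in the RAIDZ2 sets, the rest spares/OS)
vdevs = 4
drives_per_vdev = 8
parity_per_vdev = 2                              # RAIDZ2 uses two parity drives per vdev
drive_tb = 2.0                                   # assumed drive size, decimal TB

data_drives = vdevs * (drives_per_vdev - parity_per_vdev)   # 24 drives of data
usable_tb = data_drives * drive_tb                          # 48 TB decimal
usable_tib = usable_tb * 1e12 / 2**40                       # ~43.7 TiB, close to the 43 TB/node reported

print(f"{data_drives} data drives -> {usable_tb:.0f} TB decimal, ~{usable_tib:.1f} TiB")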