From: Steve T. <sm...@cb...> - 2012-04-03 19:56:09
OK, so now you have a nice and shiny and absolutely massive MooseFS file system. How do you back it up?

I am using Bacula and divide the MFS file system into separate areas (eg directories beginning with a, those beginning with b, and so on) and use several different chunkservers to run the backup jobs, on the theory that at least some of the data is local to the backup process. But this still leaves the vast majority of data to travel the network twice (a planned dedicated storage network has not yet been implemented). This results in pretty bad backup performance and high network load. Any clever ideas?

Steve

--
----------------------------------------------------------------------------
Steve Thompson, Cornell School of Chemical and Biomolecular Engineering
smt AT cbe DOT cornell DOT edu
"186,282 miles per second: it's not just a good idea, it's the law"
----------------------------------------------------------------------------
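A minimal sketch of the per-prefix split described above, written as a Bacula FileSet/Job pair; the mount point, client name, schedule, storage and pool are illustrative assumptions rather than details of this setup:

# bacula-dir.conf fragment: one FileSet per top-level area, so separate
# Jobs can run their File Daemons on different chunkservers in parallel.
FileSet {
  Name = "mfs-prefix-a"
  Include {
    Options {
      signature = MD5
    }
    File = /mnt/mfs/a        # one area of the namespace (assumed mfsmount path)
  }
}

Job {
  Name = "backup-mfs-a"
  Type = Backup
  Client = chunkserver1-fd   # assumed: point this job at a chunkserver's File Daemon
  FileSet = "mfs-prefix-a"
  Schedule = "WeeklyCycle"   # assumed schedule, storage and pool names
  Storage = File
  Pool = Default
  Messages = Standard
}

Repeating the pair for /mnt/mfs/b, /mnt/mfs/c, and so on gives one job per area, each of which can be pointed at a different chunkserver's File Daemon.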
From: Atom P. <ap...@di...> - 2012-04-03 20:19:03
I've been thinking about this for a while and I think Occam's razor (the simplest idea is the best) might provide some guidance.

MooseFS is fault-tolerant, so you can mitigate "hardware failure".
MooseFS provides a trash space, so you can mitigate "accidental deletion" events.
MooseFS provides snapshots, so you can mitigate "corruption" events.

The remaining scenario, "somebody stashes a nuclear warhead in the locker room", requires off-site backup. If "rack awareness" were able to guarantee chunks in multiple locations, then that would mitigate this event. Since it can't, I'm going to be sending data off-site using a large LTO5 tape library managed by Bacula on a server that also runs mfsmount of the entire system.

On 04/03/2012 12:56 PM, Steve Thompson wrote:
> OK, so now you have a nice and shiny and absolutely massive MooseFS file
> system. How do you back it up?

--
--
Perfection is just a word I use occasionally with mustard.
--Atom Powers--
Director of IT
DigiPen Institute of Technology
+1 (425) 895-4443
From: Allen, B. S <bs...@la...> - 2012-04-03 21:16:57
Similar plan here.

I have a dedicated server for MFS backup purposes. We're using IBM's Tivoli to push to a large GPFS archive system backed with a SpectraLogic tape library. I have the standard Linux Tivoli client running on this host. One key with Tivoli is to use the DiskCacheMethod, and to set the disk cache to be somewhere on local disk instead of the root of the mfs mount.

Also, I back up mfsmaster's files every hour and retain at least a week of these backups. Of the various horror stories we've heard on this mailing list, all have been from corrupt metadata files from mfsmaster. It's a really good idea to limit your exposure to this. For good measure I also back up metalogger's files every night.

One dream for backup of MFS is to somehow utilize the metadata files dumped by mfsmaster or metalogger to do a metadata "diff". The goal of this process would be to produce a list of all objects in the filesystem that have changed between two metadata.mfs.back files. You could then feed your backup client a list of files, without the client needing to inspect the filesystem itself. This idea is inspired by ZFS's diff functionality, where ZFS can show the changes between a snapshot and the live filesystem.

Ben

On Apr 3, 2012, at 2:18 PM, Atom Powers wrote:
> The remaining scenario, "somebody stashes a nuclear warhead in the
> locker room", requires off-site backup.
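A minimal sketch of the hourly metadata copy with roughly a week of retention, meant to run from cron on the master host; the /var/lib/mfs source path and the destination directory are assumptions, not details of Ben's setup:

#!/usr/bin/env python3
"""Hourly copy of the mfsmaster metadata dump, pruning copies older than a week."""
import shutil
import time
from pathlib import Path

SRC = Path("/var/lib/mfs/metadata.mfs.back")   # assumed mfsmaster data directory
DST_DIR = Path("/backup/mfs-metadata")         # assumed destination (ideally another disk or host)
RETAIN_SECONDS = 7 * 24 * 3600                 # keep roughly one week of hourly copies

def main() -> None:
    DST_DIR.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    shutil.copy2(SRC, DST_DIR / ("metadata.mfs.back." + stamp))

    cutoff = time.time() - RETAIN_SECONDS
    for old in DST_DIR.glob("metadata.mfs.back.*"):
        if old.stat().st_mtime < cutoff:       # prune copies past the retention window
            old.unlink()

if __name__ == "__main__":
    main()

Driven by a crontab entry such as "0 * * * * /usr/local/sbin/mfs-meta-backup.py" (the script path is hypothetical); copying the destination directory on to a second host limits the exposure further.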
From: Quenten G. <QG...@on...> - 2012-04-03 21:36:55
Hi All,

How large are your metadata & logs at this stage? Just trying to mitigate this exact issue myself.

I was planning to create hourly snapshots (as I understand the way they are implemented, they don't affect performance, unlike a VMware snapshot; please correct me if I'm wrong) and copy these offsite to another MFS cluster using rsync, with snapshots on the other site, with maybe a goal of 2 at most there and a goal of 3 on site.

I guess the big issue here is storing our data 5 times in total vs. tapes. However, I guess it would be "quicker" to recover from a "failure" having a running cluster on site B vs. a tape backup, and, dare I say it, (possibly) more reliable than a single tape and tape library.

Also, I've been tossing up the idea of using ZFS for storage. The reason I say this is that I know MFS has checksumming (a la ZFS) and all that good stuff built in; however, having to store our data 3 times plus 2 times is expensive. Maybe storing it 2+1 instead would work out at scale, by using the likes of ZFS for reliability and MFS purely for availability, instead of for both reliability and availability.

It would be great if there were a way to use some kind of rack awareness to say "at all times keep a goal of 1 or 2 of the data offsite on our 2nd MFS cluster". When I was speaking to one of the staff of the MFS support team they mentioned something like this was being developed for another customer, so we may see some kind of solution?

Quenten

-----Original Message-----
From: Allen, Benjamin S [mailto:bs...@la...]
Sent: Wednesday, 4 April 2012 7:17 AM
To: moo...@li...
Subject: Re: [Moosefs-users] Backup strategies

> Similar plan here.
>
> I have a dedicated server for MFS backup purposes.
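A rough sketch of the hourly snapshot-and-ship idea being discussed, assuming an mfsmount at /mnt/mfs and a hypothetical off-site rsync target; the paths, host name, and snapshot naming are illustrative only:

#!/usr/bin/env python3
"""Hourly MooseFS snapshot followed by an rsync of that snapshot off-site (sketch)."""
import subprocess
import time

MFS_DATA = "/mnt/mfs/data"                    # assumed live data directory on the mfsmount
SNAP_DIR = "/mnt/mfs/snapshots"               # assumed snapshot area on the same mount
REMOTE = "backup@dr-site:/srv/mfs-mirror/"    # hypothetical off-site rsync target

def main() -> None:
    snap = "%s/data-%s" % (SNAP_DIR, time.strftime("%Y%m%d-%H"))
    # mfsmakesnapshot makes a lazy, copy-on-write style copy within the same MooseFS instance.
    subprocess.run(["mfsmakesnapshot", MFS_DATA, snap], check=True)
    # Only changed files cross the WAN; --delete keeps the mirror consistent with the snapshot.
    subprocess.run(["rsync", "-a", "--delete", snap + "/", REMOTE], check=True)

if __name__ == "__main__":
    main()

As Atom notes below, each snapshot also duplicates metadata on the master, so snapshot retention needs its own capacity planning.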
From: Atom P. <ap...@di...> - 2012-04-03 21:49:36
On 04/03/2012 02:36 PM, Quenten Grasso wrote:
> I was planning to create hourly snapshots (as I understand the way
> they are implemented they don't affect performance unlike a vmware
> snapshot please correct me if I'm wrong) and copy these offsite to
> another mfs/cluster using rsync w/ snapshots on the other site with
> maybe a goal of 2 at most and using a goal of 3 on site.

Although snapshots don't increase the amount of storage used by the system, they effectively double the amount of metadata. For even medium-sized systems, making a snapshot of the complete system may actually decrease the security of the system by introducing problems with the amount of RAM and disk used by the metamaster. On my system, with about 7 million files and 3GB of metadata, doing a daily snapshot for a week requires some 22GB+ of additional RAM in the metamaster and metalogger.

In other words, just because you /can/ do snapshots doesn't mean you can do them without careful capacity planning. (And based on the number of people having issues with their metamaster, I am very hesitant to recommend that strategy.)

--
--
Perfection is just a word I use occasionally with mustard.
--Atom Powers--
Director of IT
DigiPen Institute of Technology
+1 (425) 895-4443
From: Allen, B. S <bs...@la...> - 2012-04-03 23:13:15
Quenten,

I'm using MFS with ZFS. I use ZFS for RAIDZ2 (RAID6) and hot sparing on each chunkserver, and then only set a goal of 2 in MFS. I also have a "scratch" directory within MFS that is set to goal 1 and not backed up to tape. I attempt to get my users to organize their data between their data directory and scratch to minimize goal overhead for data that doesn't require it.

Overhead of my particular ZFS setup is ~15% lost to parity and hot spares, although I was a bit bold with my RAIDZ2 configuration, which will make rebuild times quite long in trade-off for lower overhead. This was done with the knowledge that RAIDZ2 can withstand two drive failures, and MFS would have another copy of the data on another chunkserver. I have not, however, tested how well MFS handles a ZFS pool degraded with data loss. I'm guessing I would take the chunkserver daemon offline, get the ZFS pool into a rebuilding state, and restart the CS. I'm guessing the CS will see missing chunks, mark them undergoal, and re-replicate them.

A more cautious RAID set would be closer to 30% overhead. Then of course with goal 2 you lose another 50%.

A side benefit of using ZFS is on-the-fly compression and de-dup on your chunkserver, plus an L2ARC SSD read cache (although it turns out most of my cache hits are from the L1ARC, i.e. memory); to speed up writes you can add a pair of ZIL SSDs.

For disaster recovery you always need to be extra careful when relying on a single system to do your live and DR sites. In this case you're asking MFS to push data to another site. You'd then be relying on a single piece of software that could equally corrupt your live site and your DR site.

Ben

On Apr 3, 2012, at 3:36 PM, Quenten Grasso wrote:
> How large is your metadata & logs at this stage? Just trying to mitigate
> this exact issue myself.
From: Quenten G. <QG...@on...> - 2012-04-04 07:34:42
Hi Ben, Moose users,

Thanks for your reply. I've been thinking about using ZFS; as I understand it, some of the benefits of ZFS worth leveraging are data corruption prevention (checksumming of data via scrubs) and compression. As I understand it, MFS has had checksumming built in for a while now, from the MFS FUSE mount (across the network) all the way down to the disk level, so whenever we access data it is checksummed, which in itself is great. This means we don't "need" to use RAID controllers for data protection, and if we use a goal of 2 or more we get redundancy and data protection for little extra space.

I've done some basic math on using ZFS, for example 4 chunkservers with 8 drives each, using 2TB drives, comparing ZFS RAIDZ2 with a goal of 2 vs. single disks with a goal of 3:

ZFS RAIDZ2 + goal 2 = 24 usable TB
Single disks + goal 3 = 21 usable TB

So clearly there is a space saving here of around 3 TB using ZFS.

Reliability: with the ZFS configuration, if more than 1 physical server fails, or 5 disks across 2 chassis fail at the same time without being replaced in time, our cluster is offline. Versus goal: with a goal of 3, if more than 2 disks holding the same data set fail across the total number of servers, or more than 2 physical servers fail at any one time, our cluster is effectively offline; keeping in mind the chances of this happening would have to be pretty low as you increase the number of servers and drives.

Speed: raw speed of a single SATA disk is around 75 IOPS and around 100 MB/s throughput. With RAIDZ2 I imagine we would achieve the speed of 6 of the 8 disks, being 450 IOPS or 600 MB/s per server. With a goal of 3, we would achieve a write of 75 IOPS and ~100 MB/s per server. For single threads I think the ZFS system should certainly have faster throughput; however, with multiple threads I think the multiple paths in and out with a goal of 3 would win.

At this stage it always seems like a trade-off: reliability or performance, pick one? Reviewing these examples, the middle solution would be RAIDZ1 with a goal of 3; this would be the closest we could get on both performance and redundancy.

This changes again when we look at scale. Now let's expand to 40-80 servers using RAIDZ2: with 40 servers, each as a single volume, and a goal of 3, which 2 of the 40 servers could fail at any one time without me losing access to any data? The chunks are effectively "randomly" placed among the cluster, so I guess we would need to increase the overall goal, increasing space usage once again, for reliability... And with a non-RAID/ZFS setup of 40 servers / 320 hard disks, 3 of which have my data on them, which 2 can fail without me losing access to my data? :)

So I guess this raises a few more questions about which solution is the most effective. In the case of the ZFS RAIDZ2/RAIDZ1 solutions, what becomes the acceptable ratio of servers to goal from a reliability point of view? Or, using individual disks plus goal, does scaling the number of servers give us an increase in performance at the cost of reliability? Also, from a performance point of view, the higher the goal the more throughput; however, this may work against us if the cluster is "very busy" across all of the servers. So I guess we are back to where we started: we still have to pick one, performance or reliability?

Any thoughts? Also thanks for reading, if you made it :)

Regards,

Quenten Grasso

-----Original Message-----
From: Allen, Benjamin S [mailto:bs...@la...]
Sent: Wednesday, 4 April 2012 9:13 AM
To: Quenten Grasso
Cc: moo...@li...
Subject: Re: [Moosefs-users] Backup strategies

> I'm using MFS with ZFS. I use ZFS for RAIDZ2 (RAID6) and hot sparing on
> each chunkserver, and then only set a goal of 2 in MFS.
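For reference, the capacity comparison above works out as follows; this is a sketch of the arithmetic only, ignoring filesystem overhead and hot spares:

# Capacity comparison: 4 chunkservers, 8 x 2 TB drives each.
servers, drives_per_server, tb_per_drive = 4, 8, 2
raw_tb = servers * drives_per_server * tb_per_drive                  # 64 TB raw

# Option A: RAIDZ2 per server (6 of 8 drives hold data), MFS goal = 2.
raidz2_usable = servers * (drives_per_server - 2) * tb_per_drive     # 48 TB after parity
print("RAIDZ2 + goal 2 :", raidz2_usable / 2, "TB")                  # 24.0 TB

# Option B: plain disks per server, MFS goal = 3.
print("plain disks + goal 3 :", round(raw_tb / 3, 1), "TB")          # ~21.3 TB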
From: Wang J. <jia...@re...> - 2012-04-04 05:25:08
For disasters such as earthquake, fire, and flood, off-site backup is a must-have, and any RAID-level solution is sheer futility.

As Atom Powers said, MooseFS should provide an off-site backup mechanism. Months ago, my colleague Ken Shao sent in some patches to provide a "class"-based goal mechanism, which enables us to define different "classes" to differentiate physical locations and back data up in other physical locations (i.e. 500 km - 1000 km away).

The design principles are:

1. We can afford to lose some data between the backup point and the disaster point. In this case, old data or old versions of data are intact; new data or new versions of data are lost.
2. Because cluster-to-cluster backup has many drawbacks (performance, consistency, etc.), the duplication from one location to another should happen within a single cluster.
3. Location-to-location duplication should not happen at write time, or performance/latency is hurt badly. So the goal recovery mechanism can and should be used (CS-to-CS duplication). And to improve bandwidth efficiency and avoid peak load times, duplication can be controlled on a schedule, and a dirty/delta algorithm should be used.
4. Metadata should be logged to the backup site. When disaster happens, the backup site can be promoted to master site.

The current rack awareness implementation is not the thing we are looking for. Seriously speaking, as 10Gb Ethernet connections get cheaper and cheaper, traditional rack awareness is rendered useless.

On 2012/4/4 7:13, Allen, Benjamin S wrote:
> Quenten,
>
> I'm using MFS with ZFS. I use ZFS for RAIDZ2 (RAID6) and hot sparing on
> each chunkserver, and then only set a goal of 2 in MFS.
From: Ken <ken...@gm...> - 2012-04-04 07:21:27
more detail here:
http://sourceforge.net/mailarchive/message.php?msg_id=28664530

-Ken

On Wed, Apr 4, 2012 at 1:24 PM, Wang Jian <jia...@re...> wrote:
> For disasters such as earthquake, fire, and flood, off-site backup is a
> must-have, and any RAID-level solution is sheer futility.
>
> As Atom Powers said, MooseFS should provide an off-site backup mechanism.
From: Quenten G. <QG...@on...> - 2012-04-04 11:00:33
Very interesting, Ken ☺

Just thinking out loud: it would be great to see some kind of SSD integration, so it was possible to have multiple tiers of storage as well. The SSDs would be useful for "gold" virtual machine images / data that needs to be accessed or written fast, with maybe a replica kept on standard SATA disks. Instead of using ZFS's caching, we could use a couple of cheaper SSDs in each node to act as caches, similarly to how the high-end gear (NetApp etc.) works these days.

I agree the metadata server is the likely bottleneck, but we are also planning to try to store our data inside large files (virtual machine images). I guess I'm not quite sure whether the performance of any distributed file system in general is where I expect yet, even using 10GbE or 20Gbps InfiniBand/IPoIB.

I found this an interesting read, imo:
http://forums.gentoo.org/viewtopic-p-6875454.html?sid=d4299cd9365550ac3940a0f8a5beff46

Regards,

Quenten Grasso

From: Ken [mailto:ken...@gm...]
Sent: Wednesday, 4 April 2012 5:21 PM
To: Wang Jian
Cc: moo...@li...; Quenten Grasso
Subject: Re: [Moosefs-users] Backup strategies

> more detail here:
> http://sourceforge.net/mailarchive/message.php?msg_id=28664530
From: Steve W. <st...@pu...> - 2012-04-04 17:29:50
On 04/03/2012 03:56 PM, Steve Thompson wrote:
> OK, so now you have a nice and shiny and absolutely massive MooseFS file
> system. How do you back it up?

We have four 22TB and one 14TB MooseFS volumes that we back up onto disk-based backup servers. We used to use rsnapshot but now we use rsync in combination with ZFS snapshots. Each evening before our backup run, we take a snapshot on the backup filesystem and label it with the date. Then we run rsync on the volumes being backed up, and only what has been modified since the previous backup is transferred over the network. The result is the equivalent of taking a full backup each night, and it's very easy to recover data.

I also use ZFS compression and dedup to help conserve space on our backup servers. The dedup option is especially helpful when a user decides to rename a large directory: rsync may have to bring it across the network and write it to the filesystem, but ZFS will recognize the data as duplicates of already stored data.

Steve
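A minimal sketch of that nightly cycle as run on the backup server (snapshot first, then rsync); the ZFS filesystem name and the volume mount points are assumptions, not Steve's actual layout:

#!/usr/bin/env python3
"""Nightly run on the backup server: freeze last night's state with a ZFS snapshot, then rsync."""
import subprocess
import time

BACKUP_FS = "tank/mfs-backup"          # assumed ZFS filesystem on the backup server
VOLUMES = {                            # assumed mount points of the MooseFS volumes being backed up
    "vol1": "/mnt/mfs-vol1/",
    "vol2": "/mnt/mfs-vol2/",
}

def main() -> None:
    # Snapshot the backup filesystem first, labelled with the date, so the
    # previous night's state is preserved before rsync updates the live copy.
    subprocess.run(["zfs", "snapshot", BACKUP_FS + "@" + time.strftime("%Y-%m-%d")], check=True)

    for name, src in VOLUMES.items():
        # Only files changed since the last run cross the network.
        dest = "/" + BACKUP_FS + "/" + name + "/"   # default ZFS mountpoint assumed
        subprocess.run(["rsync", "-a", "--delete", src, dest], check=True)

if __name__ == "__main__":
    main()

Compression and dedup are properties set once on the backup filesystem itself (outside this script), which is what lets a renamed directory land as duplicate blocks rather than new space.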
From: Michał B. <mic...@co...> - 2012-04-20 19:54:28
Hi Group!

As already mentioned, we'd just like to stress how important it is to make regular metadata backups. The future 1.6.26 version will keep several metadata files back on the disk (as already done by Ben), and still it would be wise to copy them somewhere else.

The full rack awareness which you'd expect would probably be introduced in the way suggested by Ken, with levelgoals. But remember that running such a MooseFS installation would require a really quick connection between the two sites, which may be difficult to accomplish. Synchronizing two MooseFS installations might really be better done with simple rsync.

PS. @Ken, mind that your solution (at least when we looked at it) didn't have support for levelgoals in mfsmetarestore (did you run a recovery test?) and in mfsmetadump.

Kind regards
Michał Borychowski
MooseFS Support Manager

-----Original Message-----
From: Steve Wilson [mailto:st...@pu...]
Sent: Wednesday, April 04, 2012 7:30 PM
To: moo...@li...
Subject: Re: [Moosefs-users] Backup strategies

> We have four 22TB and one 14TB MooseFS volumes that we back up onto
> disk-based backup servers. We used to use rsnapshot but now we use rsync
> in combination with ZFS snapshots.