From: Ken <ken...@gm...> - 2012-04-04 07:21:27
more detail here: http://sourceforge.net/mailarchive/message.php?msg_id=28664530

-Ken

On Wed, Apr 4, 2012 at 1:24 PM, Wang Jian <jia...@re...> wrote:

> For disasters such as earthquakes, fires, and floods, off-site backup is a
> must-have, and any RAID-level solution is sheer futility.
>
> As Atom Powers said, MooseFS should provide an off-site backup mechanism.
>
> Months ago, my colleague Ken Shao sent in some patches to provide a
> "class"-based goal mechanism, which enables us to define different "classes"
> to differentiate physical locations and back up data in other physical
> locations (i.e. 500 km to 1000 km away).
>
> The design principles are:
>
> 1. We can afford to lose some data between the backup point and the
> disaster point. In this case, old data or old versions of data are intact;
> new data or new versions of data are lost.
> 2. Because cluster-to-cluster backup has many drawbacks (performance,
> consistency, etc.), the duplication from one location to another should
> happen within a single cluster.
> 3. Location-to-location duplication should not happen at write time, or
> performance/latency is hurt badly. So the goal recovery mechanism can be
> and should be used (CS-to-CS duplication). And to improve bandwidth
> efficiency and avoid peak load times, duplication can be controlled on a
> schedule, and a dirty/delta algorithm should be used.
> 4. Metadata should be logged to the backup site. When disaster strikes,
> the backup site can be promoted to master site.
>
> The current rack awareness implementation is not quite the thing we are
> looking for.
>
> Seriously speaking, as 10 Gb Ethernet connections get cheaper and cheaper,
> traditional rack awareness is rendered useless.
>
>
> On 2012/4/4 7:13, Allen, Benjamin S wrote:
>
>> Quenten,
>>
>> I'm using MFS with ZFS. I use ZFS for RAIDZ2 (RAID6) and hot sparing on
>> each chunkserver. I then only set a goal of 2 in MFS. I also have a
>> "scratch" directory within MFS that is set to goal 1 and not backed up to
>> tape. I try to get my users to organize their data between their data
>> directory and scratch, to minimize goal overhead for data that doesn't
>> require it.
>>
>> Overhead of my particular ZFS setup is ~15% lost to parity and hot
>> spares, although I was a bit bold with my RAIDZ2 configuration, which will
>> make rebuild times quite long as a trade-off for lower overhead. This was
>> done with the knowledge that RAIDZ2 can withstand two drive failures, and
>> MFS would have another copy of the data on another chunkserver. I have
>> not, however, tested how well MFS handles a ZFS pool degraded with data
>> loss. I'm guessing I would take the chunkserver daemon offline, get the
>> ZFS pool into a rebuilding state, and restart the CS. I'm guessing the CS
>> will see missing chunks, mark them undergoal, and re-replicate them.
>>
>> A more cautious RAID set would be closer to 30% overhead.
>>
>> Then of course with goal 2 you lose another 50%.
>>
>> A side benefit of using ZFS is on-the-fly compression and dedup on your
>> chunkservers, an L2ARC SSD read cache (although it turns out most of my
>> cache hits are from the L1ARC, i.e. memory), and, to speed up writes, you
>> can add a pair of ZIL SSDs.
>>
>> For disaster recovery you always need to be extra careful when relying on
>> a single system to do your live and DR sites. In this case you're asking
>> MFS to push data to another site. You'd then be relying on a single piece
>> of software that could equally corrupt your live site and your DR site.
>>
>> Ben
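As an aside, the capacity math Ben describes stacks multiplicatively: RAID overhead first, then the MooseFS goal. A minimal Python sketch of that arithmetic, using hypothetical disk counts rather than Ben's actual vdev layout (the thread doesn't state it):

# Back-of-the-envelope capacity estimate for RAIDZ2 chunkservers under a MooseFS goal.
# Disk counts below are illustrative assumptions, not a real cluster's layout.

def usable_fraction(data_disks, parity_disks, spare_disks, goal):
    """Fraction of raw disk space left for unique file data."""
    total = data_disks + parity_disks + spare_disks
    raid_fraction = data_disks / total   # what survives parity and hot spares
    return raid_fraction / goal          # each chunk is stored 'goal' times

# A wide ("bold") RAIDZ2 vdev with ~15% lost to parity/spares, then goal 2:
print(f"{usable_fraction(17, 2, 1, goal=2):.0%} of raw space holds unique data")  # ~42%

# A more cautious ~30%-overhead layout, again with goal 2:
print(f"{usable_fraction(6, 2, 1, goal=2):.0%} of raw space holds unique data")   # ~33%

With those hypothetical numbers, the ~15%-overhead layout leaves roughly 42% of raw disk for unique data, and the cautious layout about 33%.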
>>
>> On Apr 3, 2012, at 3:36 PM, Quenten Grasso wrote:
>>
>>> Hi All,
>>>
>>> How large are your metadata & logs at this stage? Just trying to
>>> mitigate this exact issue myself.
>>>
>>> I was planning to create hourly snapshots (as I understand it, the way
>>> they are implemented means they don't affect performance, unlike a VMware
>>> snapshot; please correct me if I'm wrong) and copy these off-site to
>>> another MFS cluster using rsync, with snapshots on the other site, with
>>> maybe a goal of 2 at most there and a goal of 3 on site.
>>>
>>> I guess the big issue here is storing our data 5 times in total vs.
>>> tapes; however, I guess it would be "quicker" to recover from a "failure"
>>> having a running cluster on site B vs. a tape backup, and, dare I say it,
>>> (possibly) more reliable than a single tape and tape library.
>>>
>>> Also, I've been tossing up the idea of using ZFS for storage. The reason
>>> I say this is that I know MFS has built-in checksumming, like ZFS, and
>>> all that good stuff; however, having to store our data 3 times + 2 times
>>> is expensive. Maybe storing it 2+1 instead would work out at scale, using
>>> the likes of ZFS for reliability and then using MFS purely for
>>> availability, instead of for reliability & availability as well...
>>>
>>> It would be great if there were a way to use some kind of rack awareness
>>> to say: at all times keep a goal of 1 or 2 of the data off-site on our
>>> 2nd MFS cluster. When I was speaking to one of the staff of the MFS
>>> support team, they mentioned this was kind of being developed for another
>>> customer, so we may see some kind of solution?
>>>
>>> Quenten
>>>
>>> -----Original Message-----
>>> From: Allen, Benjamin S [mailto:bs...@la...]
>>> Sent: Wednesday, 4 April 2012 7:17 AM
>>> To: moosefs-users@lists.sourceforge.net
>>> Subject: Re: [Moosefs-users] Backup strategies
>>>
>>> Similar plan here.
>>>
>>> I have a dedicated server for MFS backup purposes. We're using IBM's
>>> Tivoli to push to a large GPFS archive system backed by a SpectraLogic
>>> tape library. I have the standard Linux Tivoli client running on this
>>> host. One key with Tivoli is to use DiskCacheMethod, and to set the disk
>>> cache to be somewhere on local disk instead of the root of the MFS mount.
>>>
>>> I also back up mfsmaster's files every hour and retain at least a week of
>>> these backups. Of the various horror stories we've heard on this mailing
>>> list, all have been about corrupt metadata files from mfsmaster. It's a
>>> really good idea to limit your exposure to this.
>>>
>>> For good measure I also back up the metalogger's files every night.
>>>
>>> One dream for backup of MFS is to somehow utilize the metadata files
>>> dumped by mfsmaster or the metalogger to do a metadata "diff". The goal
>>> of this process would be to produce a list of all objects in the
>>> filesystem that have changed between two metadata.mfs.back files. Thus
>>> you could feed your backup client a list of files, without the client
>>> having to inspect the filesystem itself. This idea is inspired by ZFS's
>>> diff functionality, where ZFS can show the changes between a snapshot and
>>> the live filesystem.
>>>
>>> Ben
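That metadata "diff" idea could probably be prototyped on top of mfsmetadump, which dumps a metadata file as text. A rough Python sketch follows; it treats the dump as an opaque line-per-record listing and simply reports the records unique to the newer file, so mapping those records back to pathnames for a backup client is deliberately left out, and the script name and arguments are only illustrative:

#!/usr/bin/env python3
# meta_diff.py (hypothetical name): report metadata records that appear only in the
# newer of two MooseFS metadata files, using mfsmetadump's text output.
# Assumption: mfsmetadump emits one record per line; the format is not parsed here.
import subprocess
import sys

def dump_records(metadata_path):
    """Run mfsmetadump on one metadata.mfs(.back) file and return its lines as a set."""
    result = subprocess.run(["mfsmetadump", metadata_path],
                            check=True, capture_output=True, text=True)
    return set(result.stdout.splitlines())

if __name__ == "__main__":
    old_path, new_path = sys.argv[1], sys.argv[2]
    changed = dump_records(new_path) - dump_records(old_path)
    for record in sorted(changed):
        print(record)

Usage would be something like "python3 meta_diff.py metadata_yesterday.mfs.back metadata.mfs.back", with the resulting record list post-processed into whatever file list the backup client consumes.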
>>>
>>> On Apr 3, 2012, at 2:18 PM, Atom Powers wrote:
>>>
>>>> I've been thinking about this for a while, and I think Occam's razor
>>>> (the simplest idea is the best) might provide some guidance.
>>>>
>>>> MooseFS is fault-tolerant, so you can mitigate "hardware failure" events.
>>>> MooseFS provides a trash space, so you can mitigate "accidental
>>>> deletion" events.
>>>> MooseFS provides snapshots, so you can mitigate "corruption" events.
>>>>
>>>> The remaining scenario, "somebody stashes a nuclear warhead in the
>>>> locker room", requires off-site backup. If "rack awareness" were able to
>>>> guarantee chunks in multiple locations, then that would mitigate this
>>>> event. Since it can't, I'm going to be sending data off-site using a
>>>> large LTO5 tape library managed by Bacula, on a server that also runs an
>>>> mfsmount of the entire filesystem.
>>>>
>>>> On 04/03/2012 12:56 PM, Steve Thompson wrote:
>>>>
>>>>> OK, so now you have a nice and shiny and absolutely massive MooseFS
>>>>> file system. How do you back it up?
>>>>>
>>>>> I am using Bacula and divide the MFS file system into separate areas
>>>>> (e.g. directories beginning with a, those beginning with b, and so on)
>>>>> and use several different chunkservers to run the backup jobs, on the
>>>>> theory that at least some of the data is local to the backup process.
>>>>> But this still leaves the vast majority of data to travel the network
>>>>> twice (a planned dedicated storage network has not yet been
>>>>> implemented). This results in pretty bad backup performance and high
>>>>> network load. Any clever ideas?
>>>>>
>>>>> Steve
>>>>
>>>> --
>>>> Perfection is just a word I use occasionally with mustard.
>>>> --Atom Powers--
>>>> Director of IT
>>>> DigiPen Institute of Technology
>>>> +1 (425) 895-4443
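For what it's worth, Steve's split-by-directory scheme is easy to automate: partition the top-level directories of the mount into one list per chunkserver that runs a backup job. A small Python sketch, where the mount point and the number of backup hosts are assumptions, and the output lists would be wired into the Bacula jobs in whatever way the local configuration expects:

#!/usr/bin/env python3
# Partition the top-level directories of an MFS mount into N roughly equal groups,
# one per chunkserver running a backup job. Round-robin only; balancing by size
# would need a du-style scan first.
import os

MFS_MOUNT = "/mnt/mfs"   # assumed mount point
BACKUP_HOSTS = 4         # assumed number of chunkservers running backup jobs

def partition(mount, groups):
    entries = sorted(e.path for e in os.scandir(mount) if e.is_dir())
    buckets = [[] for _ in range(groups)]
    for i, path in enumerate(entries):
        buckets[i % groups].append(path)
    return buckets

if __name__ == "__main__":
    for n, bucket in enumerate(partition(MFS_MOUNT, BACKUP_HOSTS)):
        with open("backup-area-%d.txt" % n, "w") as out:
            out.write("\n".join(bucket) + "\n")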