From: Craig B. <cba...@us...> - 2008-12-29 12:14:04
|
Jeff Kosowsky writes:

> I had been thinking of writing code to implement a robust fuse
> filesystem for BackupPC backups but then I saw that John Craig (and
> perhaps others) had started to write code.
>
> While the code still seems to be at the proof-of-concept I think the idea
> is very powerful and extensible.

I agree. I've been following the suggestions and proof-of-concept code with interest.

I actually believe having a FUSE implementation that supports writing would be the best way to support rsync 3.x (and any other xfer methods for that matter). Assuming the performance was ok, the time-reversed delta format for storing backups that I'm planning for BackupPC 4.x would be most easily implemented with FUSE.

I've been working on various CVS checkins for a 3.2.0 release (finally!), so I haven't had a chance to play with FUSE, other than installing it and perl FUSE on my CentOS 5.2 system.

One question I'm curious about: if FUSE becomes a required part of BackupPC 4.x, does that unduly complicate installation or reduce the number of distros that BackupPC can readily run on? I realize FUSE is standard on recent 2.6.x kernels, but CentOS 5.2, as one example, doesn't enable FUSE, and it was actually quite a pain installing it, since the rpm package I found didn't install the kernel module.

Craig |
From: Tino S. <bac...@ti...> - 2008-12-29 12:24:48
|
On Mon, Dec 29, 2008 at 04:13:59AM -0800, Craig Barratt wrote:
> > I had been thinking of writing code to implement a robust fuse
> > filesystem for BackupPC backups but then I saw that John Craig (and
> > perhaps others) had started to write code.
> >
> > While the code still seems to be at the proof-of-concept I think the idea
> > is very powerful and extensible.
>
> I agree. I've been following the suggestions and proof-of-concept
> code with interest.
>
> I actually believe having a FUSE implementation that supports writing
> would be the best way to support rsync 3.x (and any other xfer methods
> for that matter). Assuming the performance was ok, the time-reversed
> delta format for storing backups that I'm planning for BackupPC 4.x
> would be most easily implemented with FUSE.
>
> I've been working on various CVS checkins for a 3.2.0 release
> (finally!), so I haven't had a chance to play with FUSE, other
> than installing it and perl FUSE on my CentOS 5.2 system.
>
> One question I'm curious about: if FUSE becomes a required part of
> BackupPC 4.x, does that unduly complicate installation or reduce the
> number of distros that BackupPC can readily run on? I realize FUSE
> is standard on recent 2.6.x kernels, but CentOS 5.2, as one example,
> doesn't enable FUSE, and it was actually quite a pain installing it,
> since the rpm package I found didn't install the kernel module.

I find the requirement of FUSE rather intrusive. Up to now, BackupPC did not depend on any special kernel infrastructure and is usable on Solaris, FreeBSD etc., since it only requires one special Perl module (File::RsyncP) plus some not-so-special ones.

FUSE-mounted backups would be quite a useful extra feature; they should not be a requirement, though. I suppose everything could be designed to have an internal filesystem view which is then used by the FUSE module as well as by BackupPC during backup.

BTW: Why would that ease support for rsync 3.x? (Just curious.)

Tino.

-- 
"What we nourish flourishes." - "Was wir nähren erblüht."
www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de |
From: Jeffrey J. K. <bac...@ko...> - 2008-12-29 14:50:42
|
Craig Barratt wrote at about 04:53:04 -0800 on Monday, December 29, 2008:
 > Tino writes:
 >
 > > BTW: Why would that ease support for rsync 3.x? (Just curious.)
 >
 > Instead of updating File::RsyncP to rsync 3.x protocol, the idea
 > would be to use native rsync on both sides of the connection,
 > and the BackupPC trickery would be hidden behind FUSE.

I like that aspect of it, since updates to the rsync protocol wouldn't then rely on updates to File::RsyncP.

 > It's just an idea at this point. The rsync protocol isn't
 > documented; File::RsyncP was developed by carefully reading the
 > rsync source. It's certainly possible to update File::RsyncP for
 > rsync 3.x, but the development and testing effort is relatively
 > high.

Just out of curiosity, why isn't it documented? I am surprised that such a ubiquitous building-block piece of open software is not documented. In my (simple) mind, I think of undocumented protocols as being at odds with the open software mentality.

 > Two benefits of using native rsync on the server side are
 > that a fuller set of command-line options could be used, and the
 > robustness would be better. One drawback is the rsync checksum
 > caching wouldn't work with FUSE.

Wouldn't losing rsync checksum caching slow things down?

Also, one thing I like about protocol 30 is that I believe it uses md5 checksums. I was hoping that, by storing the md5 checksum as part of the rsync checksum, we would have a good md5 checksum that could also easily be used for checking file integrity.

 > If I follow this path I would still expect to also support existing
 > BackupPC 3.x XferMethods, including File::RsyncP (up to its existing
 > protocol 28). So in that sense FUSE wouldn't be mandatory.
 >
 > Craig |
From: Nils B. (Lemonbit) <ni...@le...> - 2008-12-29 15:00:22
|
Jeffrey J. Kosowsky wrote:

> Craig Barratt wrote at about 04:53:04 -0800 on Monday, December 29,
> 2008:
>
>> The rsync protocol isn't
>> documented; File::RsyncP was developed by carefully reading the
>> rsync source. It's certainly possible to update File::RsyncP for
>> rsync 3.x, but the development and testing effort is relatively
>> high.
>
> Just out of curiousity, why isn't it documented?

Ask the rsync guys. :o)

Nils Breunese. |
From: Jon C. <can...@gm...> - 2008-12-30 01:54:05
|
On Mon, Dec 29, 2008 at 7:53 AM, Craig Barratt <cba...@us...> wrote:
> Tino writes:
>
>> BTW: Why would that ease support for rsync 3.x? (Just curious.)
>
> Instead of updating File::RsyncP to rsync 3.x protocol, the idea
> would be to use native rsync on both sides of the connection,
> and the BackupPC trickery would be hidden behind FUSE.
>

The POC work I did on a FUSE interface for BackupPC never got to a point that was useful or releasable. What I quickly identified was that there were two approaches to doing a FUSE interface. The first was an interface that provided a filesystem layout on top of BackupPC (which is what I did), giving filesystem-style access to the backup catalog via the BackupPC API. The second approach would involve a rewrite of BackupPC to extract the "dedupe" functionality and place it within the filesystem to provide an abstracted storage layer. The second approach is ultimately more interesting, as it is no longer an extension of BackupPC and becomes a product that would stand on its own.

BackupPC - Config, Schedule, Catalog Interface, Alert Notification
FUSE Storage Engine - Dedupe, Catalog Replication for Disaster Recovery, General Interface to catalog
Rsync - Replication Protocol
SSH - Transport Protocol

One weakness I can see with eliminating the Perl interface comes in the area of MD4 caching. Currently, BackupPC can speed up the MD4 process by caching signatures to avoid recalculating MD4 hashes. While it would be trivial to extend the FUSE filesystem to store MD4 hashes, the generic rsync command doesn't provide an interface for supplying these values, so the hashes would still be recalculated on every run.

The other thing I struggle with is how complex to make this storage layer. Do you tie it to a database (like MySQL) to offload replication and speed up access to file metadata (list generation and access to cached MD4)? Do you extend rsync to provide hooks to make use of this functionality?

Hmm, how big is this elephant, where do you start, and how do you know when you're done?

-- 
Jonathan Craig |
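The signature caching Jon describes can be illustrated with a small sketch. It is not BackupPC's actual cache: the class name `DigestCache` is hypothetical, MD5 from Python's standard library stands in for the MD4 block signatures mentioned above, and a cached digest is assumed to stay valid as long as the file's size and modification time are unchanged.

```python
import hashlib
import os


class DigestCache:
    """Hypothetical cache of whole-file digests, keyed by path.

    An entry is reused only while the file's size and mtime are unchanged.
    MD5 is a stand-in here; BackupPC's real cache stores rsync checksums.
    """

    def __init__(self):
        self._entries = {}  # path -> (size, mtime_ns, hex digest)
        self.computed = 0   # how many times a file was actually read and hashed

    def digest(self, path):
        st = os.stat(path)
        key = (st.st_size, st.st_mtime_ns)
        cached = self._entries.get(path)
        if cached is not None and cached[:2] == key:
            return cached[2]  # cache hit: no file read needed
        h = hashlib.md5()
        with open(path, "rb") as f:
            # Hash in 1 MiB chunks so large files are not loaded into memory.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        self.computed += 1
        self._entries[path] = (key[0], key[1], h.hexdigest())
        return h.hexdigest()
```

The difficulty raised in the message is not storing such values but handing them to a stock rsync process, which has no option for accepting precomputed checksums from the filesystem underneath it.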
From: Carl W. S. <ch...@re...> - 2008-12-30 16:11:58
|
On 12/29 08:54 , Jon Craig wrote:
> The POC work I did on a FUSE interface for BackupPC never got to a
> point that was useful / releasable. What I quickly identified was
> that there were two approaches to doing a FUSE interface. The first
> was an interface that provided a fileystem layout over top of BackupPC
> (which is what I did) to provide fileystem type access to the backup
> catalog via BackupPC API.

Could you explain what this offers over the existing setup? It seems pretty simple to me, to go to /var/lib/backuppc/pc/<hostname>/<backupnumber>/fpath/fto/ffile.

Tho obviously there's some room for improvement. :)

- Automatic uncompression of compressed files?
- Correct ownership/permissions of files?
- More 'normal' paths to the files (i.e. no 'f' at the beginning of the words)?
- Dates as well as backup numbers?

I'm curious what you did with this. I might find such a thing convenient every now and then.

-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com |
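The "more normal paths" item can be sketched in a few lines. This is only an illustration of the 'f' prefix visible in the example path above; the function names are hypothetical, and any additional escaping BackupPC may apply to special characters in file names is deliberately not handled.

```python
from pathlib import PurePosixPath


def demangle(stored_path):
    """Strip the leading 'f' from every component of a stored relative path.

    Assumes only the 'f' prefix convention; character escaping is ignored.
    """
    parts = []
    for part in PurePosixPath(stored_path).parts:
        if not part.startswith("f"):
            raise ValueError(f"component {part!r} lacks the 'f' prefix")
        parts.append(part[1:])
    return "/".join(parts)


def mangle(real_path):
    """Inverse of demangle: prefix every component with 'f'."""
    return "/".join("f" + part for part in PurePosixPath(real_path).parts)


print(demangle("fpath/fto/ffile"))  # path/to/file
```

A FUSE view would apply something like `demangle` when listing directories and `mangle` when resolving a requested path back to the pool.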
From: Jeffrey J. K. <bac...@ko...> - 2008-12-30 16:40:17
|
Carl Wilhelm Soderstrom wrote at about 10:11:37 -0600 on Tuesday, December 30, 2008:
 > On 12/29 08:54 , Jon Craig wrote:
 > > The POC work I did on a FUSE interface for BackupPC never got to a
 > > point that was useful / releasable. What I quickly identified was
 > > that there were two approaches to doing a FUSE interface. The first
 > > was an interface that provided a fileystem layout over top of BackupPC
 > > (which is what I did) to provide fileystem type access to the backup
 > > catalog via BackupPC API.
 >
 > Could you explain what this offers over the existing setup? It seems pretty
 > simple to me, to go to
 > /var/lib/backuppc/pc/<hostname>/<backupnumber>/fpath/fto/ffile.
 >
 > Tho obviously there's some room for improvement. :)
 >
 > - Automatic uncompression of compressed files?
 > - Correct ownership/permissons of files?
 > - More 'normal' paths to the files (i.e. no 'f' at the beginning of the
 > words)?
 > - Dates as well as backup numbers?
 >

Well, FUSE would do all of the above (which would be VERY helpful for browsing backups) PLUS fill in the missing files for incrementals. Also, you would have the ability to literally just mount a backup and get a snapshot of the backed-up filesystem.

Personally, I much prefer a filesystem CLI interface to a web-based GUI, which I find to be both slow and crippled since I can't use my stock unix commands to do what I want. The web GUI is good for newbies and also good for a quick status check, but otherwise I find it way too slow and limiting.

 > I'm curious what you did with this. I might find such a thing convenient
 > every now and then.
 >
 > --
 > Carl Soderstrom
 > Systems Administrator
 > Real-Time Enterprises
 > www.real-time.com
 >
 > ------------------------------------------------------------------------------
 > _______________________________________________
 > BackupPC-users mailing list
 > Bac...@li...
 > List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
 > Wiki: http://backuppc.wiki.sourceforge.net
 > Project: http://backuppc.sourceforge.net/
 > |
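"Filling in the missing files for incrementals" amounts to overlaying each incremental on the full backup it depends on. The sketch below shows only that idea: the listings are plain dictionaries mapping a path to some entry, the `DELETED` marker and the function name are hypothetical, and a simple chain of incrementals on top of one full backup is assumed rather than BackupPC's actual on-disk format or incremental levels.

```python
DELETED = object()  # hypothetical marker: file removed since the previous backup


def merged_view(full, incrementals):
    """Overlay incrementals (oldest first) on a full backup's listing.

    Each listing maps a relative path to an opaque entry (for example a
    pool file reference). Later backups override earlier ones.
    """
    view = dict(full)
    for inc in incrementals:
        for path, entry in inc.items():
            if entry is DELETED:
                view.pop(path, None)  # file no longer exists in this backup
            else:
                view[path] = entry    # new or changed file
    return view


full = {"etc/hosts": "v1", "etc/passwd": "v1"}
inc1 = {"etc/hosts": "v2"}
inc2 = {"etc/passwd": DELETED, "etc/motd": "v1"}
print(merged_view(full, [inc1, inc2]))  # {'etc/hosts': 'v2', 'etc/motd': 'v1'}
```

A mounted backup would present the merged dictionary as its directory tree, so stock unix commands see a complete snapshot even when the selected backup is an incremental.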
From: Carl W. S. <ch...@re...> - 2008-12-30 17:23:31
|
On 12/30 11:39 , Jeffrey J. Kosowsky wrote:
> Carl Wilhelm Soderstrom wrote at about 10:11:37 -0600 on Tuesday, December 30, 2008:
> > On 12/29 08:54 , Jon Craig wrote:
> > > The POC work I did on a FUSE interface for BackupPC never got to a
> > > point that was useful / releasable. What I quickly identified was
> > > that there were two approaches to doing a FUSE interface. The first
> > > was an interface that provided a fileystem layout over top of BackupPC
> > > (which is what I did) to provide fileystem type access to the backup
> > > catalog via BackupPC API.
> >
> > Could you explain what this offers over the existing setup? It seems pretty
> > simple to me, to go to
> > /var/lib/backuppc/pc/<hostname>/<backupnumber>/fpath/fto/ffile.
> >
> > Tho obviously there's some room for improvement. :)
> >
> > - Automatic uncompression of compressed files?
> > - Correct ownership/permissons of files?
> > - More 'normal' paths to the files (i.e. no 'f' at the beginning of the
> > words)?
> > - Dates as well as backup numbers?
> >
>
> Well Fuse would do all the above (which would be VERY helpful for
> browsing backups) PLUS fill in the missing files for
> incrementals. Also you would have the ability to literally just mount
> a backup and get a snapshot of the backed-up filesystem.

That makes sense.
Is your code in a usable state?

-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com |
From: dan <dan...@gm...> - 2008-12-31 22:45:36
|
I would think that the FUSE module would cause a pretty serious performance hit, considering FUSE is not well known for performance. It makes good sense to have a FUSE module for viewing a pool, but I think that the backup process needs to stay far away from it...

As far as using native rsync 3 vs modifying File::RsyncP goes, it is a good idea, but I don't know how you would accomplish that outside of FUSE. |
From: dan <dan...@gm...> - 2008-12-31 22:46:57
|
Also, FUSE is standard on many Linux distros but not so standard on FreeBSD or Solaris... You would essentially make BackupPC Linux-specific. |
From: Craig B. <cba...@us...> - 2008-12-29 12:53:10
|
Tino writes:

> BTW: Why would that ease support for rsync 3.x? (Just curious.)

Instead of updating File::RsyncP to rsync 3.x protocol, the idea would be to use native rsync on both sides of the connection, and the BackupPC trickery would be hidden behind FUSE.

It's just an idea at this point. The rsync protocol isn't documented; File::RsyncP was developed by carefully reading the rsync source. It's certainly possible to update File::RsyncP for rsync 3.x, but the development and testing effort is relatively high. Two benefits of using native rsync on the server side are that a fuller set of command-line options could be used, and the robustness would be better. One drawback is the rsync checksum caching wouldn't work with FUSE.

If I follow this path I would still expect to also support existing BackupPC 3.x XferMethods, including File::RsyncP (up to its existing protocol 28). So in that sense FUSE wouldn't be mandatory.

Craig |
From: Tino S. <bac...@ti...> - 2008-12-29 13:03:51
|
On Mon, Dec 29, 2008 at 04:53:04AM -0800, Craig Barratt wrote:
> > BTW: Why would that ease support for rsync 3.x? (Just curious.)
>
> Instead of updating File::RsyncP to rsync 3.x protocol, the idea
> would be to use native rsync on both sides of the connection,
> and the BackupPC trickery would be hidden behind FUSE.

Ah, I see. That's a rather tempting design. I'd take a look at the file system operations rsync performs before implementing that. I remember that it always creates a new temporary file beside the one it is transferring. You'd need some kind of copy-on-write (which you'll need anyway for a writable FUSE) for in-place replacement.

> It's just an idea at this point. The rsync protocol isn't
> documented; File::RsyncP was developed by carefully reading the
> rsync source. It's certainly possible to update File::RsyncP for
> rsync 3.x, but the development and testing effort is relatively
> high. Two benefits of using native rsync on the server side are
> that a fuller set of command-line options could be used, and the
> robustness would be better. One drawback is the rsync checksum
> caching wouldn't work with FUSE.

While it's tempting, you lose a lot of control (and therefore optimization opportunities) by letting native rsync do all the work and hiding BackupPC behind a file system layer which knows nothing of differential transfers, pooling, etc.

Of course, a writable FUSE view would open up a lot more usage scenarios, for example providing some "just drop your files here for backup" space on the network.

Tino.

-- 
"What we nourish flourishes." - "Was wir nähren erblüht."
www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de |
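The write pattern Tino recalls (a temporary file next to the destination, renamed into place when the transfer completes) is rsync's default behaviour when `--inplace` is not given. The sketch below only mimics that pattern so the operations a writable FUSE layer would have to handle are visible; the function name and the `.tmp-` prefix are hypothetical and not taken from rsync.

```python
import os
import tempfile


def replace_via_temp(dest, data):
    """Write data to a temp file beside dest, then rename it over dest.

    The filesystem sees: create temp file, write, rename onto dest.
    It never sees an in-place modification of dest itself.
    """
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(prefix=".tmp-", dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, dest)  # atomic within a single filesystem on POSIX
    except BaseException:
        # Clean up the partial temp file if anything went wrong.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

For a pooled store this means the layer cannot detect "this is the same file with a small change" from the write calls alone; it would have to compare the finished temporary file with the file it replaces at rename time, which is where the copy-on-write handling mentioned above comes in.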
From: Nils B. (Lemonbit) <ni...@le...> - 2008-12-29 15:03:58
|
Jeffrey J. Kosowsky wrote:

> I would be *very* interested in hearing your (informal) roadmap of
> where you would like to take BackupPC (and where you wouldn't).
>
> As I have mentioned before, there is an almost endless amount of
> extensions that could be added ranging from very modest tweaks to
> whole new directions that would fundamentally transform BackupPC. It
> would be good to hear your views and to have an active discussion
> among users about what is the right balance of extensions and new
> functionality.

Maybe the mailing list for development discussions might be of interest to you? See https://lists.sourceforge.net/lists/listinfo/backuppc-devel

> Well, I think you have answered your own question in part. FUSE is and
> has been standard on current kernels for probably 2 years now. So, any
> recent distro will certainly have it included and I imagine by the
> time 4.x is released that even "older" distros like CentOS and RHEL
> will have it ;)

Enterprise distributions are usually not too keen on adding new functionality to a release. I'm not sure if, for instance, Red Hat is planning on adding FUSE support to RHEL 4.

Nils Breunese. |
From: Jeffrey J. K. <bac...@ko...> - 2008-12-29 15:36:29
|
Nils Breunese (Lemonbit) wrote at about 16:03:44 +0100 on Monday, December 29, 2008:
 > Jeffrey J. Kosowsky wrote:
 >
 > > I would be *very* interested in hearing your (informal) roadmap of
 > > where you would like to take BackupPC (and where you wouldn't).
 > >
 > > As I have mentioned before, there is an almost endless amount of
 > > extensions that could be added ranging from very modest tweaks to
 > > whole new directions that would fundamentally transform BackupPC. It
 > > would be good to hear your views and to have an active discussion
 > > among users about what is the right balance of extensions and new
 > > functionality.
 >
 > Maybe the mailinglist for development discussions might be of interest
 > to you? See https://lists.sourceforge.net/lists/listinfo/backuppc-devel

I have been subscribed there for the past month or so and have seen very little traffic. Also, if we are talking about desired functionality and extensions at a high level, then it might be more interesting to include the broader user base to get their input. I would agree that detailed code-level roadmapping would be better discussed in the devel group.

 >
 > > Well, I think you have answered your own question in part. FUSE is and
 > > has been standard on current kernels for probably 2 years now. So, any
 > > recent distro will certainly have it included and I imagine by the
 > > time 4.x is released that even "older" distros like CentOS and RHEL
 > > will have it ;)
 >
 > Enterprise distributions are usually not too keen on adding new
 > functionality to a release. I'm not sure if for instance Red Hat is
 > planning on adding FUSE support to RHEL 4.
 >

I was making a joke (see the ;) ): by the time 4.x is released, all the then-current and supported versions of CentOS and RHEL will include kernels that support FUSE. Plus, as you yourself note, 3rd party repos often step in to fill the void and backport such functionality. |
From: Nils B. (Lemonbit) <ni...@le...> - 2008-12-29 15:20:11
|
Craig Barratt wrote:

> One question I'm curious about: if FUSE becomes a required part of
> BackupPC 4.x, does that unduly complicate installation or reduce the
> number of distros that BackupPC can readily run on? I realize FUSE
> is standard on recent 2.6.x kernels, but CentOS 5.2, as one example,
> doesn't enable FUSE, and it was actually quite a pain installing it,
> since the rpm package I found didn't install the kernel module.

Dag's repository has dkms-fuse (kernel module) and fuse (userspace utilities) packages available for RHEL/CentOS.

Nils Breunese. |
From: Les M. <les...@gm...> - 2008-12-29 18:35:41
|
Craig Barratt wrote:
>
> I actually believe having a FUSE implementation that supports writing
> would be the best way to support rsync 3.x (and any other xfer methods
> for that matter). Assuming the performance was ok, the time-reversed
> delta format for storing backups that I'm planning for BackupPC 4.x
> would be most easily implemented with FUSE.

A slightly different approach would be to make an Apache WebDAV module with all the BackupPC-specific parts. An assortment of tools can access that directly, and you can mount it with davfs (which still needs FUSE, though). There is an implementation like this that runs over a Subversion repository for a similar effect. Subversion also has the delta storage format; unfortunately it doesn't have a reasonable way to delete anything from the repository. It would be nice if there were some way to map BackupPC's storage into a versioning system, though, so you could use the versioning tools to see diffs over time or between the same files on different machines.

-- 
Les Mikesell
les...@gm... |