mbackup-devel Mailing List for Midnight Backup
Status: Alpha
Brought to you by:
jo2y
From: jay s. <jay...@gm...> - 2009-11-16 14:14:44
Hi,
I have compiled db3.4 and tried to integrate mbackup 0.6 with it in a
POSIX environment.
While doing so I encountered the following error.
$ make
c++ -ggdb -shared -Wl,-soname,libmbackup.so -o libmbackup.so libmbackup.o mb_object.o mb_index.o mb_index_file.o mb_index_db3.o mb_index_entry.o mb_data.o mb_file.o mb_disk_file.o mb_tape_drive.o mb_tape_file.o mb_filter.o mb_dummy_filter.o mb_directory.o -lc -ldb_cxx
mb_index_db3.o: In function `mb_index_db3':
/home/Administrator/mbackup-0.6j/src/lib/mb_index_db3.cc:43: undefined reference to `Db::open(char const*, char const*, DBTYPE, unsigned int, int)'
/home/Administrator/mbackup-0.6j/src/lib/mb_index_db3.cc:50: undefined reference to `Db::open(char const*, char const*, DBTYPE, unsigned int, int)'
/home/Administrator/mbackup-0.6j/src/lib/mb_index_db3.cc:57: undefined reference to `Db::open(char const*, char const*, DBTYPE, unsigned int, int)'
/home/Administrator/mbackup-0.6j/src/lib/mb_index_db3.cc:64: undefined reference to `Db::open(char const*, char const*, DBTYPE, unsigned int, int)'
/home/Administrator/mbackup-0.6j/src/lib/mb_index_db3.cc:43: undefined reference to `Db::open(char const*, char const*, DBTYPE, unsigned int, int)'
mb_index_db3.o:/home/Administrator/mbackup-0.6j/src/lib/mb_index_db3.cc:50: more undefined references to `Db::open(char const*, char const*, DBTYPE, unsigned int, int)' follow
collect2: ld returned 1 exit status
make: *** [libmbackup.so] Error 1
Please suggest the necessary steps to take; I'm completely stuck after
trying all the approaches I can think of.
Do I need to create a database first, and if so, how?
The other thing I notice is that the error is an undefined reference,
which I think means it is a linking error.
I compiled the database library in the same POSIX environment, but the
linker is still not able to resolve the Db symbols.
Your suggestions will mean a lot to me.
Thanks & Regards
From: Robert D. <ro...@nr...> - 2002-02-17 14:20:35
On Sat, 16 Feb 2002, James O'Kane wrote:
> I'm looking for brainstorm help to get an idea of the various things that
> people might want to query for, so that I can define the API for index
> modules. Since SQL the language would be an implementation detail, I don't
> want to have sql statements outside of the index module. That way, if
> someone wants to finish work on the db3 stuff, they don't have to
> implement the SQL language.
I missed the original question, but hopefully this is along the same lines...
I would like "named" backups.
I haven't used any commercial backup apps - the only one I have any real
experience with is AMANDA - but they all seem like they're host/file-based.
What I mean by that is a host and files on the host are specified to be backed
up, and then they are all tossed on the same tape. You restore by specifying
the same host and files.
I think a grouping abstraction layer would be very beneficial for management
purposes. The idea is to have named backups: the host/file groups would live
in named sections. Let's say the DNS machine's disk catches on fire.
It was backed up with:
Group DNS
dns.machine.com /etc/named.conf
dns.machine.com /var/named
I can now restore DNS service to any machine with:
restore DNS from dns.machine.com to temp-dns.machine.com
Log in and start the service.
And to take this a step further..
Group DNS
file://dns.host.com/etc/named.conf
file://dns.host.com/var/named
Group pgsql
file://host.com/var/lib/pgsql
Group mysql
file://host.com/var/lib/mysql
Group mail
file://host.com/etc/mail
file://host.com/etc/sendmail.cf
file://host.com/var/spool/mail
Group host.com
Include DNS, pgsql, mysql, mail
file://host.com/
Here we include the other groups so we don't duplicate any backups and also
so we can do a full restore of host.com. Well, that's it for named backups,
but I'm sure some of you are curious about the url-style entries.
This is a bit off-topic on this thread, but I think the backup mechanism
should be able to handle multiple protocols for file retrieval. Sometimes
a machine is accessible via only http or ftp. Or perhaps we want to back up
a Windows share (smb://win.com/c/). In any case, the url-style specifies a
protocol, host, and file in one neat line.
--
Robert Dale
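The Include/dedup semantics in Robert's proposal could be modeled as a transitive expansion over a set. A minimal sketch follows; all type and function names here are hypothetical illustrations, not actual mbackup code:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory model of the named-backup groups described above.
struct Group {
    std::vector<std::string> includes;  // other groups pulled in by "Include"
    std::set<std::string> entries;      // url-style file specs
};

// Expand a group transitively, deduplicating entries so that including
// DNS, pgsql, etc. inside host.com never backs anything up twice.
void expand(const std::map<std::string, Group>& groups,
            const std::string& name,
            std::set<std::string>& out,
            std::set<std::string>& visited) {
    if (!visited.insert(name).second) return;  // already expanded (cycle guard)
    auto it = groups.find(name);
    if (it == groups.end()) return;
    for (const auto& inc : it->second.includes)
        expand(groups, inc, out, visited);
    out.insert(it->second.entries.begin(), it->second.entries.end());
}

std::set<std::string> resolve(const std::map<std::string, Group>& groups,
                              const std::string& name) {
    std::set<std::string> out, visited;
    expand(groups, name, out, visited);
    return out;
}
```

Using a set both for the result and for the visited marker makes duplicate suppression and cycle safety fall out for free, which is most of what the Include feature needs.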
From: James O'K. <jo...@mi...> - 2002-02-17 04:45:02
Since a few people emailed me privately with similar comments, I'll reply to the group. I'm looking for brainstorm help to get an idea of the various things that people might want to query for, so that I can define the API for index modules. Since SQL the language would be an implementation detail, I don't want to have SQL statements outside of the index module. That way, if someone wants to finish work on the db3 stuff, they don't have to implement the SQL language.
Right now, I have:
mb_index::insert(mb_index_entry)
mb_index::lookup(string filename)
I could keep adding functions like lookup_directory_contents() and lookup_backup_set() and others, but I'm hoping to come up with a more generic interface. At the moment, I've been considering changing things to:
insert(mb_index_entry)
lookup_exact(mb_index_entry)
lookup_children(mb_index_entry)
and you would set the parts of mb_index_entry as appropriate, but I don't think that will work well.
-james
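One generic alternative to a function per query shape is a single lookup that treats unset fields of the entry as wildcards. A minimal sketch of that idea, with simplified hypothetical names rather than the actual mb_index API:

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical pared-down mb_index_entry: an unset field acts as a wildcard.
struct Entry {
    std::optional<std::string> host, path, tape;
};

class Index {
    std::vector<Entry> rows_;
public:
    void insert(const Entry& e) { rows_.push_back(e); }

    // Return every row whose fields match all fields that are set in the
    // query; unset query fields match anything. One function then covers
    // "all files from a host", "all files on a tape", and combinations.
    std::vector<Entry> lookup(const Entry& q) const {
        auto ok = [](const std::optional<std::string>& want,
                     const std::optional<std::string>& have) {
            return !want || (have && *have == *want);
        };
        std::vector<Entry> out;
        for (const auto& r : rows_)
            if (ok(q.host, r.host) && ok(q.path, r.path) && ok(q.tape, r.tape))
                out.push_back(r);
        return out;
    }
};
```

The linear scan is only for illustration; the point is that the query surface stays one function while the implementation behind it (db3, sqlite, flat file) can index however it likes.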
From: James O'K. <jo...@mi...> - 2002-02-16 21:01:31
I need help brainstorming the various ways that one might want to ask for information from the index. The goal is to keep all index implementation details inside of that module. So no SQL queries. Some that I can think of:
* All files from a certain host. Sometimes with qualifiers such as before a certain date.
* All files that are below a certain directory, e.g. /etc and its subfiles and directories.
* All files on a certain tape.
* All tapes that contain a group of files.
Someone pointed me to sqlite, which is a very slimmed down SQL database. It works well for embedded apps because it doesn't have client/server stuff; it saves everything to a file.
http://www.hwaci.com/sw/sqlite/
I'm going to use this instead of db3 because it is much easier to work with, is smaller, and the SQL schema and queries should work with other dbs if someone wanted to write a module. The db3 module that I wrote is still there, but it is out of date.
-james
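The four queries above map naturally onto one flat table. A sketch of what such a schema and a "files below a directory" query builder might look like; this is a hypothetical schema for illustration, not mbackup's actual one:

```cpp
#include <string>

// Hypothetical one-table schema covering the queries listed above; the
// real mbackup tables may differ.
const char* kSchema =
    "CREATE TABLE files (host TEXT, path TEXT, mtime INTEGER, tape TEXT);";

// "All files below a certain directory" maps onto a LIKE prefix match.
// The trailing '/' keeps /etc from also matching /etcetera. (Sketch only:
// a real implementation would use a bound parameter, not concatenation.)
std::string files_below(const std::string& dir) {
    return "SELECT path FROM files WHERE path LIKE '" + dir + "/%';";
}
```

"All files from a host before a date" and "all files on a tape" become WHERE clauses on the same table, and "all tapes that contain a group of files" is a SELECT DISTINCT tape over a path list, which is why a small SQL engine is attractive here.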
From: James O'K. <jo...@mi...> - 2002-02-05 04:47:03
I finished getting a working restore system in place, so I felt it was time for a checkpoint. This is the first release of mbackup since 0.5-threaded in late 2000. This version is a complete rewrite of the whole system in C++.
This is still a developers-only release, as it still lacks several key features; however, all are invited to download the source and try it out. See README and doc/object-descriptions.txt in the tar file for more details.
-james
From: James O'K. <jo...@mi...> - 2002-01-30 18:04:18
When I asked, I was already intent on using db3; I was mainly looking for some help/guidance on the most effective way to work with it. Almost all of my DB experience is in the SQL world, so the lack of multiple fields in a table is a bit annoying. That said, I'm about 90% finished with a bare-bones working index class using db3. I hope to commit the last of my changes to cvs in the next few days or so.
Instead of merging everything into one data value and storing that in one key, I've opted to create several tables, each one with a different bit of data (date, hostname, location where stored, etc.) and a table that maps filename to a unique number that can be used for the key of the other tables. While I don't think my implementation is the best, it will work for now until someone else with more experience comes along and rewrites it. I mainly needed something that worked to test the index class API and to move forward with getting restores and such to work. My goal is to have a working restore in about 2 weeks, depending on my work and class schedules.
An interesting side note: I took the cvs source code and did a test compile under cygwin on Windows 2000, and aside from the lack of ftw() and having to compile db3 and link everything statically, it compiled and worked.
-james
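The layout described here, a filename-to-id table plus per-attribute tables keyed by that id, can be sketched in memory like this. The sketch simplifies to one record per filename, whereas the real db3 code allows multiple backup runs, and all names are hypothetical:

```cpp
#include <map>
#include <string>

// In-memory sketch of the layout above: one table maps the filename to a
// numeric id, and that id keys the per-attribute tables. In the db3 code
// each of these maps would be its own Db.
class IndexSketch {
    std::map<std::string, int> name_to_id_;
    std::map<int, std::string> date_, host_, location_;
    int next_id_ = 0;
public:
    int insert(const std::string& file, const std::string& date,
               const std::string& host, const std::string& loc) {
        auto [it, fresh] = name_to_id_.try_emplace(file, next_id_);
        if (fresh) ++next_id_;          // new filename gets a new unique id
        int id = it->second;
        date_[id] = date; host_[id] = host; location_[id] = loc;
        return id;
    }
    std::string host_of(const std::string& file) const {
        return host_.at(name_to_id_.at(file));
    }
};
```

The indirection through the numeric id is what lets a flat key/value store fake multi-column rows: each attribute table is narrow, and joins happen by looking up the same id in each one.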
From: John H. <jo...@mw...> - 2002-01-30 06:48:18
Hello James,
It's been a little while since you asked about db3. What's your opinion now? Have you checked out tdb? Written by tridge, it's supposedly 1000 lines of the most elegant code ever seen. It lacks a bit on secondary indexes though.
Regards
John
From: John H. <jo...@mw...> - 2002-01-07 04:22:11
Dear James,
A long time ago we were talking about mbackup and such things as XML. I decided that mbackup wasn't going to be targeted at my needs. In the time-honoured fashion of scratching an itch, I have designed 'Darkserve': a nod in the direction of Midnight Backup and a poke at Arcserve. Little has come of it; however, I have produced code for testing the location facility.
I originally tried Postgres, but could not get the performance. I recoded with Berkeley DB version 4. My test showed that with modest hardware, I can get 700,000 records inserted across several tables in about 10 sec. (IIRC) Anyway, performance is EXCELLENT. Version 4 has substantial advantages WRT secondary indexes. I had no troubles installing it under /usr/local from the tarball. The documentation on Sleepycat's site is superb.
My current tarball is ftp.mwk.co.nz /pub/linux/darkserve-2002-01-07a.tar.gz
Ignore the project crap and use the files in src/location. The main file has a couple of test routines for populating and deleting test data. Documentation for the system is in the doc directory, and it describes how the system is supposed to work, including the location system. The SQL code is also relevant.
The location system is designed for unlimited flexibility. The lookup table is where things happen and has lots of records. The location schema for a filesystem has 5 records in there per object backed up. E.g. for /etc/samba/smb.conf the lookup table will have:
Label        Value                      ObjID
'PATH'       '/etc/samba'               'ID00001'
'FILE'       'smb.conf'                 'ID00001'
'TYPE'       'F'                        'ID00001'
'DATETIME'   '2001-12-03T23:01:01'      'ID00001'
'SERVER'     'albatross.hisdad.org.nz'  'ID00001'
So you see this table will get a _LOT_ of activity. My tests have satisfied me that DB V4 is a good choice. Everything there is copyright me and GPL'd, so feel free to fill your boots. Note, however, that there isn't much code. I'm using the C library, but DB does have a C++ library, which looks equally well documented.
I hope that you will find this helpful.
Regards
John
From: James O'K. <jo...@mi...> - 2002-01-07 03:16:37
I just committed a bunch of code tonight. I've been working on getting an index to work, so one can find files on tapes, etc. SQL would be easier to work with, but I would prefer not to require it, so I'm working with db3. Does anyone have any experience programming with their API?
I'm currently planning to do something like this:
key/value
---------
name/record-id
record-id/date
record-id/location
record-id/...
It seems possible to have many keys with different values in db3, so the name db would return multiple values that refer to different records that represent different backup runs. Does anyone have any experience with non-SQL databases who could give me feedback on this?
thanks
-james
From: James O'K. <jo...@mi...> - 2001-12-24 00:39:46
I committed some big changes to cvs a week or so ago. Should I send out
email every time I make a significant commit?
In this update, I rearranged the way that filters are added to the stream
of data. Before, one would have to do something like this:
file->add_filter(foo);
file->read(...);
tape->write(...);
Now, I've changed things so that filters act more like wrappers.
source = filter->add_stream(file);
destination = filter2->add_stream(tape);
source->read(...);
destination->write(...);
See src/base/main.cc to see a more detailed sample.
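The wrapper style above is essentially the decorator pattern: a filter is itself a stream, so filters stack. A minimal sketch, simplified to whole strings rather than buffered reads, with hypothetical names rather than the real mbackup classes:

```cpp
#include <cctype>
#include <string>
#include <utility>

// A stream yields data; files, tapes, and filters all share this shape.
struct Stream {
    virtual ~Stream() = default;
    virtual std::string read() = 0;
};

// Stand-in for a file object (real code would read from disk).
struct StringFile : Stream {
    std::string data;
    explicit StringFile(std::string d) : data(std::move(d)) {}
    std::string read() override { return data; }
};

// A filter wraps any Stream and is itself a Stream, so
// filter->add_stream(file) returns something read()-able, and filters
// can be chained arbitrarily deep.
struct UpperFilter : Stream {
    Stream* inner = nullptr;
    Stream* add_stream(Stream* s) { inner = s; return this; }
    std::string read() override {
        std::string d = inner->read();
        for (char& c : d)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return d;
    }
};
```

The payoff over the old add_filter() style is that the caller only ever holds a Stream, so compression, encryption, or a dummy filter can be layered in without the reader or writer knowing.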
I'm currently debating which of the two would be better:
should the tape keep track of where on the tape the data is, or should
something higher level just position the tape while the tape object
blindly does it? To illustrate the two:
tape->restore("/foo");
tape->read(...);
And if the tape multiplexes files on the tape it will know where on the
tape each part starts and ends and rejoin them.
Or, at a higher level, where we do something like this. and the tape
module has less intelligence.
tape->seek(offset);
tape->read(...);
tape->seek(newoffset);
tape->read(...);
etc..
-james
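The two designs above need not be exclusive: a "dumb" seek/read tape can sit under a higher-level catalog that supplies the intelligence. A toy sketch, string-backed and with hypothetical names:

```cpp
#include <map>
#include <string>
#include <utility>

// The "dumb" option: the tape only seeks and reads, with no knowledge of
// what lives where. (Backing store simplified to a string.)
class DumbTape {
    std::string data_;
    std::size_t pos_ = 0;
public:
    explicit DumbTape(std::string d) : data_(std::move(d)) {}
    void seek(std::size_t off) { pos_ = off; }
    std::string read(std::size_t n) {
        std::string s = data_.substr(pos_, n);
        pos_ += n;
        return s;
    }
};

// The "smart" interface, layered on top: a catalog remembers where each
// file starts and ends, giving a restore("/foo")-style call without
// pushing that intelligence into the tape object itself.
class Catalog {
    std::map<std::string, std::pair<std::size_t, std::size_t>> where_;
public:
    void record(const std::string& name, std::size_t off, std::size_t len) {
        where_[name] = {off, len};
    }
    std::string restore(DumbTape& t, const std::string& name) {
        auto [off, len] = where_.at(name);
        t.seek(off);
        return t.read(len);
    }
};
```

Multiplexed files would just become several (offset, length) segments per name in the catalog, which the restore loop rejoins; the tape object stays simple either way.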
From: James O'K. <jo...@mi...> - 2001-12-04 04:44:59
I've set up a new cvs repository for mbackup, and I've written some quick and dirty directions for cvs checkout: http://cvs.midnightbackup.com/
The version that is in cvs is the result of a complete rewrite of mbackup in C++. I had planned on making this announcement a few days ago, but some things came up. I also had an idea on Saturday that will make things much cleaner, but that will require another partial rewrite. If you have time, please check out the code from cvs and take a look.
One note that I don't think I've written in any of the source code: everything that is in mbackup/src/base/main.cc should probably be pushed into a separate object that represents a partition or filesystem, and it is not intended to represent my plans for main.cc.
-james
PS. I'm in San Diego this week for Usenix's LISA conference if anyone else is there.
From: James O'K. <jo...@mi...> - 2001-10-24 03:19:57
On Sun, 8 Apr 2001, James O'Kane wrote:
> Hi,
> From my notes, the last traffic on this list was December 8th. A
> large reason for that is because I started getting quick deadlines at work
> and some things had to drop to the floor. This doesn't mean that I haven't
> been thinking about mbackup.
>
> At work we've hired someone else to split the load with me, which
> means until the load at work reaches the need for 3 people, I'll have more
> time for coding and planning this project.
It seems when you double your staff, the workload triples to compensate.. who knew? :)
Anyone know someone who would contribute to a fund so I can work on this full time? :)
-james
From: Brent B. P. <po...@b2...> - 2001-04-18 15:30:00
>>>>> "James" == James O'Kane <jo...@mi...> writes:
On further reflection, I snipped the whole message. You're apparently thinking of Modules as objects, which seems fine. You're still going to need (IMHO, of course) sockets in between particular modules. Named pipes aren't excessively portable, but sockets are (with the odd exception of Win32 and Win95, but a number of portability layers between winsocks and sockets exist).
I'm not convinced by the idea of the MCM as CEO, or the rest of the analogy. At this stage of a project, the most important thing is to develop a definite, shared vocabulary, and another analogy won't do that.
Suppose that the MCM constructor takes as input a 'list'. This list may be supplied either as a file name (in which case it looks much like a standard unix config file, or not, depending on content), or as a handle, to be read until closed, or as the name of an external process, to be executed and read (from stdout of the process) until finished. In either of the latter cases, the sum of the output will be identical to the former. In other words, where a manager expects a file name, it can be handed a file handle or an external process (such as the name of a perl or python script, or an executable, or what have you) that can be executed. I hope that that's clearer than the way I phrased it before. I personally would try to have in mind that all of the module
OK, so what does this thing look like? Well, that depends on what the underlying modules look like, and on what the flow of an archive section looks like. And there's a problem. Maybe you can sketch out the dataflow (as a strawman) of a standard archive operation? Say, an incremental file-by-file backup operation would be a good start.
From: James O'K. <jo...@mi...> - 2001-04-17 18:23:53
On Tue, 17 Apr 2001, Brent B. Powers wrote:
> I have the suspicion that modules mean different things, also. If I
> understand correctly, the Master Control Module interprets one or more
> script files, (or something similar), whereas the external
> communication module is a glue and interface layer.
By module, I mean something that is self contained and can be easily removed and replaced because the interface it uses is well defined.
I thought of an analogy today for the different modules. Imagine that mbackup is a large factory. In this case, the MCM would be the CEO: he decides what the factory will be doing, when, and in what order. He knows the master plan. The RCM would be the person in charge of the purchasing dept. He has several people under him, and when new materials for the factory are needed, the MCM tells the RCM to get something, and the RCM delegates it to one of his workers. A similar thing happens with the FCM: they are the assembly line, and the MCM tells them to process this new input from the RCM. The WCM is the shipping dept.; once we have the processed data, we need to ship it off somewhere. The Communications manager is like the secretary or a PR person: if we need to contact another machine or process, we do it through the ECM.
> OK, so if modules don't need to create a tcp connection, how _do_ they
> communicate?
Communicate with what? If it's another part of the same process, it will be function calls. If it's another process on the same machine, they can use named pipes. What example case were you referring to?
> Does the mcm ask this for the next file, or for the next file spec?
I haven't decided exactly. I've been thinking that it will just ask for the next block of data and process things in blocks instead of files. I'm still pondering this, so input would be good.
> I'd like to consider a couple of scenarios, and see how you think that
> they might be handled.
>
> The obvious ones (any of these should go to disk, rom, tape, or ???):
> 1.) Full Backup (file by file, all files)
> 2.) Incremental backup (file by file, all files since a certain date)
> 3.) Image backup (an entire file system)
>
> More difficult:
> 1.) Tower of Hanoi backup (file by file, all files that have not been
> backed up in their current version to at least n other data sets)
> 2.) Distributed backup (multiple machines)
See below.
> Proprietary:
> 1.) Suppose we had the ability to read a database partition while the
> database was live? How would we trigger the program to read the data?
This would be taken care of by a module under the RCM.
> Obviously, I've some things in mind. My conception of the master
> control module is that it reads a file describing the backup. I think
> that your system allows reader modules to handle file readers, image
> readers, or remote readers, and the proprietary steps are a reasonable
> extension. However, where is backup history being kept, as well as
> media content lists? (i.e. if you know you've backed up a file, and
> you're looking at a pile of tapes and cdroms, how do you find the file)?
>
> Now, suppose that you want to keep that information in a text file;
> sometimes in a database; in a Sybase database for one set of clients,
> in an Oracle database for another set of clients, and.... I think a
> new module is required. Can we call it the database or history module?
I've been thinking about this too. Lots of things will need this info; I just haven't decided how I want it to interact with the rest of the program.
> Now, what makes the decisions as to _what_ is backed up? A file called
> fspec exists: Or, maybe it does. How do we know? Is the reader module
> responsible for determining that? If so, what has the responsibility
> of determining that the file fspec should actually be backed up? I've
> not really figured this one out at all for the generic case. I did
> come up with something like: a control module of some kind requests a
> directory listing from a directory-reader. Based on the directory
> listing, it compiles a list of items to back up. The master control
> module takes that list, along with other lists that are included in
> the same backup set, and arranges them. For each item, then, it
> requests the data from a reader.
My thoughts at the moment are that the MCM makes a high level decision, such as "it's time to start the backup" or "I want to back up /dev/hda1". It's the RCM's job to look at /dev/hda1 and decide which files are to be backed up. If it needs to contact the to-be-named indexing module, then it either does it directly, if it's part of the current process, or it asks the ECM to contact the index on its behalf.
> This scenario leads to some kind of reader architecture that allows
> two modes of operation for each reader: directory and data. Note that
> the directory itself may have to be backed up, so that, for some
> readers, the directory mode might be a noop (for instance, a unix
> partition reader (dump)); but for other readers, the data mode might
> be a noop (think ldap-ily).
>
> This also leads (somehow) to the idea that the Master Control Module
> might not want to simply parse a list, but be able to execute a
> program or script in any situation where it would otherwise parse a
> list. (I'm not sure if that sentence makes sense).
I'm not sure either.
> Finally, it is pretty clear that if there is a database module,
> communication to the database module has to be granted to all of the
> control modules, and some of the individual modules. For simplicity,
> you might want to say all of the individual modules.
All modules will have access to the module if needed, but it will be handled via the ECM or the indexing module. The various modules don't need to know how the data is indexed or stored; they just want the info they need.
I hope that answered some of your questions?
-james
From: Brent B. P. <po...@b2...> - 2001-04-17 17:22:38
>>>>> "James" == James O'Kane <jo...@mi...> writes:
Good to see a re-start up (I've gotten nothing done on docs for
reasons that I'm sure you can understand). I think you've made the
right decision as far as devel languages (to an extent....).
James> In this diagram, I've drawn out the major componants that I
James> would like to see. I should have probably added a block
James> called Indexer Module. I'll add that to the next
James> revision. I should also be a little clearer about what
James> those arrows mean because now that I'm looking at it again,
James> I realize that different arrows mean different things.
I have the suspicion that modules mean different things, also. If I
understand correctly, the Master Control Module interprets one or more
script files, (or something similar), whereas the external
communication module is a glue and interface layer.
James> Master Control Module This is the decision making
James> module. It tells other modules what to do, when to run, and
James> controls when data should be backed up.
James> External Communication Control Module When any part of the
James> process needs to talk to another process, it has this
James> module negotiate the connection for them. This includes
James> named pipes, sockets and any other communication, except
James> for disk reads and writes. The reason for everything to
James> flow through this one module is so that new network
James> encryption protocols can be dropped in without the rest of
James> the program needing to know, and so that we can multiplex
James> communication through one stream if desired. This also
James> relieves the rest of the modules from needing to know how
James> to create a tcp connection.
OK, so if modules don't need to create a tcp connection, how _do_ they
communicate?
James> Reader Control Module This is somewhat of a module that
James> controls other reader modules. The master control module
James> will ask this module for the next file to be processed. In
James> turn, this module will ask the ones it controls for some
James> data to be backed up. This will allow alternating reads
James> over different partitions.
Does the mcm ask this for the next file, or for the next file spec?
James> Filter Control Module This one is similar to the Reader
James> Control in that it is the master of all the filter
James> modules. It is given some data and returns the filtered
James> data.
James> Writer Control Module As the name suggests, this one
James> controls where and how the data is written to disk or
James> tape. If it wanted, it could be able to spool data to a
James> holding disk and then to tape or span several tapes, or
James> multiplex several streams onto one tape.
James> Any thoughts so far? I'll wait until I hear some feedback
James> on this before I send more.
I'd like to consider a couple of scenarios, and see how you think that
they might be handled.
The obvious ones (any of these should go to disk, rom, tape, or ???):
1.) Full Backup (file by file, all files)
2.) Incremental backup (file by file, all files since a certain date)
3.) Image backup (an entire file system)
More difficult:
1.) Tower of Hanoi backup (file by file, all files that have not been
backed up in their current version to at least n other data sets)
2.) Distributed backup (multiple machines)
Proprietary:
1.) Suppose we had the ability to read a database partition while the
database was live? How would we trigger the program to read the data?
Obviously, I've some things in mind. My conception of the master
control module is that it reads a file describing the backup. I think
that your system allows reader modules to handle file readers, image
readers, or remote readers, and the proprietary steps are a reasonable
extension. However, where is backup history being kept, as well as
media content lists? (i.e. if you know you've backed up a file, and
you're looking at a pile of tapes and cdroms, how do you find the file)?
Now, suppose that you want to keep that information in a text file;
sometimes in a database; in a Sybase database for one set of clients,
in an Oracle database for another set of clients, and.... I think a
new module is required. Can we call it the database or history module?
Now, what makes the decisions as to _what_ is backed up? A file called
fspec exists: Or, maybe it does. How do we know? Is the reader module
responsible for determining that? If so, what has the responsibility
of determining that the file fspec should actually be backed up? I've
not really figured this one out at all for the generic case. I did
come up with something like: a control module of some kind requests a
directory listing from a directory-reader. Based on the directory
listing, it compiles a list of items to back up. The master control
module takes that list, along with other lists that are included in
the same backup set, and arranges them. For each item, then, it
requests the data from a reader.
This scenario leads to some kind of reader architecture that allows
two modes of operation for each reader: directory and data. Note that
the directory itself may have to be backed up, so that, for some
readers, the directory mode might be a noop (for instance, a unix
partition reader (dump)); but for other readers, the data mode might
be a noop (think ldap-ily).
This also leads (somehow) to the idea that the Master Control Module
might not want to simply parse a list, but be able to execute a
program or script in any situation where it would otherwise parse a
list. (I'm not sure if that sentence makes sense).
Finally, it is pretty clear that if there is a database module,
communication to the database module has to be granted to all of the
control modules, and some of the individual modules. For simplicity,
you might want to say all of the individual modules.
I'll look forward to your response.
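The "Tower of Hanoi" criterion from the scenario list above — back up any file whose current version has not yet reached at least n data sets — could be sketched like this. This is a minimal in-memory model with hypothetical names, not an actual scheduler:

```cpp
#include <map>
#include <string>
#include <vector>

// Per-file backup history: the current content version, and how many
// data sets hold each version we have ever written out.
struct FileState {
    std::string version;                // current content version
    std::map<std::string, int> copies;  // version -> number of data sets
};

// Select every file whose *current* version appears on fewer than n data
// sets. A file that changed since its last backup has zero copies of its
// new version and is therefore always selected.
std::vector<std::string> needs_backup(
        const std::map<std::string, FileState>& files, int n) {
    std::vector<std::string> out;
    for (const auto& [name, st] : files) {
        auto it = st.copies.find(st.version);
        int have = (it == st.copies.end()) ? 0 : it->second;
        if (have < n) out.push_back(name);
    }
    return out;
}
```

A real scheduler would also decide *which* data set each selected file joins (that is where the Hanoi rotation itself lives), but the selection predicate is the part the quoted criterion pins down.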
From: Robert D. <ro...@nr...> - 2001-04-15 03:55:39
I'm a fan of putting all source code in a directory called 'src/'
Then you've got your non-source above that...
docs/
contrib/
src/
include/
lib/
.../
--
Rob
On Sat, 14 Apr 2001, James O'Kane wrote:
> /base
> This directory will contain the very core of the suite. main()
> will be in here.
>
> /communicators
> This is for modules whose job it is to contact other processes,
> other than mbackup
>
>
> /controllers
>
>
> /docs
> Documentation.
>
> /filters
> The modules that do the actual work on data.
>
> /include
> header files (* see note below)
>
> /lib
> library files
>
> /readers
> Modules that read from media
>
> /tests
> Regression tests
>
> /writers
> Modules that write to media
From: James O'K. <jo...@mi...> - 2001-04-15 01:38:59
ftp://mbackup.sourceforge.net/pub/mbackup/mbackup-newlayout.tar.gz
That tarball is my proposed source layout.
/base
This directory will contain the very core of the suite. main() will be in here.
/communicators
This is for modules whose job it is to contact other processes, other than mbackup
/controllers
/docs
Documentation.
/filters
The modules that do the actual work on data.
/include
header files (* see note below)
/lib
library files
/readers
Modules that read from media
/tests
Regression tests
/writers
Modules that write to media
* In the /include directory, I've thrown together an example of my ideas for how cross-platform issues will be handled. In general code, I would like everything to make calls to mb_foo(), where foo is a standard C function. So we would call mb_open() to open a file. For POSIX it's easy: mb_open() is just a #define for open(). On other systems, it would be a wrapper function. Could I have some comments from people who have done large projects or cross-platform work before?
-james
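The mb_open() idea might look roughly like this. This is illustrative only, not the actual mbackup headers, and the non-POSIX branch is just a stub:

```cpp
// Sketch of the mb_foo() portability wrappers described above. On POSIX
// the wrapper costs nothing (it is the macro below); on other systems it
// becomes a real function that translates to the native API.
#if defined(_WIN32)
// Hypothetical non-POSIX port: a real wrapper, defined elsewhere, that
// would translate the flags and call the native file API.
int mb_open(const char* path, int flags);
#else
#include <fcntl.h>
#define mb_open(path, flags) open((path), (flags))
#endif
```

General code then calls mb_open() everywhere and never includes platform headers directly, so a port touches only the wrapper layer.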
From: James O'K. <jo...@mi...> - 2001-04-08 23:20:53
http://www.jamesokane.com/mbackup-drawings/mbackup-internalcommunication-draft1.eps
In this diagram, I've drawn out the major components that I would like to see. I should probably have added a block called Indexer Module; I'll add that to the next revision. I should also be a little clearer about what those arrows mean, because now that I'm looking at it again, I realize that different arrows mean different things.
Master Control Module
This is the decision making module. It tells other modules what to do, when to run, and controls when data should be backed up.
External Communication Control Module
When any part of the process needs to talk to another process, it has this module negotiate the connection for them. This includes named pipes, sockets, and any other communication, except for disk reads and writes. The reason for everything to flow through this one module is so that new network encryption protocols can be dropped in without the rest of the program needing to know, and so that we can multiplex communication through one stream if desired. This also relieves the rest of the modules from needing to know how to create a tcp connection.
Reader Control Module
This is somewhat of a module that controls other reader modules. The master control module will ask this module for the next file to be processed. In turn, this module will ask the ones it controls for some data to be backed up. This will allow alternating reads over different partitions.
Filter Control Module
This one is similar to the Reader Control in that it is the master of all the filter modules. It is given some data and returns the filtered data.
Writer Control Module
As the name suggests, this one controls where and how the data is written to disk or tape. If it wanted, it could spool data to a holding disk and then to tape, or span several tapes, or multiplex several streams onto one tape.
Any thoughts so far? I'll wait until I hear some feedback on this before I send more.
-james
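To make the data flow between these control modules concrete, here is a minimal interface sketch with toy implementations. All names and signatures are illustrative guesses, not the real mbackup classes:

```cpp
#include <string>
#include <vector>

// Minimal shapes for the three data-path modules described above: the
// master loop pulls from the reader control, pushes through the filter
// control, and hands off to the writer control.
struct ReaderControl {
    virtual ~ReaderControl() = default;
    virtual std::string next_block() = 0;  // empty block means done
};
struct FilterControl {
    virtual ~FilterControl() = default;
    virtual std::string apply(const std::string& data) = 0;
};
struct WriterControl {
    virtual ~WriterControl() = default;
    virtual void write(const std::string& data) = 0;
};

// The master control module's inner loop.
inline void run_backup(ReaderControl& r, FilterControl& f, WriterControl& w) {
    for (std::string b = r.next_block(); !b.empty(); b = r.next_block())
        w.write(f.apply(b));
}

// Toy implementations, for illustration only.
struct VecReader : ReaderControl {
    std::vector<std::string> blocks;
    std::size_t i = 0;
    std::string next_block() override {
        return i < blocks.size() ? blocks[i++] : std::string();
    }
};
struct NoFilter : FilterControl {
    std::string apply(const std::string& s) override { return s; }
};
struct Collector : WriterControl {
    std::string out;
    void write(const std::string& s) override { out += s; }
};
```

Because the loop only sees the three interfaces, a reader that alternates between partitions, a compressing filter, or a tape-spanning writer can each be swapped in without touching the others.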
|
From: James O'K. <jo...@mi...> - 2001-04-08 23:02:53
|
Hi,

From my notes, the last traffic on this list was December 8th. A large reason for that is that I started getting tight deadlines at work and some things had to drop to the floor. This doesn't mean that I haven't been thinking about mbackup. At work we've hired someone else to split the load with me, which means that until the load at work reaches the need for three people, I'll have more time for coding and planning this project.

The last topic we were discussing was the possibility of writing this in Java. I feel the several people who suggested that I not use Java were right, but I would like this to be written OO, so that means C++. I'm going to scrap the code that I've already written and reimplement it with a new layout, which I'll be describing in my next few messages.

To start off, a few words about why write mbackup. At work we were evaluating several commercial offerings, as I'm sure several of you have, and I was disappointed that some offered some features and others offered different ones, and I wanted them all. I decided to take the approach that Apache uses, where they define a module API and modules can register their interest in different phases of the HTTP request. That's what I would like mbackup to do. Instead of the term backup application, I would like mbackup to be a 'framework for building custom backup solutions'. It sounds kinda buzzwordy, but here is what I mean: in a config file, people can specify modules that are in charge of reading from disk, manipulating data, and writing data. Because we are abstracting things, it should be easy for someone to add custom modules for their project. They might want/need to filter files that are older than a certain time and owned by a certain user, or only back up files that were created on Sundays. I've also had an idea of taking a kernel patch from SGI that notifies applications when inodes change, and using that to schedule files to be backed up shortly after they were created or modified.

I have an idea for a project I would like to do that offers over-the-internet backup similar to www.backup.com, but covers real OSes such as Linux and *BSD. That has different requirements than a small home user or a mid-size office. I think it is possible to create an application that can be adapted via modules to do all of those.

Why should you help? Because you possibly have a need for a similar piece of software? You're bored? Either way, I'm going to try and create rewards for people who contribute. My first thought would be printing custom t-shirts; any other ideas? I've done some block diagrams that help explain my ideas. I'll start sending them with descriptions in my next few mailings. -james |
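The Apache-style idea mentioned above, modules registering interest in phases of a run, could look something like this. This is a speculative sketch, not the mbackup module API; the phase names, `mb_hook` type, and `mb_registry` class are all invented for illustration.

```cpp
// Sketch of Apache-style phase registration: a module registers a hook for
// a named phase, and the framework calls every registered hook when that
// phase is reached. All names here are hypothetical.
#include <map>
#include <string>
#include <vector>

// A hook appends a record of its work to a shared log (stand-in for real work).
typedef void (*mb_hook)(std::vector<std::string> &log);

class mb_registry {
    std::map<std::string, std::vector<mb_hook> > hooks;
public:
    // A module registers interest in a phase ("read", "filter", "write", ...).
    void register_hook(const std::string &phase, mb_hook h) {
        hooks[phase].push_back(h);
    }
    // The framework runs every hook registered for the given phase, in order.
    void run_phase(const std::string &phase, std::vector<std::string> &log) {
        std::vector<mb_hook> &v = hooks[phase];
        for (size_t i = 0; i < v.size(); ++i)
            v[i](log);
    }
};

// Example hook a hypothetical module might register for the read phase.
inline void log_read_phase(std::vector<std::string> &log) {
    log.push_back("read phase ran");
}
```

A phase the framework reaches with no registered hooks simply does nothing, which is what lets a site leave out whole capabilities (say, encryption) without touching the core.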
|
From: Robert D. <ro...@nr...> - 2001-01-09 20:06:45
|
On Tue, 9 Jan 2001, Robert Dale wrote: > http://www.sunworld.com/unixinsideronline/swol-08-2000/swol-0811-remote-2.html Maybe you should start with page 1... http://www.sunworld.com/sunworldonline/swol-08-2000/swol-0811-remote.html -- Robert Dale |
|
From: Robert D. <ro...@nr...> - 2001-01-09 20:02:55
|
http://www.sunworld.com/unixinsideronline/swol-08-2000/swol-0811-remote-2.html -- Robert Dale |
|
From: Robert D. <ro...@nr...> - 2001-01-08 20:41:02
|
What's up with mbackup? -- Robert Dale |
|
From: James O'K. <jo...@mi...> - 2000-12-07 02:47:57
|
I've been considering moving from C to Java. The first thing I can think of that people will complain about is that Java is slow, but from what I've seen it's not too bad on long-running processes that use a JIT compiler like HotSpot. And graphics are slow, but we won't be doing many graphics, so given that we'll already be I/O bound, I don't think speed will be an issue for this.

The big things that Java will get us are easier cross-platform coding, easier network coding, easier encryption coding, less chance of memory leaks, and no pointers to screw up. One thing that will be a bit difficult is that Java tries to be a common ground, so it leaves out things like stat(). What I need to know from someone is what metadata different filesystems have that is different from POSIX. For example, BeOS keeps an arbitrary number of attributes about a file, such as its MIME type, and the To: and From: if it's an email. I'd like to try and make a class that covers all possible filesystems. I guess the ones in question are BeOS, Novell, and Windows 2000/NT/98. I also need to know about filesystem attributes, like the Windows registry, that need special care in backing up.

Don't worry if you were dying to write in C/C++; I'm sure that there will still be some code to be done. :) And I'm also not 100% sure whether I want to make this change or not. -james |
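One common way to cover both POSIX stat() data and open-ended attributes like BeOS's is a fixed set of POSIX fields plus a name/value attribute map. The sketch below assumes that shape; the class name, field set, and attribute keys are all illustrative, not from any mbackup release.

```cpp
// Sketch of a metadata class covering more than POSIX stat(): the common
// POSIX fields, plus an open-ended attribute map for filesystems (BeOS,
// Novell, NT) that attach arbitrary named attributes to a file.
#include <map>
#include <string>

#include <sys/types.h>   // uid_t, gid_t, mode_t, off_t, time_t

struct file_metadata {
    // Common POSIX fields, roughly what stat() returns.
    uid_t  owner;
    gid_t  group;
    mode_t mode;
    off_t  size;
    time_t mtime;

    // Extended, filesystem-specific attributes (e.g. a BeOS MIME type).
    std::map<std::string, std::string> attrs;

    void set_attr(const std::string &name, const std::string &value) {
        attrs[name] = value;
    }
    // Returns true and fills 'value' if the attribute exists.
    bool get_attr(const std::string &name, std::string &value) const {
        std::map<std::string, std::string>::const_iterator it = attrs.find(name);
        if (it == attrs.end()) return false;
        value = it->second;
        return true;
    }
};
```

A writer module that targets a filesystem without extended attributes can simply ignore the map, while a BeOS reader can stuff whatever it finds into it.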
|
From: James O'K. <jo...@mi...> - 2000-11-09 03:01:49
|
On Wed, 8 Nov 2000, Brent B. Powers wrote:

> 1.) Docs? Even a start at docs?
> 2.) Specs? Even a start at specs?

Sort of. There is this, but not much else. https://sourceforge.net/docman/display_doc.php?docid=19&group_id=1747 Documentation isn't my strong suit. :)

> 3.) How is a particular file selected for backup? Personally, I would hope that this would be a module, so that one could implement a 'tower-of-hanoi' structure, or a 'full' structure, or a 'new' structure, via a relatively simple module, or, in the case of the first, a somewhat complex module.

The plan is to leave this up to the first few modules in the filter list. I'm not completely happy with what I have so far, but it works. Currently, if a filter is passed an empty file object, it can fill that object in with a file that it has decided needs to be backed up. When that filter has backed up everything it plans to, it sets some flags; when an empty object is passed to every filter and they all say they are done, the backup is complete. The reason for doing it this way was so that modules can be smaller logic-wise. One filter could back up the 'normal' parts of the file system and exclude the directories that have an SQL db running. Then a different module can back up the SQL db.

> 4.) MySQL? Not to be rude, but, oh please. OK, just what is the DB used for? [[ And, of course, why isn't it again modularized, so that I could at least take advantage of the Sybase and Oracle servers that I have? ]]

It is modularized, but the makefile still tries to build it. The db is used to create an index of where files have been stored, so that when it's time to do a restore, the db can be queried and we can go straight to the location of the file. I used MySQL because that's the one I already have running on my machine and the one I'm most familiar with. For the past version, I've had that module commented out because it didn't relate to what I was testing.

This also brings up the point that I should sit down with a make and autoconf book and get those parts right.

> 5.) Speaking of which, has any thought been given to how to backup a live db server? This might require an NDA or some sort of agreement with the vendor...

From what I've read, most of them have tools to back those up, but I haven't looked into it further. Since it would be a module, they can have whatever license they need to.

> 6.) Which inexorably leads to: How are the modules organized? How does the driver program 'know' what module to call, in what order to call them, what the purpose of a particular module is, and where that module might be found?

It reads the list of filters and other modules from client.conf. The filters are applied in the order they are listed in that file. The crude format of client.conf is 'keyword value', so filter modules start with 'filter foo.so', where foo.so would need to be the path to the file. For testing, I've been keeping everything in the current directory.

> 7.) Where do modules run? Where does 'the program' run? If, for instance, over the top, I have a tape drive connected to the scsi port of a sparc-linux box, does the mbackup server run on that machine? How could that tape drive be used to backup the files of a network attached Win2500 system?

The plan is to have a client and a server running on the respective machines. I haven't put any time into talking to tape drives yet, because I need to get some of the other infrastructure code working first. I don't have access to a Win2500, but I would guess there is a way to talk to it via an API? A module would need to be written that spoke to that API.

> 8.) Finally, what are the plans for a.) archiving onto multiple backup media; or b.) archiving multiple backup sets onto a single backup medium; and finally, c.) Juke boxes ?

For a., that's one of the reasons for the SQL db, so we know which tape the data actually was written to. At the moment, I don't have that type of hardware, but I might soon. For b., I'm hoping for this; since I treat each file as a separate object and I recently added threads, it should happen easily.

> I realize that this is a long list of questions, but I can probably start slogging some spare cycles into this relatively quickly, at least until the end of the year.

-james |
|
From: Brent B. P. <mb...@b2...> - 2000-11-08 22:47:03
|
Greetings. As I was getting frustrated with backup the other night, I sat down and started drawing up plans for yet another GPL backup system. Luckily (I hope), I took a look-see at SourceForge and found this project before I got too deep, including too deep into some of the issues (which may well be reflected below). Thus, before I sign up as a developer, the big and obvious questions:

1.) Docs? Even a start at docs?

2.) Specs? Even a start at specs?

3.) How is a particular file selected for backup? Personally, I would hope that this would be a module, so that one could implement a 'tower-of-hanoi' structure, or a 'full' structure, or a 'new' structure, via a relatively simple module, or, in the case of the first, a somewhat complex module.

4.) MySQL? Not to be rude, but, oh please. OK, just what is the DB used for? [[ And, of course, why isn't it again modularized, so that I could at least take advantage of the Sybase and Oracle servers that I have? ]]

5.) Speaking of which, has any thought been given to how to backup a live db server? This might require an NDA or some sort of agreement with the vendor...

6.) Which inexorably leads to: How are the modules organized? How does the driver program 'know' what module to call, in what order to call them, what the purpose of a particular module is, and where that module might be found?

7.) Where do modules run? Where does 'the program' run? If, for instance, over the top, I have a tape drive connected to the scsi port of a sparc-linux box, does the mbackup server run on that machine? How could that tape drive be used to backup the files of a network attached Win2500 system?

8.) Finally, what are the plans for a.) archiving onto multiple backup media; or b.) archiving multiple backup sets onto a single backup medium; and finally, c.) Juke boxes ?

I realize that this is a long list of questions, but I can probably start slogging some spare cycles into this relatively quickly, at least until the end of the year. Cheers. |