|
From: Eric L. G. <er...@ba...> - 2001-08-02 06:49:01
|
From the Ruby book:
"Ruby comes with a debugger, which is conveniently built into the base
system. You can run the debugger by invoking the interpreter with the -r
debug option, along with any other Ruby options and the name of your
script:
ruby -r debug [options] [programfile] [arguments]
"The debugger supports the usual range of features you'd expect, including
the ability to set breakpoints, to step into and step over method calls,
and to display stack frames and variables."
Okay, so I don't prefer to use 'gdb' nowadays, but it's still better than
no debugger at all :-).
--
Eric Lee Green Web: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg
Free Dmitry Sklyarov! [ http://www.eff.org ]
|
|
From: Eric L. G. <er...@ba...> - 2001-08-02 04:27:32
|
On Wed, 1 Aug 2001, Richard Fish wrote:
> On Tue, 31 Jul 2001, The Unknown Hacker wrote:
> > I'd like to make the server side be in a high level language (which C++
> > isn't).
> > Java is a possibility -- but it takes forever and a half to start up a Java
> > program. That's what happens from carrying around 40mb of bloated runtime
> > environment.
>
> I agree about the run-time start-up problem for Java. I've been playing
> around a bit with Forte, and I can't believe how long it takes to start
> the damn thing on my dual-PIII RAID box.
>
> However, I have grave concerns about switching languages here. Call me
> blasphemous, but I would rather do the whole thing in C++ than a
> less-capable scripting language. Let me explain:
"less-capable"? Are we talking Python? or Ruby? (The two are virtually
the same language, aside from strings and integers being objects in
Ruby and built-in classes being subclassable in Ruby, and some syntax
differences that aren't particularly difficult to learn).
> Granted, Python saved us a lot of time in developing BRU-Pro, due to being
> less verbose, and having a rich set of data management classes. But it
> also cost us some time, due to a lack of a decent debugging environment
> (which is now resolved), dynamic names (how many times did the program
> fail at run-time due to a name error? or because we forgot an import), and
> dynamic types (1 != "1", anybody?).
My concern is with getting something working in a reasonable amount of
time. I write about the same amount of code per day whether I'm
working in C++, Python, or Java. The C++ code, on the other hand,
takes far more lines of code -- about 90 lines in the case of that
block accept thingy to do something that can be done in a dozen lines
of Python/Ruby or in two dozen lines of Java. In addition, C++ is
prone to problems like memory leaks that are irritating.
> To me, Java is a bit more verbose than Python, but brings with it static
> types, compiler checks, and excellent development and debugging tools.
> This is a very good thing!
You have obviously not debugged Java Server Pages :-).
> I am really concerned about integrating a large portion of this thing in a
> language that doesn't have a large user base, static types, and a solid
> debugging environment. Ruby seems to fail on all 3 points.
Large user base: Ruby has one, but most of them don't speak English
:-). Really, Ruby is so easy to learn (especially for Python
programmers -- have I mentioned how similar the two languages are?)
that it's not a problem.
Static types: We have a difference of opinion here. In my opinion,
strict static typing is the main reason C++ is so complex. I'll point
out that while Java variables do have a static type, this does not
apply to the contents of container objects. This is one reason why
using Java's container classes is so much simpler than C++'s. For
example, a Vector can hold a string for its first element, a class
instance for its second element, and an integer for its third
element. And yes, I've already run into a problem where I accidentally
stuck an integer into my Vector when I meant to put the string value
instead! Java doesn't have "templates" like C++, and my reaction to
that is "Thank God!". Java defines things like Vectors in terms of
references to things of type Object (the base type of all Java
classes), and you have to cast your data reference to/from type Object
to put it into/take it out of one of the Java container classes
(though the cast to Object on the way in is done automatically for
you). As in Python, data in Java lives as objects on the heap; all
that is in your variables is a reference to the actual object.
Have I mentioned that I like Java a whole lot better than I like C++?
:-).
As for solid debugging environment, you have to pay for one that works
for Java Server Pages. I frankly don't have any inclination to do so.
Debugging Java Server Pages is frankly almost identical to debugging
Python -- same problem with runtime errors when you try to compare 0 with
"0", for example (easy to do if you're pulling something out of a Hash or
Vector and you put it in there without first doing a string->int conversion
or vice-versa!).
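The 0-versus-"0" trap described above is easy to reproduce. Here is a minimal Python sketch, assuming a hypothetical settings table whose values arrive as strings (the names `settings` and `is_disabled` are made up for illustration, not taken from any project code). Note that in Python the mismatched comparison does not raise; it quietly evaluates to False.

```python
# Hypothetical settings table: values read from a file or the network
# usually arrive as strings, even when they look like numbers.
settings = {"retries": "0"}


def is_disabled(value):
    """Return True when a retry count means 'disabled' (zero).

    Converting to int first makes "0" and 0 compare the same way.
    """
    return int(value) == 0


# The naive comparison is False: the string "0" is not the integer 0.
naive = settings["retries"] == 0

# The normalized comparison gives the intended answer.
normalized = is_disabled(settings["retries"])
```

Doing the string->int conversion at the point where the value goes into (or comes out of) the container is one way to keep the two kinds of value from meeting.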
> So, if we can't live with Java's startup time, I think our only real
> alternative is to do it in pure C++. But that doesn't mean we need to
> abandon Java however.
Uhm, the problem was that certain things that would be better done in
Java cannot be done there because of the startup time. This does not
mean that the whole thing should (or *could*, for that matter) be written
in C++. In particular, C++ has the following problems that Java (and
high level languages in general) do not have:
1. Memory leaks.
2. Buffer overflows.
3. Machine dependence (Java, Ruby, etc. are pretty close to 'write once
run anywhere'; C++ means lots of GNU Autoconf on Unix platforms, and
God only knows what on Windows).
4. Lack of an extensive class library (e.g. setting up a TCP server in
Ruby or Python is a matter of instantiating a TCP server instance,
in Java it's a few dozen lines of code instantiating a service
object, whereas in C++, it's hundreds of lines of custom coding).
#2 is especially irritating, because it basically means that all C++
programs are security risks.
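To make point 4 concrete, here is a rough sketch of "instantiating a TCP server instance" using the `socketserver` module from the present-day Python 3 standard library. The handler class, the `make_server` helper, and the echo behaviour are all assumptions chosen for illustration; nothing here is the project's actual protocol.

```python
import socketserver


class EchoHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: send each connection's first line back."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line)


def make_server(host="127.0.0.1", port=0):
    """Instantiate a TCP server; port 0 asks the OS for any free port."""
    return socketserver.TCPServer((host, port), EchoHandler)
```

Calling `make_server().serve_forever()` would then accept connections until interrupted; the accept loop and socket setup are the library's job, not ours.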
> On the wxWindows side, I would be willing to do our entire interface
> (i.e., the client) in C++ with wxWindows. It is missing a couple of
> useful controls, but that will probably be resolved by the wxUniversal
I have no problem with wxWindows if we choose not to use Java for the
server. If we do use Java for the server, it makes sense to use Java
for the interface too, using the Swing classes. These are widely
derided as being slow, but they do provide most modern functionality
(unlike the older AWT classes).
> project. Note that C++ RPC from Windows to Unix does work, with the
> correct library. QuincyStreet has a working RPC implementation from
> Windows to Solaris, that I'm certain we could use. And if we need more
> than just RPC, we could always use CORBA.
The problem with RPC/CORBA is that they were designed prior to the relaxing
of crypto exports, and thus are not as secure as they could/should be.
Java's built-in CORBA-like mechanism (RMI) can be told to use SSL, and Java
does have an SSL implementation. Or Java is fast enough nowadays, with the
JIT runtimes that are now extant, that we could roll our own, but there
should be little need for that. We'll find out as we get closer to
implementation time there.
> In other news, I am close to getting my TCString class done, and a
> document about internationalization. There really isn't much to the
> document, just saying here are the interfaces and defines for i18n, and
> the actual implementation will be put off until later.
Java handles internationalization naturally (by doing everything as
Unicode internally). I suggest that all C++ modules emit
key/substitution value tuples rather than user-readable messages, and
let Java do the actual conversion to English (or Japanese, or ...).
For example:
com.key_error:com.key_error.expired_key:34812a5|05|12|2001|15|35\n
could expand to:
Communications Key Error: Key 34812a5 expired on 05/12/2001.
And the
com.en.US.properties file has the following in it:
com.key_error=Communications Key Error
com.key_error.expired_key=Key ${0} expired on ${1}/${2}/${3}
And the
com.en.UK.properties file has the following in it:
com.key_error=Communications Key Error
com.key_error.expired_key=Key ${0} expired on ${2}/${1}/${3}
(Note different date format!).
The C++ programs thus don't need to know anything about how to emit
Japanese or Russian or whatever, they just emit the keys and parameters
that will be substituted using the Java localization features.
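The expansion step described above can be sketched in a few lines. This is only an illustration in Python, not the Java localization machinery itself: the `CATALOGS` dictionary stands in for the two properties files, the `expand` function is a hypothetical name, and the `category:key:param|param|...` wire format and `${n}` placeholders are read straight off the example in this message.

```python
import re

# Stand-ins for com.en.US.properties and com.en.UK.properties.
CATALOGS = {
    "en.US": {
        "com.key_error": "Communications Key Error",
        "com.key_error.expired_key": "Key ${0} expired on ${1}/${2}/${3}",
    },
    "en.UK": {
        "com.key_error": "Communications Key Error",
        "com.key_error.expired_key": "Key ${0} expired on ${2}/${1}/${3}",
    },
}


def expand(wire_message, locale):
    """Turn 'category:key:p0|p1|...' into a localized sentence."""
    category, key, raw_params = wire_message.rstrip("\n").split(":", 2)
    params = raw_params.split("|")
    catalog = CATALOGS[locale]

    def substitute(match):
        # ${n} is replaced by the n-th parameter from the wire message.
        return params[int(match.group(1))]

    detail = re.sub(r"\$\{(\d+)\}", substitute, catalog[key])
    return catalog[category] + ": " + detail + "."


WIRE = "com.key_error:com.key_error.expired_key:34812a5|05|12|2001|15|35\n"
us_text = expand(WIRE, "en.US")
uk_text = expand(WIRE, "en.UK")
```

With the US catalog this yields the sentence shown above; with the UK catalog the day and month swap places. Parameters the template does not mention (the trailing 15 and 35 here) are simply ignored.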
In any event: My frustration is with C++. The language is a bear. I
would not want to go down to C though, the STL classes are just too
darned useful. But I also would not want to be writing more C++ than
absolutely necessary. Hell, I would write the whole thing as shell
scripts calling 'awk', 'sed', 'md5sum', 'dd' and 'od' if I could get
away with it :-). (Laugh all you want, but did I tell you about the
guy I knew who wrote a compiler using shell, awk, and sed? Took forever
to compile anything, but it worked!).
Ruby is better than resorting to shell scripting, and is more readable
and maintainable than Perl. That was why I was interested in Ruby.
Especially when virtually the whole agent side could be implemented in
a couple hundred lines of Ruby, and I'm already past that in lines of
C++ without having a completed packet class yet :-}. But frustration with
C++ is certainly no reason to decide to do the whole project in C++!
--
Eric Lee Green Web: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg
Free Dmitry Sklyarov! [ http://www.eff.org ]
|
|
From: Richard F. <rj...@fi...> - 2001-08-01 16:47:03
|
On Tue, 31 Jul 2001, The Unknown Hacker wrote:
> I'd like to make the server side be in a high level language (which C++
> isn't).
> Java is a possibility -- but it takes forever and a half to start up a Java
> program. That's what happens from carrying around 40mb of bloated runtime
> environment.

I agree about the run-time start-up problem for Java. I've been playing
around a bit with Forte, and I can't believe how long it takes to start
the damn thing on my dual-PIII RAID box.

However, I have grave concerns about switching languages here. Call me
blasphemous, but I would rather do the whole thing in C++ than a
less-capable scripting language. Let me explain:

Granted, Python saved us a lot of time in developing BRU-Pro, due to being
less verbose, and having a rich set of data management classes. But it
also cost us some time, due to a lack of a decent debugging environment
(which is now resolved), dynamic names (how many times did the program
fail at run-time due to a name error? or because we forgot an import), and
dynamic types (1 != "1", anybody?).

To me, Java is a bit more verbose than Python, but brings with it static
types, compiler checks, and excellent development and debugging tools.
This is a very good thing!

I am really concerned about integrating a large portion of this thing in a
language that doesn't have a large user base, static types, and a solid
debugging environment. Ruby seems to fail on all 3 points.

So, if we can't live with Java's startup time, I think our only real
alternative is to do it in pure C++. But that doesn't mean we need to
abandon Java however.

The other benefit of Java is that the syntax is close enough to C++ that
we can prototype things in Java to get them running, and let others who
come later convert modules into C++. This would be a good way to let new
developers get their feet wet in the project. I can see a time where there
are a lot of Java modules, but not a Java bytecode to be found.

On the wxWindows side, I would be willing to do our entire interface
(i.e., the client) in C++ with wxWindows. It is missing a couple of
useful controls, but that will probably be resolved by the wxUniversal
project. Note that C++ RPC from Windows to Unix does work, with the
correct library. QuincyStreet has a working RPC implementation from
Windows to Solaris, that I'm certain we could use. And if we need more
than just RPC, we could always use CORBA.

In other news, I am close to getting my TCString class done, and a
document about internationalization. There really isn't much to the
document, just saying here are the interfaces and defines for i18n, and
the actual implementation will be put off until later.

An interesting note is that the current GNU (as of gcc 3.0) stdc++ library
doesn't support wstring, wcout, wcerr, wfstream, etc. It's a part of the
C++ spec, but they haven't gotten that far yet. Would be a good project
for me to get involved in, if I had time!!

I should have these things ready to upload by Saturday.

--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
|
From: The U. H. <er...@ba...> - 2001-08-01 04:02:26
|
Still plugging away on the network transport. At the moment I'm plugging
away at the client side of the certificate/session key service.

An issue of sorts has come up. The server side of the certificate chat
really doesn't want to be C++. C++ is just too %!@@## verbose -- e.g. the
read-a-block-from-network routine that would take 15 lines of Python is
around 90 lines of code. I still agree with the choice of using C++ --
e.g., if I did not have C++ strings, I'd need to write a block "class" to
handle chunks of data that are being encrypted/decrypted, whereas with C++
strings I can "just do it" without worrying about NUL characters in my
strings -- but I have no intention of writing huge amounts of code in C++
if it's avoidable.

I'd like to make the server side be in a high level language (which C++
isn't). Java is a possibility -- but it takes forever and a half to start
up a Java program. That's what happens from carrying around 40mb of
bloated runtime environment.

I looked at Perl. Ick. I looked at Perl again, it's just so darned useful,
but ick. Python, I can't use, we know why.

I looked at Ruby. There's a new book out. The book is available online.
Interesting. I'm going to look at this closer. Ruby isn't "popular", but
if the choices are Perl or Java, it may be worthwhile. I like Java, I've
been doing a lot of programming in the language over the past two months,
but the size, the platform limits (it's really only supported on Linux,
Windows, and Solaris), and the sheer *size* of the JRE are terrible. 40
megabytes of bloated runtime environment? Hell, the whole of BRU-Pro was
less than that!

I looked at GUI toolkits also, for those of us afflicted with Windows.
wxWindows appears to be a rather bloated toolkit but does work both on
Windows and Unix with reasonable look and feel. FOX
(http://www.cfdrc.com/FOX/fox.html) looks interesting -- it appears to be
somewhat similar to QT in its functionality and basic philosophy, but
without the licensing difficulties. It appears to have the widgets we'd
need, but somebody should probably take a look at it. Maybe someone doing
Windows programming who wants to use something that has a real layout
manager :-).

I'm not saying that we should dump Java (though this is the time if we are
going to do so), just that Java is turning out to have some real
downsides, such as forcing more code into C++ due to the enormous amount
of time it takes for the Java runtime environment to start up. There's a
reason why virtually all Java programs are multi-threaded rather than
multi-process. Unfortunately, our componentized dataflow architecture
isn't well suited for multi-threading. This may be the right time to
investigate alternatives.

--
Eric Lee Green GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:er...@ba... Web: http://www.badtux.org
"Emacs is a nice OS, but a weird editor." -- M.J. Blom
|
|
From: Eric L. G. <er...@ba...> - 2001-07-30 07:36:53
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Sunday 29 July 2001 15:09, you wrote:
> > Sorry, that always succeeds. The glibc library does have a 'setlogin'
> > function in it. It just always fails (returns -1) when called, as well as
> > printing out a nifty warning at compile time. I will put the above into
> > my
>
> Which is why you have to do an AC_TRY_RUN to detect the run-time failure.
> When I run a configure script with it, it prints out:
>
> checking for working setlogin... no
>
> So, it does appear to work correctly.

Ah. Okay. I've tested it on both OpenBSD and Linux and you're right, it
does work. Thanks.
- --
Eric Lee Green GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:er...@ba... Web: http://www.badtux.org
"Emacs is a nice OS, but a weird editor." -- M.J. Blom
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7ZQ8K3DrrK1kMA04RAgB/AKCKFP9RzrkOgr4p5NUD/F4v3FEHBACfQ5Es
V+aeo86GlYUeONbob5m/81g=
=jOf6
-----END PGP SIGNATURE-----
|
|
From: Richard F. <rj...@fi...> - 2001-07-29 22:09:39
|
> Sorry, that always succeeds. The glibc library does have a 'setlogin'
> function in it. It just always fails (returns -1) when called, as well as
> printing out a nifty warning at compile time. I will put the above into my

Which is why you have to do an AC_TRY_RUN to detect the run-time failure.
When I run a configure script with it, it prints out:

checking for working setlogin... no

So, it does appear to work correctly.

--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
|
From: Eric L. G. <er...@ba...> - 2001-07-29 02:08:47
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Sat, 28 Jul 2001, Richard Fish wrote:
> Does anybody have any clue why the glibc folks define functions in the
> headers that are guaranteed to fail? I don't....
>
> Anyway, this should work:
>
> AC_CACHE_CHECK(for working setlogin, tc_cv_have_setlogin, [
> AC_TRY_RUN([
> #include <unistd.h>
> int main() { return setlogin("fooey"); }
> ], tc_cv_have_setlogin="yes", tc_cv_have_setlogin="no", tc_cv_have_setlogin="yes")
> ])
>
> if test ${tc_cv_have_setlogin} = "yes"; then
> AC_DEFINE(HAVE_SETLOGIN)
> fi
Sorry, that always succeeds. The glibc library does have a 'setlogin'
function in it. It just always fails (returns -1) when called, as well as
printing out a nifty warning at compile time. I will put the above into my
configure.in and put the appropriate #ifs, but it won't detect the
glibc situation.
All I have to say is that glibc is wrong. If the function is not
implemented, then don't include it, darn it! But, I have to deal
with whatever crapware Red Hat slung onto a disk, sigh :-(.
Eric Lee Green GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:er...@ba... Web: http://www.badtux.org
"Emacs is a nice OS, but a weird editor." -- M.J. Blom
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7Y2y63DrrK1kMA04RAv4IAJ4tqV7dDGRlq5iOANeKNJJ+mPHcTgCffQId
V1e39PMkFiTPAoWkdfx65fQ=
=MMnP
-----END PGP SIGNATURE-----
|
|
From: Richard F. <rj...@fi...> - 2001-07-28 19:10:30
|
Grrr....
Does anybody have any clue why the glibc folks define functions in the
headers that are guaranteed to fail? I don't....
Anyway, this should work:
AC_CACHE_CHECK(for working setlogin, tc_cv_have_setlogin, [
  AC_TRY_RUN([
#include <unistd.h>
int main() { return setlogin("fooey"); }
  ], tc_cv_have_setlogin="yes", tc_cv_have_setlogin="no", tc_cv_have_setlogin="yes")
])
if test "${tc_cv_have_setlogin}" = "yes"; then
  AC_DEFINE(HAVE_SETLOGIN)
fi
You might want to change the tc_cv names to our standard (I chose tc_ for
"Tapioca Config"). Note that the default for cross-compiling is "yes", so
a cross-compile to a Linux target will get things wrong. So you probably
still want to do the setenv, and then add the setlogin call if
HAVE_SETLOGIN is defined.
On Fri, 27 Jul 2001, Eric Lee Green wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Okay, here's the deal. On *BSD, setlogin(pwd->pw_name) sets the login name
> on my forked process (right before I exec, sigh). On Linux,
> setenv("LOGNAME",pwd->pw_name,1) sets the login name. Easy enough to
> detect, just put a check for whether setlogin exists in the standard
> libraries, right? Nope, WRONG! setlogin *does* exist in glibc, it just
> always fails!
>
> Okay, anybody, how do I write an autoconf test for something that
> exists, but always fails?
>
> At the moment I just call both of the above, but I'm tired of the
> compiler bitching at me!
>
> Regarding the plumbing, I have reworked somewhat how this is going to work
> in order to use more of other people's code. In particular, I swiped
> inetd.c from OpenBSD and am using it to create some services using the
> standard inetd mechanism, rather than creating my own server. This means I
> will use more ports, but the individual services will be easier to write
> because they will not have to actually handle accepting connections, and
> will be easier to debug because I can debug them from the command line
> rather than having to go over the network to debug them (I remember what a
> pain in the $#%@ that is!). In addition, thanks to tcpwrappers, I can
> make it obey the hosts.allow and hosts.deny files, which many people will
> like, all without me writing any code (though I #ifdef'ed all the RPC
> code out).
>
> tapinetd comes in at a sleek 20k right now, so this isn't exactly pork,
> even if it is overkill for what we need. I use tapinetd rather than the
> system-provided inetd because a) some systems don't have inetd (e.g., Red
> Hat has moved to xinetd), b) I want to keep things in
> /etc/tapioca/tapinetd.conf rather than /etc/inetd.conf, and c) it makes it
> easier to make a boot/root disk than having to copy inetd and inetd.conf
> off the system (not to mention that the system inetd will not be as sleek,
> because it has stuff in it that I've #ifdef'ed out in tapinetd!).
>
> Right now I'm using two ports -- a keymanager port, and an exec port. Call
> the exec port without a proper session key obtained from the key manager,
> and you get bitched at. Or will get bitched at, once I get the key manager
> going :-).
>
> Eric Lee Green GnuPG public key at http://badtux.org/eric/eric.gpg
> mailto:er...@ba... Web: http://www.badtux.org
>
> "Emacs is a nice OS, but a weird editor." -- M.J. Blom
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.0.6 (GNU/Linux)
> Comment: For info see http://www.gnupg.org
>
> iD8DBQE7Yjg83DrrK1kMA04RAuWmAKCJW9VpbBxZjxU9aNcE5f3LlKoWRQCeIgDV
> /b8o0rYfzL7tog3k1fuaOS0=
> =/iC4
> -----END PGP SIGNATURE-----
>
>
> _______________________________________________
> Tapioca-devel mailing list
> Tap...@li...
> http://lists.sourceforge.net/lists/listinfo/tapioca-devel
>
--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
|
From: Eric L. G. <er...@ba...> - 2001-07-28 04:12:27
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Okay, here's the deal. On *BSD, setlogin(pwd->pw_name) sets the login name
on my forked process (right before I exec, sigh). On Linux,
setenv("LOGNAME",pwd->pw_name,1) sets the login name. Easy enough to
detect, just put a check for whether setlogin exists in the standard
libraries, right? Nope, WRONG! setlogin *does* exist in glibc, it just
always fails!
Okay, anybody, how do I write an autoconf test for something that
exists, but always fails?
At the moment I just call both of the above, but I'm tired of the
compiler bitching at me!
Regarding the plumbing, I have reworked somewhat how this is going to work
in order to use more of other people's code. In particular, I swiped
inetd.c from OpenBSD and am using it to create some services using the
standard inetd mechanism, rather than creating my own server. This means I
will use more ports, but the individual services will be easier to write
because they will not have to actually handle accepting connections, and
will be easier to debug because I can debug them from the command line
rather than having to go over the network to debug them (I remember what a
pain in the $#%@ that is!). In addition, thanks to tcpwrappers, I can
make it obey the hosts.allow and hosts.deny files, which many people will
like, all without me writing any code (though I #ifdef'ed all the RPC
code out).
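The appeal of the inetd approach is that a service only ever reads from standard input and writes to standard output. A rough Python sketch of that shape follows; the `serve` function and its one-word "ok" protocol are invented for illustration and are not the real keymanager or exec service.

```python
import io


def serve(request_stream, reply_stream):
    """Handle one connection the way an inetd-launched service would.

    inetd (or tapinetd) accepts the connection and hands it to the
    service as standard input and output, so the service only needs
    to read lines and write replies.
    """
    for line in request_stream:
        command = line.strip()
        if command == "quit":
            break
        # Hypothetical protocol: acknowledge every command by name.
        reply_stream.write("ok " + command + "\n")
    reply_stream.flush()


# Debugging from the command line needs no network at all: feed the
# service canned input and inspect what it wrote.
replies = io.StringIO()
serve(io.StringIO("status\nquit\nignored\n"), replies)
```

Under a real inetd the same function would presumably be called as `serve(sys.stdin, sys.stdout)`, and running the program by hand in a terminal exercises exactly the same code path.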
tapinetd comes in at a sleek 20k right now, so this isn't exactly pork,
even if it is overkill for what we need. I use tapinetd rather than the
system-provided inetd because a) some systems don't have inetd (e.g., Red
Hat has moved to xinetd), b) I want to keep things in
/etc/tapioca/tapinetd.conf rather than /etc/inetd.conf, and c) it makes it
easier to make a boot/root disk than having to copy inetd and inetd.conf
off the system (not to mention that the system inetd will not be as sleek,
because it has stuff in it that I've #ifdef'ed out in tapinetd!).
Right now I'm using two ports -- a keymanager port, and an exec port. Call
the exec port without a proper session key obtained from the key manager,
and you get bitched at. Or will get bitched at, once I get the key manager
going :-).
Eric Lee Green GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:er...@ba... Web: http://www.badtux.org
"Emacs is a nice OS, but a weird editor." -- M.J. Blom
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7Yjg83DrrK1kMA04RAuWmAKCJW9VpbBxZjxU9aNcE5f3LlKoWRQCeIgDV
/b8o0rYfzL7tog3k1fuaOS0=
=/iC4
-----END PGP SIGNATURE-----
|
|
From: Richard F. <rj...@fi...> - 2001-07-23 00:05:43
|
On Sun, 22 Jul 2001, Eric Lee Green wrote:
> Re: Internationalization: Will think about that. You're right, we need to
> do something there. Agree about the feebleness of the C++ String class,
> even the Java String class is better (at least it can represent all known
> international character sets!). More after I've thunk on it :-).

C++ already handles wide characters. string is really a typedef of
basic_string<char>, and wstring is a typedef of basic_string<wchar_t>.
All we really need there is a couple of simple typedefs:

#include <string>

#ifdef UNICODE
#define TCHAR wchar_t
#define _T(x) L##x
#else
#define TCHAR char
#define _T(x) x
#endif
typedef std::basic_string<TCHAR> tstring;

Then, we just use tstring in all places we would use string, and wrap all
of our string constants in the _T macro. This way, we can switch between
8-bit and wide characters (16-bit on Windows; wchar_t is 32-bit with
glibc) just by defining UNICODE. There are probably a few other things to
class this way, like input and output streams, etc. But they are fairly
easy to handle.

FYI, this is (sort of) what windows programs do to maintain source
portability between NT/2000, which support Unicode throughout, and 95/98
which have very limited Unicode support.

The trick is we need an enhanced basic_string class, that can handle the
useful operations like parameter substitution and search-and-replace. And
it, or a super-class, should also be able to handle the transparent
language translation.

What I don't know is if Java can handle ANSI strings passed to it from C++
code. Any comments?

BTW, there are only a couple of problems with wide-character strings, the
most important being that tstring.length() is a character count, not a
byte count. If you want to know the actual byte-length of a string for an
IO operation, you have to do tstring.length() * sizeof(TCHAR).

--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
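The length-versus-bytes point can be seen without a C++ compiler. The sketch below uses Python only as an analogy: a Python `str` is not a `basic_string<wchar_t>`, and the `byte_length` helper and the choice of UTF-16/UTF-32 encodings are assumptions standing in for 2-byte and 4-byte `TCHAR` builds.

```python
def byte_length(text, encoding="utf-16-le"):
    """Number of bytes `text` occupies once encoded for I/O."""
    return len(text.encode(encoding))


label = "tapioca"
char_count = len(label)                         # characters
wide_bytes = byte_length(label)                 # 2 bytes per character here
wider_bytes = byte_length(label, "utf-32-le")   # 4 bytes per character
```

The character count stays the same while the byte count doubles or quadruples, which is the same arithmetic as length() * sizeof(TCHAR) for plain ASCII text like this.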
|
From: Eric L. G. <er...@ba...> - 2001-07-22 22:59:24
|
On Sun, 22 Jul 2001, Richard Fish wrote:
> I also need to spend some time and re-learn java. Any good tutorials you
> can point to?

http://java.sun.com has some very good tutorials, as well as very good
manuals, manuals that are as good as the Python manuals. "Java in a
Nutshell" has a reasonable introduction to Java, I went ahead and bought
the whole ORA Java series in electronic format so I could haul it around
on my laptop. However, to tell you the truth, I learned Java via the
immersion method -- "Here's some buggy Java code, fix it."

Re: Internationalization: Will think about that. You're right, we need to
do something there. Agree about the feebleness of the C++ String class,
even the Java String class is better (at least it can represent all known
international character sets!). More after I've thunk on it :-).

Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg
|
|
From: Randy K. <ra...@sh...> - 2001-07-22 21:38:54
|
On Sun, 22 Jul 2001, Richard Fish wrote:
> No problem. I've been quite preoccupied with learning Windows
> programming. I've finally gotten to a point where I can be moderately
> productive, so am also ready to start working again on tapioca.
>

I'm currently sucking up the knowledge needed to develop various hardware
drivers for a custom Motorola ColdFire-based board, so I don't know when
I'll be able to spare neuron cycles to pick up Java and contribute
codewise.

Randy
|
|
From: Richard F. <rj...@fi...> - 2001-07-22 21:18:18
|
On Wed, 18 Jul 2001, Eric Lee Green wrote:
> Note: Sorry about the delay on commenting on this message. Had a slight

No problem. I've been quite preoccupied with learning Windows
programming. I've finally gotten to a point where I can be moderately
productive, so am also ready to start working again on tapioca.

To keep this short, I think we agree on almost all points now. I've got a
windows layout manager class that I need to finish up today. But, since I
am *limited* to a 40-hour week working for Kirk, I can spend some time
this week on tapioca. Hmm, I think I'm starting to like this hourly gig!!

The two main things I'd like to accomplish this week are:

1. Get a spec together about how we are going to handle localization.
2. Along those same lines, an enhanced C++ string class(es) to support
   that. The one really sad part of the C++ standard library is the
   string class. Compare it to, for example, Qt's QString class to see
   what I mean.

I also need to spend some time and re-learn java. Any good tutorials you
can point to?

--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
|
From: Eric L. G. <er...@ba...> - 2001-07-19 06:37:52
|
Note: Sorry about the delay on commenting on this message. Had a slight crisis at work (a member of the team quit and we had to go into emergency mode to fill in for her). Things are settling down a little now so we should be able to move forward a bit. I am currently working on (generating code for) the network block protocol for the plumbing. That can go on in parallel to the other work. I have also found a different crypto toolkit which is smaller than OpenSSL and which is native C++, so will work better with the Plumber (which is C++). The crypto toolkit is called "Crypto++". Now all I need is a working C++ compiler (the one that comes with Red Hat 7.1 is busted, it will not create executables, it won't even compile "hello_world.cpp", I'm waiting for the fixed one to download from ftp.redhat.com). On Sat, 7 Jul 2001, Richard Fish wrote: > On Wed, 4 Jul 2001, Eric Lee Green wrote: > > This may not work with backups to Jazz disks, DVD-RAM disks, or MO disks. > > Given my current employer, I very much want this to work with DVD-RAM > > disks and MO disks :-). I will experiment tomorrow and find out exactly > > what kind of write sizes these jukeboxes will handle. I know that the MO > > disks do require all writes to be at least 2K, or a multiple thereof, > > Well, my hope is that we could provide a buffer of a particular size, and > have the driver figure out how to transfer that to the device. As long as I did some experiments, and it turns out that the Linux drivers, at least, will handle breaking up any writes to the MO drive to 2K chunks. But the writes have to be a multiple of 2K. This indicates that we are going to need to maintain some information about block devices that's different from what we maintain about tape devices, and bear those in mind when we are deciding what block size to use for a particular backup, but I don't see any real problem there. 
The central authority knows what devices the data is going to be saved to,
and can tailor a buffer size that's a multiple of the block size for all
of them.
> > obvious. If I have one system with 100gb of data, and one system with 5gb
> > of data, and multiplex the two, then the second stream will finish
> > probably before the end of the first volume, except the actual tape writer
> > really has no way of knowing that. It knows it hasn't seen any data from
> > that particular stream for a while, but doesn't know whether it's just a
> > case of the source for that stream being busy or something else.
>
> Well, it should get an EOF on the socket. That will tell it the stream is
> done, at which point, it can mark it as inactive/finished/whatever.
The multiplexor gets an EOF, but the actual tape writer, downstream of it,
doesn't. It just knows it's getting blocks, and plunks them to disk or
tape or whatever.
> It's a very nice feature from an administration standpoint, because I can
> say I want to back up my entire workgroup of 30 machines, and multiplex
> them 4 at a time, and let the software figure out how to do it. It avoids
> me having to micro-manage the backup, and ensure the backup is completed
> in the least amount of time.
I like this. It does mean that we must have the central controller assign
all archive ID's rather than the client.
> > We'll need to think about restores, and how to specify what to be
> > restored. I'm thinking that the low level restorer thingy will just be fed
> > a sequence of block numbers, and will fetch those blocks and put them out.
> > Then something upstream will actually strip the proper file info out of
> > the stream. This requires a simple SQL statement to grab the block numbers
> > for the files and output them. Hmm, that tends to indicate that we want a
> > 'last block' field in the database too, for indicating the block that
> > contains the end of the file.
> > (and perhaps can indicate a range of blocks
> > to the restorer thingy).
>
> Hmm, this would indicate that we can't do anything less than a full
> restore without a catalog, even using low-level, command-line accessible,
> utilities. I can accept that.
Naw. If you feed a block range of -1:-1 to the restorer, it just dumps the
whole volume set to the output. Upstream, you can use a filter that
filters it by filename or whatever criteria you want, but the low level
tape reader knows nothing about files. It can't know anything about files,
because files depend upon knowing the details of archive streams. All it
knows about is blocks. But another widget upstream can strip out the file
data that we're interested in.
This is pretty much how 'tar' already works. If you restore a tape
with 'tar' and just want file '/foo/bar', it'll read through the entire
darned tape looking for /foo/bar, even after it's already restored
/foo/bar. Obviously we want to use a catalog and only dump the blocks of
interest, but I really would prefer that the restore scanner thingy not
have to know anything about filesystems. Paths and path extensions vary
greatly depending upon operating system, and some things don't even really
have a path (like an Oracle database dump stream using, hmm, what was the
name of that standard protocol for getting a dump stream from databases?).
> The alternative is to have the lowlevel archive/backup/whatever scanner
> have the ability to also look for paths and path extensions. Essentially,
> to treat the archive/backup/whatever as a filesystem tree.
I'd prefer that the tape/disk/punch card/etc. low level drivers not need
to know anything about files, and that this all be decided upstream.
They know about archive streams -- that should be easy enough, we've
already specified that any given i/o block contains data only from a
single archive stream, we can put that ID at the top of the i/o block and
filter on it -- but because we want to be able to restore any kind of
data, not just filesystem tree data, I'd prefer that we *not* embed any
kind of filesystem logic. For example, I can foresee an LDAP data sourcer
that would produce a stream of LDAP-formatted data. There's no reasonable
way to treat that as a filesystem dump or restore, and I wouldn't even
try. Let some specific LDAP-knowledgeable widget handle knowing what an
LDAP key looks like and how to set LDAP variables in the destination LDAP
directory.
> If we eliminate this concept, then we can also eliminate certain other
> unstated requirements that were in my head, like that all entries in a
> particular directory be contiguous within a single archive stream. I.e.,
> that /usr/bin comes after /usr and before anything else not in the /usr
> hierarchy.
Correct. We want to keep the actual low-level i/o engine as simple and
stupid as possible, because I envision that we will actually have dozens
of these guys eventually -- e.g., one that will back up to CD-R's, one
that will back up to DVD-RAM disks, one that will back up to a sequence of
2GB files, etc. This also indicates that we don't want to think in terms
of "block numbers" -- we want to think in terms of "location identifiers".
For example, for the sequence of 2GB files case, the actual location of a
file could be "/data/storage/345:32768", meaning that the data we want is
at location 32768 in file /data/storage/345.
> This could lead to some other interesting features, like multi-threading
> backups on the agent and processing different filesystems in parallel.
> Heck, we could even consider breaking a single backup-object (file,
> database, etc) into segments that could be interspersed with other
> segments within the archive stream!
> Although, I think that is complexity
> we can avoid at this point.
See my sequence of 2GB files case :-). I do agree that this would allow
some neat stuff like the multi-threading backups on the agent! But we'll
leave that for future versions :-).
> As far as the database goes though, I think we primarily want to store the
> starting block of a new backup object (file, database, etc), rather than
> all of the blocks or range of blocks that contain it. I think if we try
> to describe all of the blocks that make it up, we will essentially end up
> indexing every single archive block in the database, and that is a *lot*
> of data.
We can't think in terms of blocks. We must think in terms of location. As
for describing all blocks, that's actually fairly easy. You just have
two locations: the location containing the beginning of the file, and the
location containing the ending of the file. When you go to restore, you
feed that range to the restorer, after doing a 'sort' and 'uniq' on the
whole mess.
I think that the ending block is a pretty easy one for the 'processor'
widget for a particular type of archive to figure out. Here's what
happens. A processor widget gets a block on its input. It records what
start-of-files are in that block, and what end-of-files are in that block.
It then writes the block. It eventually gets a location back for that
block. It looks at all end-of-files for that block, goes and fetches the
start-of-files data for all end-of-files in that block (hash tables are
great, eh!), and writes out the database record. Quite simple. You forget
that C++ has standard vector and hash table classes as part of the STL!
(Of course we have to remember to zap the data out of the hash table after
we write it!). I'm starting to feel a *LOT* better about using C++ than I
was feeling a few weeks ago.
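A minimal sketch of the processor-widget bookkeeping described above,
under some assumptions: the names `Processor`, `CatalogRecord` and
`block_written` are hypothetical, locations are plain strings in the
"/data/storage/345:32768" style, and `std::unordered_map` (the hash table
that later became part of standard C++) stands in for whatever hash table
class the project would actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// One catalog row: where a file begins and ends on the media.
struct CatalogRecord {
    std::string file;
    std::string first_location;
    std::string last_location;
};

class Processor {
public:
    // Call once a block has been written and its location is known.
    // `starts` and `ends` list the files that begin or end in that block.
    void block_written(const std::string& location,
                       const std::vector<std::string>& starts,
                       const std::vector<std::string>& ends) {
        // Remember where each newly started file begins.
        for (const std::string& f : starts) open_[f] = location;
        // For each file ending here, look up its start and emit a record.
        for (const std::string& f : ends) {
            auto it = open_.find(f);
            assert(it != open_.end());   // every end must follow a start
            catalog_.push_back({f, it->second, location});
            open_.erase(it);             // zap the entry once it is recorded
        }
    }

    const std::vector<CatalogRecord>& catalog() const { return catalog_; }
    std::size_t open_files() const { return open_.size(); }

private:
    std::unordered_map<std::string, std::string> open_;  // file -> start location
    std::vector<CatalogRecord> catalog_;
};
```

A real processor would write each record to the SQL database instead of
keeping it in a vector; the vector only keeps the sketch self-contained.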
Having these kinds of classes sitting there for use means we can think of
doing things like this, like we'd do them in Python, without worrying
about all the code we'd have to write to do it.
> This does mean that the multiplexing writer thingy needs to be fair in how
> it chooses what archive stream to service next, so that we don't have 10GB
> of data to read through to find the next block, unless the system it came
> from had some kind of problem.
I agree. It should attempt to do a round-robin service on all of the
available inputs.
> There could be some efficiency gained on a restore from being able to use
> QFA seeks to skip ahead, but I don't think it's worth the extra catalog
> storage.
Again, I agree. The range is enough. The fact that we had to read 30
blocks to restore 5 blocks of data is trivial for most cases.
> > The platform and type of stream is indicated by the writer object. An NT
> > writer object will have a different object ID than a Linux EXT2 writer
> > object or a MacOS writer object, because they have different data formats.
> > Writer objects may have, e.g., a pathname translator function, associated
> > with them. We don't need to put more indicator ID's into the headers,
> > because these indicator ID's are associated with the object that is
> > creating this stream. Sort of, if I have a code 0x5324 that is a "Oracle
> > NT Database Dump Object", this object may have a pathname code of 0x53
> > associated with it ("SQL Database/Table Names"), an originating OS code of
> > 0x05 (Win32), etc... but all we need in the records on tape is the object
> > id of the creating object.
> >
> > This means we have a central repository of object information, but that's
> > easy enough to accomplish (we're going to have a SQL database, after
> > all!).
>
> I think I got it....what if we could interconnect the writer objects for a
> restore.
> Let me explain: during a backup, the process is simple and
> obvious (from the data flow perspective):
>
>     NTFS file --> NTFS_File_Obj --> ArchiveObj -> ....
>     EXT2FS file --> EXT2_File_Obj --> ArchiveObj -> ....
>     Oracle DB --> Oracle_DB_Obj ....
>
> Let's think about those File and DB objects for a second -- they need to
> read object specific data (ACLs, extended bits, file attributes, datafile
> locations, etc), format it into a form they can read later, and read and
> format the data from the file or database. All of that gets sent off to
> the archiving process.
>
> For a restore to the original location, the process looks like:
>
>     ... --> ArchiveObj -{NTFS_File_Stream}-> NTFS_File_Obj --> NTFS file
>     ...
>
> In the above, the NTFS_File_Obj gets (via push or pull) the same data
> stream(s) it originally stored via the archive object (we hope!). Since
> it wrote that data, it knows how to decode it to get the filename, file
> attributes, and the data section. So, it has no problem recreating the
> original file, with the original attributes, etc.
>
> The problem is how to handle the foreign data case:
>
>     ... -> ArchiveObj -{NTFS_File_Stream} -> EXT2_File_Obj -> EXT2FS file
> or
>     ... -> ArchiveObj -{NTFS_File_Stream} -> NTFS_File_Obj -> EXT2FS file
>
> One way is as you suggest, to have a translator object to convert the
> NTFS_File_Stream into an EXT2_File_Stream, that can be read by the
> EXT2_File_Obj. But that involves a set of objects that have to be
> maintained in sync with the writer objects. Instead, what if we leverage
> the knowledge and code already in the NTFS_File_Obj, and tell it to write
> its data via an EXT2_File_Obj. It looks like this:
>
>     ... -> ArchiveObj -{NTFS_File_Stream} -> NTFS_File_Obj
>         --> EXT2_File_Obj -> EXT2FS file
You still must have a converter object here to, e.g., map from an NTFS
filename to an EXT2FS filename.
Whether the conversion is done at a low level via calling a method for an
object or via an external translator component, it's going to have to be
done. Note that the converter object can most certainly use the C++
classes used by the writer object to handle most of that, I certainly
wasn't suggesting that we had to maintain two totally separate sets of
classes, just that there was a step involved in translating from one
format to another.
We might think further on how to make this translation process easier, but
I do not think we need to worry about it for the moment. For the moment it
will probably suffice to say "you can restore NT files only to NT systems,
and LDAP directories only to LDAP directories". The beauty of a component
architecture is that you can always add more components into the stream
later. We want something that works in a relatively short time. I'm tired
of tarring up my notebook to my desktop then backing up my desktop, I want
a real network backup again and no, I'm not going to install Arkeia to do
it, I want something Open Source!
> the important thing is the 'translation' is done by the writer/reader
> objects themselves, without needing another class of objects to handle
> that.
No problem with that, just noting that we must have format translation
somewhere if we are going to do cross-platform restores, whether it is as
part of writer objects or as separate components that sit in the stream.
Separate components (built with those objects, sitting in a pipeline
between source and destination) are easier to hack into the pipeline
later, but impose a performance penalty. Still, how often are we going to
do cross-platform restores, and is the performance penalty going to be
severe enough that we really care?
> Now, there are some connections that make no sense. For example, restoring
> file data to an Oracle database object would be nonsensical. So there
> are a couple simple rules to implement in code:
>
> 1. File objects connect only to other file objects.
> 2. All other objects connect to file objects, or themselves.
All I would suggest is that we keep things simple wherever possible.
"Release early, release often" is the goal. If we get working code on the
site, then we can possibly get other contributors to do things like, e.g.,
hack on translation objects. Remember, this isn't you and me and Randy
sitting in our cubicles anymore, this is an Open Source project, and the
more we can parallelize the development, the more hackers we can attract
to working on it and have them make meaningful contributions. But nobody's
going to participate until we can do actual network backups and restores.
Thus I suggest that we defer talk on format translations, and concentrate
on the low level block format.
> > I think an I/O block size has to be passed to the low-level stream
> > creator, because it is in the best position to figure out the best way
> > of spanning (or padding) at I/O block boundaries. An after-the-fact
> > 'chunker' is not as good there.
>
> Except that means that the lowlevel stream objects need to know about the
> archive format, header sizes, etc. That is not good.
No it doesn't. It needs to know that x bytes are reserved for a header,
but then it just fills in the rest of the buffer with its data and passes
it on to the next component, and so on down the line, with each
component filling in its own piece of the puzzle as desired. This does
mean that it needs to "know" that it owns, say, bytes 256 to (n-12) of a
32768-byte buffer block, but that's not too difficult.
> I think the stream creators create exactly that, a stream of data (here's
> some data, write it to the archive).
By "archive" do you mean the sum total of data being written to tape? Or
do you mean a single backup stream?
For simplicity's sake, it makes sense to organize each backup stream into
tape-block-sized (or some kind of io-block-sized) chunks, each of which is
tagged with what backup stream wrote it. This way the low level tape
reader can read back stuff based on what backup stream it came from,
without knowing anything about what's actually in those io-block-sized
chunks.
I think I see where you're going. You're saying that the stream processor
-- the thingy that takes the raw stream of objects from the stream creator
and does any processing necessary and figures out what data needs to be
logged by the database to e.g. indicate start and end of files -- should
be the one that actually chunks it into IO-buffer-sized blocks, because
that makes it easier for it to decide where files start and stop (and span
things across block boundaries) without having to re-parse the
IO-buffer-sized block back into its component raw stream blocks.
Okay. That's fair. I think that'll work. It'll also make the agent side
code smaller, which makes it more feasible that we could possibly create
rescue disks for this thing, since the stream processor
This way filesystem readers can handle the spanning in a way that makes
sense. They know, e.g., that if they have a 'filename' block that's 105
characters long, it makes no sense to span that across a buffer boundary
that's 30 characters away, so they can create a 'padding' block of 30
bytes, and continue on to the next block.
There is no such thing as an 'archive object', by the way, just an
'archive block' object which the various writers know how to read and
write off of tape or disk or whatever media we're backing up to. I think
we need to get away from the whole notion of an "archive". What we have is
a "backup", which consists of one or more "backup streams", each of which
could represent any kind of data (some could be filesystem backup streams,
another might be a backup of the NT registry, etc.).
Each kind of backup stream type must have its own routines for deciding
what sub-blocks within the overall 'archive block' mean, and thus for
chunking data.
> The archive object inserts whatever
> headers and structures it needs to be able to validate and return that
> data back to the stream object for a restore. It also takes care of
> overflowing data across block boundaries (with new structures, etc).
Again, I state that we should ban the word 'archive' in favor of the terms
'backup' and 'backup stream', where a 'backup' consists of blocks from
multiple 'backup streams'. That will probably save some confusion.
> The point is, that having any layer try to format/manage data for another
> layer is difficult to maintain, and violates the black-box encapsulation
> rules.
True, but somebody does have to know how to chunk things. We can't do
anything about that :-(. I do believe that moving the chunking out to the
processors, rather than putting it in the creators, is probably a good
thing because that results in smaller creators (and the creators are the
things that live out on the remote systems... the smaller we can get these
things, the more likely it is that we can make rescue disks with them).
> > Can we come up with catchier names than this? I do agree we have a
> > terminology problem here. 'archive streams', 'backup streams'. No?
>
> > Tomorrow I guess I get to work on terminology and see if I can come up
> > with some definitions that make sense.
>
> Let's see (brainstorming here),
>
> Addict  - (A)ttention (D)eficit (D)isorder (I)nfli(C)ting (T)ask
>           (aka. the multiplexor)
> Archie  - A 'stream' of data representing an object in an archive
> Archive - A collection of Archies representing a single backup
>           set (constrained to a single system?)
> Backup  - A collection of one or more archives, on one or more media
>           volumes.
> Vault   - The collection of all Backups done by Tapioca.
>
> Hmm, none of these have anything to do with pudding, desserts, or even
> food.
> Anybody else?
My head hurts. Let's move on to what a backup header (the thing at the
start of backup volumes) looks like, then what a Unix filesystem stream
looks like. Note that because we've defined this so flexibly, we can later
add, e.g., an ext2 filesystem stream thingy, etc., and by bumping the
object ID still be able to read the archive using the previous-generation
object. So I don't think we have to be quite as careful about the actual
(non-io-related) contents of the backup blocks as we thought, as long as
we get the backup block and backup header formats fixed in stone. From
then on, as long as we can pull out the object ID of the entity to
be used to restore the stream out of the backup header, we can restore
this, and if we change the format of a backup stream, we just bump our
object ID and retain a copy of the old restorer at the original object ID
so that we can continue to restore streams of that format. Yes, a pain,
and to be avoided whenever possible, but backwards compatibility cruft is
inevitable. You of all people should know that :-).
Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
|
|
From: Richard F. <rj...@fi...> - 2001-07-08 05:35:55
|
On Sat, 7 Jul 2001, Eric Lee Green wrote:
> Yeah. I made it to page 135 of Stroustrup V3 today. I think part of the
> deal is that we programmed in Python for 2 years, and thus know more about
> object-oriented programming than the last time you programmed in C++. You
> probably treated it a lot like "C" with some extra syntactic sugar last
> time, though you're right, namespaces are new, they weren't in Stroustrup
> version 2. I'm impressed with some of the STL classes, like hash tables,
> growable vectors, real strings (as vs. arrays of chars) that aren't
> susceptible to buffer overflows when you concatenate them together, etc. I
> smell some cross-breeding with Java here... Stroustrup V2 didn't have this
> fancy stuff.
Oh, and another favorite -- stringstream classes that provide a way to
produce formatted strings in memory without worrying about buffer
overflows. Nifty!
But yes, it does seem like the STL classes, which previously existed
"outside" of C++, have been incorporated into the language definition.
--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
|
From: Eric L. G. <er...@ba...> - 2001-07-08 04:26:25
|
On Sat, 7 Jul 2001, Richard Fish wrote:
> On Fri, 6 Jul 2001, Eric Lee Green wrote:
> > 1. The actual low level network communication widget just
> > sends stdin/stdout across the network, encrypting it via a stream cipher
> > (I suggest RC4, since OpenSSL does RC4 just fine).
> >
> > 2. At each end, there is a multiplexor for the multiple inputs
> > (stdout, stderr, various named pipes), and a de-multiplexor that
> > takes a stream coming from the other end and turns it back into the
> > multiple streams, sending those streams wherever.
>
> Sounds good. Saves us from having to open a second socket for
> stderr/control. And since we are talking about some kind of protocol
> between the MP and DMP, that can also incorporate a close command that
> causes a particular channel to close.
My biggest thing is keeping anything that actually touches the actual
network socket as simple as possible, so there's less chance of buffer
overflows, memory leaks (since this thing is long-running and persistent)
and such. When I got into it, I saw that I was going to need something
that was multi-threaded in order to handle the write part of
de-multiplexing (writers can block if the program at the other end of the
pipe is not ready for input yet, and that was not acceptable). I did *not*
want the thing sucking on the socket to be multi-threaded (at least not on
Unix), for reliability reasons. If they had to be separate processes, I
figured they might as well be separate programs too.
I've started laying out the class hierarchy in .h files and thinking about
how it all fits together, and yes, I've got a CLOSE command for i/o
channels in there, and some other stuff. I'm still laying out the packet
format, but I think I have one that will allow the actual network pieces
to be extremely simple and reliable and easy to audit for security, while
allowing the separate mux/demux that actually handles connections to do
fancy stuff like propagate selected environment variables, run simple
'recipes' telling it what pipes to open and what commands to start up with
each of them (sort of like scripts, but much simpler so that an
interpreter is easy to write for them for Windows), etc.
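As a rough illustration of what a frame header for the mux/demux protocol
could look like, here is a small sketch. The actual packet format was
still being laid out at this point, so everything here is an assumption:
the 7-byte layout, the names `FrameHeader`, `Command`, `encode` and
`decode`, and the rule that a CLOSE frame carries no payload. Only the
existence of a CLOSE command for i/o channels comes from the thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical command set; only CLOSE is mentioned in the thread.
enum class Command : std::uint8_t { Data = 0, Close = 1 };

struct FrameHeader {
    Command command;
    std::uint16_t channel;   // which multiplexed stream this frame belongs to
    std::uint32_t length;    // number of payload bytes following the header
};

constexpr std::size_t kHeaderSize = 7;

// Serialize the header in network (big-endian) byte order.
std::array<std::uint8_t, kHeaderSize> encode(const FrameHeader& h) {
    // Assumption of this sketch: a CLOSE frame has an empty payload.
    assert(h.command != Command::Close || h.length == 0);
    return {{
        static_cast<std::uint8_t>(h.command),
        static_cast<std::uint8_t>(h.channel >> 8),
        static_cast<std::uint8_t>(h.channel & 0xFF),
        static_cast<std::uint8_t>(h.length >> 24),
        static_cast<std::uint8_t>((h.length >> 16) & 0xFF),
        static_cast<std::uint8_t>((h.length >> 8) & 0xFF),
        static_cast<std::uint8_t>(h.length & 0xFF),
    }};
}

// Parse a header; returns false if the command byte is not recognized.
bool decode(const std::array<std::uint8_t, kHeaderSize>& b, FrameHeader& out) {
    if (b[0] > static_cast<std::uint8_t>(Command::Close)) return false;
    out.command = static_cast<Command>(b[0]);
    out.channel = static_cast<std::uint16_t>((b[1] << 8) | b[2]);
    out.length = (static_cast<std::uint32_t>(b[3]) << 24) |
                 (static_cast<std::uint32_t>(b[4]) << 16) |
                 (static_cast<std::uint32_t>(b[5]) << 8) |
                 static_cast<std::uint32_t>(b[6]);
    return true;
}
```

A fixed-size header with an explicit length keeps the socket-facing code
down to "read 7 bytes, then read `length` bytes", which fits the goal of
making that piece simple to audit.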
> > than writing from scratch. Sigh. Oh well, time to crack the Stroustrup.
>
> Yes, Stroustrup (Special Edition) is now my bible. I discovered that C++
> has changed a bit in the last few years. Either that, or my knowledge of
> C++ wasn't quite correct to begin with!
>
> For example, the C declaration "NULL" is now passe. You initialize
> pointers with "0", not NULL. It's a hard habit for me to break.
>
> Namespaces are also new, and something I haven't used.
Yeah. I made it to page 135 of Stroustrup V3 today. I think part of the
deal is that we programmed in Python for 2 years, and thus know more about
object-oriented programming than the last time you programmed in C++. You
probably treated it a lot like "C" with some extra syntactic sugar last
time, though you're right, namespaces are new, they weren't in Stroustrup
version 2. I'm impressed with some of the STL classes, like hash tables,
growable vectors, real strings (as vs. arrays of chars) that aren't
susceptible to buffer overflows when you concatenate them together, etc. I
smell some cross-breeding with Java here... Stroustrup V2 didn't have this
fancy stuff.
--
Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg
|
|
From: Richard F. <rj...@fi...> - 2001-07-07 20:55:40
|
On Fri, 6 Jul 2001, Eric Lee Green wrote:
> 1. The actual low level network communication widget just
> sends stdin/stdout across the network, encrypting it via a stream cipher
> (I suggest RC4, since OpenSSL does RC4 just fine).
>
> 2. At each end, there is a multiplexor for the multiple inputs
> (stdout, stderr, various named pipes), and a de-multiplexor that
> takes a stream coming from the other end and turns it back into the
> multiple streams, sending those streams wherever.
Sounds good. Saves us from having to open a second socket for
stderr/control. And since we are talking about some kind of protocol
between the MP and DMP, that can also incorporate a close command that
causes a particular channel to close.
> than writing from scratch. Sigh. Oh well, time to crack the Stroustrup.
Yes, Stroustrup (Special Edition) is now my bible. I discovered that C++
has changed a bit in the last few years. Either that, or my knowledge of
C++ wasn't quite correct to begin with!
For example, the C declaration "NULL" is now passe. You initialize
pointers with "0", not NULL. It's a hard habit for me to break.
Namespaces are also new, and something I haven't used.
--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
|
From: Richard F. <rj...@fi...> - 2001-07-07 20:07:54
|
On Wed, 4 Jul 2001, Eric Lee Green wrote:
> I think it makes everything easier except one thing: progress streams.
> We will have to see whether we can get a stream object out of an RMI
Another alternative is a reverse socket connection to a client-side thread
that will be handling the progress updates:
port = GimmeSocket(addr)
thread = ProgressMonitorThread(addr, port)
result = tapioca.exports.SendMeProgress(addr, port)
....
The tricky part here is what to do about attempts to monitor a job that
has already exited.
> > 1. Archive volumes are written in fixed blocks of a configurable size.
> > Minimum IO block size is 16k.
>
> This may not work with backups to Jazz disks, DVD-RAM disks, or MO disks.
> Given my current employer, I very much want this to work with DVD-RAM
> disks and MO disks :-). I will experiment tomorrow and find out exactly
> what kind of write sizes these jukeboxes will handle. I know that the MO
> disks do require all writes to be at least 2K, or a multiple thereof,
Well, my hope is that we could provide a buffer of a particular size, and
have the driver figure out how to transfer that to the device. As long as
we provide the driver a buffer that is an even multiple of its required
block size, it should be able to handle it.
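The "even multiple of the required block size" rule can be sketched as a
small helper. This is only an illustration: the function name
`common_buffer_size` and the sizes used with it are hypothetical, and it
assumes the controller already knows each target device's block size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Greatest common divisor (Euclid), used to build the least common multiple.
static std::size_t gcd_size(std::size_t a, std::size_t b) {
    while (b != 0) {
        std::size_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Smallest size >= minimum that is a whole multiple of every device's
// required block size. Hypothetical helper; all sizes must be non-zero.
std::size_t common_buffer_size(const std::vector<std::size_t>& device_block_sizes,
                               std::size_t minimum) {
    std::size_t lcm = 1;
    for (std::size_t bs : device_block_sizes) {
        assert(bs != 0);
        lcm = lcm / gcd_size(lcm, bs) * bs;
    }
    // Round the minimum up to the next multiple of the LCM (at least one).
    std::size_t multiples = (minimum + lcm - 1) / lcm;
    if (multiples == 0) multiples = 1;
    return multiples * lcm;
}
```

For example, with a 16k minimum and devices needing 2K and 512-byte
writes, the result stays at 16384; mixing in a device with a 3072-byte
block size would push it up to 18432.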
Otherwise, we may have to do what you suggested, introducing another layer
below the archive IO engine that manages the actual IO size (vs. the
archive block size), doing multiple read/write operations as required.
Actually, we have to do something like this for reading archives from
pipes, since Linux pipes will never give you more than PAGESIZE from a
single read operation, and may give you less.
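The usual way to cope with short reads from a pipe is a loop around the
POSIX read() call. The sketch below shows that pattern; the helper name
`read_full` is made up for illustration and is not an existing Tapioca
function.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>   // POSIX read(), ssize_t

// Keep calling read() until `count` bytes have arrived, EOF is reached, or
// an error occurs. Returns the number of bytes read (short only at EOF),
// or -1 on error.
ssize_t read_full(int fd, char* buf, std::size_t count) {
    assert(buf != nullptr);
    std::size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, buf + done, count - done);
        if (n == 0) break;                  // EOF: the writer closed its end
        if (n < 0) {
            if (errno == EINTR) continue;   // interrupted by a signal, retry
            return -1;
        }
        done += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(done);
}
```

The archive reader can then ask for a whole archive block at a time and
treat a short count as end of stream.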
> obvious. If I have one system with 100gb of data, and one system with 5gb
> of data, and multiplex the two, then the second stream will finish
> probably before the end of the first volume, except the actual tape writer
> really has no way of knowing that. It knows it hasn't seen any data from
> that particular stream for a while, but doesn't know whether it's just a
> case of the source for that stream being busy or something else.
Well, it should get an EOF on the socket. That will tell it the stream is
done, at which point, it can mark it as inactive/finished/whatever.
> > we could say that this archive will consist of 8 streams, but only 3 can
> > be written concurrently. This would allow the other 5 streams to be
> > written as previous streams are completed.
>
> The problem here is that if we want our volume header to include a list of
> *all* streams in the volume, that seems to indicate that either a) the
> central authority assigns stream ID's (thus knows what stream ID's are
> going to be assigned to everything started up), or b) everything starts up
> at the same time, and some are 'put to bed' for a while. I don't like the
> 'put to bed' thing, because sockets can time out, connections can time
> out, IP masq table entries can time out, etc. Nasty.
>
> So I guess we *could* do this. What we need to do, then, is have the
> central authority assign the stream ID's.
It's a very nice feature from an administration standpoint, because I can
say I want to back up my entire workgroup of 30 machines, and multiplex
them 4 at a time, and let the software figure out how to do it. It avoids
me having to micro-manage the backup, and ensure the backup is completed
in the least amount of time.
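One way to picture "multiplex them 4 at a time" is a controller that
hands out stream IDs up front and only admits a limited number of streams
at once. The class below is a sketch under that assumption; the name
`BackupScheduler` and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <string>
#include <utility>
#include <vector>

class BackupScheduler {
public:
    BackupScheduler(const std::vector<std::string>& hosts, std::size_t max_concurrent)
        : max_concurrent_(max_concurrent) {
        assert(max_concurrent > 0);
        // The controller, not the client, assigns stream IDs (1, 2, 3, ...).
        for (std::size_t i = 0; i < hosts.size(); ++i)
            pending_.push_back({static_cast<int>(i + 1), hosts[i]});
    }

    // Start as many pending streams as the limit allows; returns their IDs.
    std::vector<int> start_ready() {
        std::vector<int> started;
        while (active_.size() < max_concurrent_ && !pending_.empty()) {
            started.push_back(pending_.front().first);
            active_.insert(pending_.front().first);
            pending_.pop_front();
        }
        return started;
    }

    // Called when a stream completes; frees a slot for the next host.
    void finish(int stream_id) { active_.erase(stream_id); }

    bool done() const { return pending_.empty() && active_.empty(); }

private:
    std::size_t max_concurrent_;
    std::deque<std::pair<int, std::string>> pending_;  // (stream ID, host)
    std::set<int> active_;
};
```

Because every ID is known before the first stream starts, the full list
of streams could still be written into a volume header, and nothing has
to sit idle on an open connection waiting for its turn.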
> > 3. Each IO block contains a header that specifies its archive stream ID,
> > sequence number, checksum, etc.
>
> The actual writer probably wants a backup stream ID and backup stream
> sequence number and checksum. It doesn't know anything about archive
> streams, it just knows what I/O blocks look like. Some other widget
> that knows about archive streams actually writes header info.
I think we agree, and are just getting terminology mixed up.... :-(
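To make the "stream ID, sequence number, checksum" header concrete, here
is a sketch of one possible encoding. None of it is the agreed format:
the 12-byte big-endian layout, the names `BlockHeader`, `make_block` and
`read_block`, and the toy additive checksum are all placeholders, and
padding out to a fixed I/O block size is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct BlockHeader {
    std::uint32_t stream_id;   // which backup stream wrote this block
    std::uint32_t sequence;    // position of the block within that stream
    std::uint32_t checksum;    // checksum of the payload
};

constexpr std::size_t kBlockHeaderSize = 12;

// Toy checksum: sum of the payload bytes modulo 2^32.
std::uint32_t payload_checksum(const std::vector<std::uint8_t>& payload) {
    std::uint32_t sum = 0;
    for (std::uint8_t b : payload) sum += b;
    return sum;
}

// Append a 32-bit value in big-endian order.
static void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Read a big-endian 32-bit value starting at `pos`.
static std::uint32_t get_u32(const std::vector<std::uint8_t>& in, std::size_t pos) {
    assert(pos + 4 <= in.size());
    return (static_cast<std::uint32_t>(in[pos]) << 24) |
           (static_cast<std::uint32_t>(in[pos + 1]) << 16) |
           (static_cast<std::uint32_t>(in[pos + 2]) << 8) |
           static_cast<std::uint32_t>(in[pos + 3]);
}

// Build header + payload as one block.
std::vector<std::uint8_t> make_block(std::uint32_t stream_id, std::uint32_t sequence,
                                     const std::vector<std::uint8_t>& payload) {
    std::vector<std::uint8_t> block;
    put_u32(block, stream_id);
    put_u32(block, sequence);
    put_u32(block, payload_checksum(payload));
    block.insert(block.end(), payload.begin(), payload.end());
    return block;
}

// Read the header back and verify the payload; false means corruption.
bool read_block(const std::vector<std::uint8_t>& block, BlockHeader& header) {
    if (block.size() < kBlockHeaderSize) return false;
    header.stream_id = get_u32(block, 0);
    header.sequence = get_u32(block, 4);
    header.checksum = get_u32(block, 8);
    std::vector<std::uint8_t> payload(block.begin() + kBlockHeaderSize, block.end());
    return payload_checksum(payload) == header.checksum;
}
```

The writer in this sketch never looks inside the payload, which matches
the point above that it only needs to know what I/O blocks look like.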
> We'll need to think about restores, and how to specify what to be
> restored. I'm thinking that the low level restorer thingy will just be fed
> a sequence of block numbers, and will fetch those blocks and put them out.
> Then something upstream will actually strip the proper file info out of
> the stream. This requires a simple SQL statement to grab the block numbers
> for the files and output them. Hmm, that tends to indicate that we want a
> 'last block' field in the database too, for indicating the block that
> contains the end of the file. (and perhaps can indicate a range of blocks
> to the restorer thingy).
Hmm, this would indicate that we can't do anything less than a full
restore without a catalog, even using low-level, command-line accessible,
utilities. I can accept that.
The alternative is to have the lowlevel archive/backup/whatever scanner
have the ability to also look for paths and path extensions. Essentially,
to treat the archive/backup/whatever as a filesystem tree.
If we eliminate this concept, then we can also eliminate certain other
unstated requirements that were in my head, like that all entries in a
particular directory be contiguous within a single archive stream. I.e.,
that /usr/bin comes after /usr and before anything else not in the /usr
hierarchy.
This could lead to some other interesting features, like multi-threading
backups on the agent and processing different filesystems in parallel.
Heck, we could even consider breaking a single backup-object (file,
database, etc) into segments that could be interspersed with other
segments within the archive stream! Although, I think that is complexity
we can avoid at this point.
As far as the database goes though, I think we primarily want to store the
starting block of a new backup object (file, database, etc), rather than
all of the blocks or range of blocks that contain it. I think if we try
to describe all of the blocks that make it up, we will essentially end up
indexing every single archive block in the database, and that is a *lot*
of data.
I think the restorer gets a starting block, an archive stream id, a data
stream id, and reads the media pulling out the blocks it needs. There
will be an indicator in the archive that a particular data stream is now
finished, so we don't even need the ending block.
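A minimal sketch of that restore scan, in Python. The Block layout, the
end_of_stream flag, and the function name are assumptions made for
illustration; nothing here is the actual Tapioca format.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Block:
    archive_stream_id: int       # which multiplexed archive stream owns this block
    data_stream_id: int          # which backup object (file, database, ...) within it
    payload: bytes
    end_of_stream: bool = False  # hypothetical "this data stream is finished" marker


def restore_stream(media: Iterable[Block], start_block: int,
                   archive_stream_id: int, data_stream_id: int) -> Iterator[bytes]:
    """Yield the payloads of one data stream, scanning forward from start_block."""
    wanted = (archive_stream_id, data_stream_id)
    for index, block in enumerate(media):
        if index < start_block:
            continue  # a real reader would seek to start_block instead
        if (block.archive_stream_id, block.data_stream_id) != wanted:
            continue  # block belongs to some other multiplexed stream
        yield block.payload
        if block.end_of_stream:
            return  # the in-archive marker replaces a stored ending block
```

Only the starting block comes from the catalog here; everything after it is
found by reading the media.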
This does mean that the multiplexing writer thingy needs to be fair in how
it chooses what archive stream to service next, so that we don't have 10GB
of data to read through to find the next block, unless the system it came
from had some kind of problem.
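One simple way to get that fairness is plain round-robin. This is only a
sketch under assumptions: the function name is made up, and each stream is
modelled as an iterator of ready-made blocks, whereas a real writer would
also have to skip streams that have no data ready yet.

```python
from collections import deque
from typing import Dict, Iterator, Tuple


def round_robin_multiplex(streams: Dict[int, Iterator[bytes]]) -> Iterator[Tuple[int, bytes]]:
    """Yield (stream_id, block), taking one block from each live stream in turn."""
    queue = deque(streams.items())
    while queue:
        stream_id, blocks = queue.popleft()
        block = next(blocks, None)
        if block is None:
            continue  # stream finished: drop it from the rotation
        yield stream_id, block
        queue.append((stream_id, blocks))  # back of the line until its next turn
```

With this policy the distance between two consecutive blocks of one stream
is bounded by the number of live streams.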
There could be some efficiency gained on a restore from being able to use
QFA seeks to skip ahead, but I don't think it's worth the extra catalog
storage.
> I'm not sure that there's much we can do about corruption in today's world
> except report it, write the block somewhere in case a human needs to see
> it, and continue. If we've trashed the archive somehow (via a defective
> program or whatever), I certainly don't want to restore corrupted data!
Of course, we could always do ECC for the headers (not data), but then the
format gets to be *really* complex (it's going to be complex enough as it
is!), and I suspect our performance would get hit pretty hard.
> The platform and type of stream is indicated by the writer object. An NT
> writer object will have a different object ID than a Linux EXT2 writer
> object or a MacOS writer object, because they have different data formats.
> Writer objects may have, e.g., a pathname translator function, associated
> with them. We don't need to put more indicator ID's into the headers,
> because these indicator ID's are associated with the object that is
> creating this stream. Sort of, if I have a code 0x5324 that is a "Oracle
> NT Database Dump Object", this object may have a pathname code of 0x53
> associated with it ("SQL Database/Table Names"), an originating OS code of
> 0x05 (Win32), etc... but all we need in the records on tape is the object
> id of the creating object.
>
> This means we have a central repository of object information, but that's
> easy enough to accomplish (we're going to have a SQL database, after
> all!).
I think I got it....what if we could interconnect the writer objects for a
restore. Let me explain: during a backup, the process is simple and
obvious (from the data flow perspective):
NTFS file --> NTFS_File_Obj --> ArchiveObj -> ....
EXT2FS file --> EXT2_File_Obj --> ArchiveObj -> ....
Oracle DB --> Oracle_DB_Obj ....
Let's think about those File and DB objects for a second -- they need to
read object specific data (ACLs, extended bits, file attributes, datafile
locations, etc), format it into a form they can read later, and read and
format the data from the file or database. All of that gets sent off to
the archiving process.
For a restore to the original location, the process looks like:
... --> ArchiveObj -{NTFS_File_Stream}-> NTFS_File_Obj --> NTFS file
...
In the above, the NTFS_File_Obj gets (via push or pull) the same data
stream(s) it originally stored via the archive object (we hope!). Since
it wrote that data, it knows how to decode it to get the filename, file
attributes, and the data section. So, it has no problem recreating the
original file, with the original attributes, etc.
The problem is how to handle the foreign data case:
... -> ArchiveObj -{NTFS_File_Stream} -> EXT2_File_Obj -> EXT2FS file
or
... -> ArchiveObj -{NTFS_File_Stream} -> NTFS_File_Obj -> EXT2FS file
One way is as you suggest, to have a translator object to convert the
NTFS_File_Stream into an EXT2_File_Stream, that can be read by the
EXT2_File_Obj. But that involves a set of objects that have to be
maintained in sync with the writer objects. Instead, what if we leverage
the knowledge and code already in the NTFS_File_Obj, and tell it to write
its data via an EXT2_File_Obj. It looks like this:
... -> ArchiveObj -{NTFS_File_Stream} -> NTFS_File_Obj
--> EXT2_File_Obj -> EXT2FS file
In this model, the NTFS_File_Obj still does the decoding of the data
stream it wrote. But rather than create and write a file directly on the
filesystem, it was given a file object to do that. It assigns a path (via
the platform independent representation) to the EXT2 file object, tells the
object to open for writing, and writes only the file data to it. We could
even make these things look like C++ stream objects, by giving them '<<'
and '>>' operators!
This can also be generalized for the restore to original case:
... -> ArchiveObj -{NTFS_File_Stream} -> NTFS_File_Obj(1)
-{calls and functions}-> NTFS_File_Obj(1) -> NTFS file
In the above, the first NTFS_File_Obj can see that it is 'writing' to
itself, and thus in addition to the steps above, can call its own
functions to set ACLs, owner, permissions, etc.
The second object is determined by what system/filesystem we are restoring
to. Obviously, that can only be done once we know where the file is
going, so the first step is to initialize the first object, get its path,
and determine the second, platform specific object, based on that. But
the important thing is the 'translation' is done by the writer/reader
objects themselves, without needing another class of objects to handle
that.
Now, there are some connections that make no sense. For example, restoring
file data to an Oracle database object would be nonsensical. So there
are a couple simple rules to implement in code:
1. File objects connect only to other file objects.
2. All other objects connect to file objects, or themselves.
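A rough Python sketch of the two rules and of one object writing through
another. All class and method names are hypothetical, and the decoding step
is left out: restore() is simply handed the path, attributes, and data that
the source object would have decoded from its own stream.

```python
class BackupObj:
    """Common behaviour for reader/writer objects (all names are hypothetical)."""
    kind = "other"

    def __init__(self):
        self.path = None
        self.data = bytearray()  # stands in for the restored file or database
        self.attrs = {}

    def open_for_write(self, path):
        self.path = path

    def write(self, chunk):
        self.data += chunk

    def set_native_attrs(self, attrs):
        self.attrs = dict(attrs)


class FileObj(BackupObj):
    kind = "file"


class NTFSFileObj(FileObj):
    pass


class EXT2FileObj(FileObj):
    pass


class OracleDBObj(BackupObj):
    kind = "database"


def can_connect(source, target):
    """Rule 1: file objects connect only to file objects.
    Rule 2: other objects connect to file objects, or to their own type."""
    if source.kind == "file":
        return target.kind == "file"
    return target.kind == "file" or type(target) is type(source)


def restore(source, target, path, attrs, chunks):
    """Push the data 'source' decoded through 'target', which does the writing."""
    if not can_connect(source, target):
        raise ValueError("nonsensical connection")
    target.open_for_write(path)
    for chunk in chunks:
        target.write(chunk)
    if type(target) is type(source):
        target.set_native_attrs(attrs)  # 'writing to itself': ACLs etc. apply too
    return target
```

Restoring an NTFS stream through an EXT2FileObj writes only the file data,
while restoring it through another NTFSFileObj also applies the native
attributes.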
> I think an I/O block size has to be passed to the low-level stream
> creator, because it is in the best position to figure out the best way
> of spanning (or padding) at I/O block boundaries. An after-the-fact
> 'chunker' is not as good there.
Except that means that the lowlevel stream objects need to know about the
archive format, header sizes, etc. That is not good.
I think the stream creators create exactly that, a stream of data (here's
some data, write it to the archive). The archive object inserts whatever
headers and structures it needs to be able to validate and return that
data back to the stream object for a restore. It also takes care of
overflowing data across block boundaries (with new structures, etc).
Think about the way we deal with a TCP socket, and pretend that the
archive object presents the same functionality to a file stream. When
writing a TCP socket, we don't need to know the MTU of the ethernet device
it will eventually get carried over. Other layers deal with structuring,
fragmenting, and padding.
The archive works the same way: if a file stream is too big to fit in a
single archive block, it gets fragmented. Too small, it gets combined
with other file streams (which doesn't happen with ethernet, but oh well).
But as far as we (the low-level file stream thingy) are concerned, we have
a very simple send/recv type interface.
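To make the TCP analogy concrete, here is a small Python sketch of an
archive object that accepts arbitrary send() calls and does the fragmenting,
combining, and padding itself. The 4-byte record header and every name in
it are assumptions for illustration only, not a proposed on-tape format.

```python
import struct

RECORD_HEADER = struct.Struct(">HH")  # hypothetical: stream id, payload length


class ArchiveWriter:
    def __init__(self, block_size):
        if block_size <= RECORD_HEADER.size:
            raise ValueError("block too small to hold a record")
        self.block_size = block_size
        self.blocks = []             # completed fixed-size blocks
        self._current = bytearray()  # block being filled

    def send(self, stream_id, data):
        """Accept any amount of data; the caller never sees block boundaries."""
        view = memoryview(data)
        while len(view):
            room = self.block_size - len(self._current) - RECORD_HEADER.size
            if room <= 0:
                self._flush()        # block is full: start a new one
                continue
            piece = view[:room]      # fragment if the data does not fit
            self._current += RECORD_HEADER.pack(stream_id, len(piece)) + bytes(piece)
            view = view[len(piece):]

    def _flush(self):
        if self._current:
            self.blocks.append(bytes(self._current.ljust(self.block_size, b"\0")))
            self._current = bytearray()

    def close(self):
        self._flush()
        return self.blocks


def recv(blocks, stream_id):
    """Reassemble one stream's data from the fixed-size blocks."""
    out = bytearray()
    for block in blocks:
        offset = 0
        while offset + RECORD_HEADER.size <= len(block):
            sid, length = RECORD_HEADER.unpack_from(block, offset)
            if length == 0:
                break                # zero padding: rest of the block is empty
            offset += RECORD_HEADER.size
            if sid == stream_id:
                out += block[offset:offset + length]
            offset += length
    return bytes(out)
```

The file stream side only ever calls send() and recv(); block size, record
headers, and padding stay inside the archive object.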
We could also reverse the relationship between the lowlevel streams and
the archiver, and put the archiver in control, with the streams providing
the interface to the file/database:
<file/db stream to archiver> I'm a new stream to be archived
<archiver to stream> Put 29768 bytes of data at this address
<stream to archiver> I put 29768 bytes of data at that address
....
<stream to archiver> I have no more data (EOF).
....
This is essentially what I implemented in the fileio module in BRU, for
dealing with all types of files (regular, sparse, compressed, etc). This
allowed me to move the knowledge about how to read/write different files
out of the middle layers, which made them much, much simpler.
Unfortunately, the middle layer still knew everything about the archive
format, and was responsible for creating/decoding archive blocks to
provide to the archive layer.
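The archiver-driven exchange listed above might be sketched like this in
Python. FileStream, Archiver, and fill() are invented names, not the BRU
fileio interface; the point is only that the stream fills a buffer it is
handed and reports a count, with zero meaning EOF.

```python
import io


class FileStream:
    """Hypothetical low-level stream: reads its source, knows nothing of the archive."""

    def __init__(self, source):
        self._source = source

    def fill(self, buffer):
        """Put up to len(buffer) bytes into buffer; return the count, 0 meaning EOF."""
        return self._source.readinto(buffer)


class Archiver:
    def __init__(self, buffer_size):
        self.buffer = bytearray(buffer_size)
        self.received = []  # stands in for writing archive blocks

    def archive(self, stream):
        view = memoryview(self.buffer)
        while True:
            count = stream.fill(view)  # "put N bytes of data at this address"
            if count == 0:             # "I have no more data (EOF)"
                break
            self.received.append(bytes(view[:count]))
```

A sparse or compressed file reader would only need its own fill(); the
archiver loop stays the same.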
The point is, that having any layer try to format/manage data for another
layer is difficult to maintain, and violates the black-box encapsulation
rules.
> Can we come up with catchier names than this? I do agree we have a
> terminology problem here. 'archive streams', 'backup streams'. No?
> Tomorrow I guess I get to work on terminology and see if I can come up
> with some definitions that make sense.
Let's see (brainstorming here),
Addict - (A)ttention (D)eficit (D)isorder (I)nfli(C)ting (T)ask
(aka. the multiplexor)
Archie - A 'stream' of data representing an object in an archive
Archive - A collection of Archies representing a single backup
set (constrained to a single system?)
Backup - A collection of one or more archives, on one or more media
volumes.
Vault - The collection of all Backups done by Tapioca.
Hmm, none of these have anything to do with pudding, desserts, or even
food. Anybody else?
--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
|
From: Eric L. G. <er...@ba...> - 2001-07-07 03:28:52
|
1. The actual low level network communication widget just
sends stdin/stdout across the network, encrypting it via a stream cipher
(I suggest RC4, since OpenSSL does RC4 just fine).
2. At each end, there is a multiplexor for the multiple inputs
(stdout, stderr, various named pipes), and a de-multiplexor that
takes a stream coming from the other end and turns it back into the
multiple streams, sending those streams wherever.
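A minimal Python sketch of such a multiplexor and de-multiplexor. The
5-byte frame header (channel id plus length) is an assumption made for
illustration; the message does not specify a framing.

```python
import struct

FRAME_HEADER = struct.Struct(">BI")  # hypothetical: 1-byte channel id, 4-byte length


def mux(frames):
    """Turn (channel, payload) pairs into one byte stream."""
    return b"".join(FRAME_HEADER.pack(ch, len(p)) + p for ch, p in frames)


def demux(wire):
    """Split the byte stream back into per-channel data."""
    channels = {}
    offset = 0
    while offset < len(wire):
        ch, length = FRAME_HEADER.unpack_from(wire, offset)
        offset += FRAME_HEADER.size
        channels.setdefault(ch, bytearray()).extend(wire[offset:offset + length])
        offset += length
    return {ch: bytes(data) for ch, data in channels.items()}
```

Channel 1 could carry stdout, channel 2 stderr, and further ids the named
pipes; the network widget underneath still sees a single stream.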
The next question is whether to make the low level network communications
an SSL socket, or not. My own thought is "or not". The problem is key
management. Specifically, the only way to do one-time keys (needed to send
data from a remote system directly to a remote drive without causing a
security hole) is for the server to provide the one-time key to both
parties under its own key. After the key is used to authenticate the
single transaction allowed by it, it goes away.
By the time I have all the key management issues wrapped up in a way that
is suitable for backups (rather than for web servers), I've already
implemented far more code than a simple loop that does a recv() on a
socket, calls the stream decrypter on the buffer to decrypt it, then
writes the buffer to its stdout. I can't see any easy way to
use SSL certificate management to handle this problem, and am of the
opinion it's probably better just to write it from scratch.
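The "simple loop" could look roughly like this in Python. RC4 appears only
because the message suggests it (it is no longer considered safe for new
designs), and it is written out by hand here rather than taken from
OpenSSL. The pump() name and its recv/write callables are assumptions; in
practice they would be a socket's recv and the process's stdout write.

```python
class RC4:
    """Textbook RC4, kept stateful so it can decrypt buffer by buffer."""

    def __init__(self, key):
        s = list(range(256))
        j = 0
        for i in range(256):                 # key scheduling
            j = (j + s[i] + key[i % len(key)]) % 256
            s[i], s[j] = s[j], s[i]
        self._s, self._i, self._j = s, 0, 0

    def crypt(self, data):
        """XOR data with the next keystream bytes (encrypt and decrypt are the same)."""
        s, i, j = self._s, self._i, self._j
        out = bytearray()
        for byte in data:
            i = (i + 1) % 256
            j = (j + s[i]) % 256
            s[i], s[j] = s[j], s[i]
            out.append(byte ^ s[(s[i] + s[j]) % 256])
        self._i, self._j = i, j
        return bytes(out)


def pump(recv, write, cipher, bufsize=4096):
    """recv() a buffer, decrypt it, write it out; stop when the peer closes."""
    while True:
        buffer = recv(bufsize)
        if not buffer:
            break
        write(cipher.crypt(buffer))
```

Key distribution, the hard part discussed above, is deliberately not shown.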
I will spec this out further tonight, and probably start laying down
some code. I looked at OpenSSH. It is a mess. Not because the programmers
are incompetent, but because of all the backward compatibility stuff.
So modifying OpenSSH for what we want to do would probably take more time
than writing from scratch. Sigh. Oh well, time to crack the Stroustrup.
--
Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
|
|
From: Eric L. G. <er...@ba...> - 2001-07-05 19:24:46
|
On Wed, 4 Jul 2001, Richard Fish wrote:
> On Mon, 2 Jul 2001, Eric Lee Green wrote:
>
> > I've put the preliminary architecture documentation up at
> > http://tapioca.sourceforge.net . That should give you a good idea of how
> > this is structured. Aside from the 'pudding', the rest of the architecture
> > should be somewhat familiar, except for the parts inflicted by Java (like
> > the tape server being multi-threaded rather than multi-process).
>
> Looks good. One question I have though is the implementation of the
> tapicom protocol.
Right. I deliberately did not go into that. We're still quite a ways away
from caring about tapicom -- I want to get some network backups done
without having to worry about chunking things into tape-sized bits myself
to work around the limits of 'tar'! My home network needs something a
little higher power than 'tar', sigh...
> I have just updated my java reference library, and see that java has a
> built in RPC mechanism called RMI (Remote Method Invocation). It's
> simpler than CORBA or COM, although it only works between java objects.
Yes. This is definitely worth investigating for Tapicom, since the client
and the server will both be Java (as vs. the pudding, which is C or C++).
> Maybe we should consider using RMI instead? The advantage here to me is
> that we don't have to write a lot of code to format, write, read, and
> parse data for every operation we perform, while also trying to figure out
> the error detection, reporting, and handling mechanisms. If we want to
> add a user to the database, we just call something like:
>
> result = tapica.exports.NewUser(authObj, userObj)
>
> If the call fails, it raises an exception that can get handled by the
> client.
>
> Anyway, I think it's worth considering, since it makes the data and error
> handling much easier.
I think it makes everything easier except one thing: progress streams. We
will have to see whether we can get a stream object out of an RMI call.
The other alternative is to make progress streams client-driven rather
than server-driven (i.e., they robotically call a
'tapioca.exports.GetCurrStatus(currBackup)' to get the current status
info). I'm kind of reluctant to do that, because if the server is bogged
down, this will bog it down even worse (as vs. the case of the updates
simply getting slower, in a server-driven process).
> > Now it's time to decide on the archive format. We need to get this right
> > because changing it after we've started making backups will be a major
> > pain in the %$#@@#.
>
> Agreed!!!!
>
> > Some criteria:
> >
> > 1. Must be able to multiplex (and demultiplex) multiple streams into the
> > same tape file. This indicates that the input needs to be blocked in a
> > structured way, rather than just being a plain old stream of data, and
> > that each block needs to be tagged with a stream ID.
>
> Yep. Some proposed requirements:
>
> 1. Archive volumes are written in fixed blocks of a configurable size.
> Minimum IO block size is 16k.
This may not work with backups to Jazz disks, DVD-RAM disks, or MO disks.
Given my current employer, I very much want this to work with DVD-RAM
disks and MO disks :-). I will experiment tomorrow and find out exactly
what kind of write sizes these jukeboxes will handle. I know that the MO
disks do require all writes to be at least 2K, or a multiple thereof,
because I (ab)used 'dd' to 'erase' some disks last week when I was testing
the progress window that I wrote for a web app (progress windows are
*EASY* with Tomcat and Java Server Pages, because you can store the backup
thread object itself into the session object -- which can update its own
progress info, which the refresh will catch when the window does its
auto-refresh thing every 5 seconds -- I bet Randy is green with envy,
considering the stuff he had to go through to handle progress windows!).
(Hmm, perhaps client-driven status displays aren't so bad after all?)
(Hmm, or maybe not...)
> 2. Each volume begins with a volume header block that describes the volume
> and all still-active archive streams being multiplexed to that archive.
> Note that this means that additional streams cannot be added to an archive
> that is in process. However, it doesn't guarantee that this specific
> volume actually contains any data for that archive stream. For example,
This sounds quite good. As for the 'does not guarantee', that's pretty
obvious. If I have one system with 100gb of data, and one system with 5gb
of data, and multiplex the two, then the second stream will finish
probably before the end of the first volume, except the actual tape writer
really has no way of knowing that. It knows it hasn't seen any data from
that particular stream for a while, but doesn't know whether it's just a
case of the source for that stream being busy or something else.
> we could say that this archive will consist of 8 streams, but only 3 can
> be written concurrently. This would allow the other 5 streams to be
> written as previous streams are completed.
This would require some smarts on the part of the multiplexor, where we
feed it how many pipes we want active at startup, and have a monitoring
thread in the central authority that will monitor the deceasing of jobs
and add jobs to the pudding and tell the multiplexor (via a control
channel of some sort) "oh, by the way, you now have an additional stream
on pipe /var/lib/tapioca/tmp/tp2134.532 to multiplex into the backup."
The problem here is that if we want our volume header to include a list of
*all* streams in the volume, that seems to indicate that either a) the
central authority assigns stream ID's (thus knows what stream ID's are
going to be assigned to everything started up), or b) everything starts up
at the same time, and some are 'put to bed' for a while. I don't like the
'put to bed' thing, because sockets can time out, connections can time
out, IP masq table entries can time out, etc. Nasty.
So I guess we *could* do this. What we need to do, then, is have the
central authority assign the stream ID's.
> 2. Each IO block contains data for exactly one archive stream.
Uhm, Let's decide on terminology. The whole multiplexed/duplexed/whatever
stream of data is an archive stream? Or is it a backup stream? And the
individual backups, are these backup streams? Or archive streams? We have
a terminology confusion here :-). But I know what you mean here.
> 3. Each IO block contains a header that specifies its archive stream ID,
> sequence number, checksum, etc.
The actual writer probably wants a backup stream ID and backup stream
sequence number and checksum. It doesn't know anything about archive
streams, it just knows what I/O blocks look like. Some other widget that
knows about archive streams actually writes header info.
We'll need to think about restores, and how to specify what to be
restored. I'm thinking that the low level restorer thingy will just be fed
a sequence of block numbers, and will fetch those blocks and put them out.
Then something upstream will actually strip the proper file info out of
the stream. This requires a simple SQL statement to grab the block numbers
for the files and output them. Hmm, that tends to indicate that we want a
'last block' field in the database too, for indicating the block that
contains the end of the file. (and perhaps can indicate a range of blocks
to the restorer thingy).
I'm tired of calling the thingies in the pudding thingies. "component" is
so, uhm, generic. What do we call them?
> 4. The payload section of each IO block starts with a byte that indicates
> a structure type: data stream header, data stream resource, data stream
> data, etc. This is followed by the structure of the indicated type, and
> any data it carries.
Should I/O blocks also carry an 'originator type' ID? Or is it enough to
stick this in the volume header? In any event, payload has to be marked
somehow with what kind of thingy created the payload, so that we can
restore the payload using the correct kind of thingy. (Spoons. Tapioca.
Pudding. Spoons stir the pudding, no? Sorry, just getting silly here :-)
> 5. Within each IO block the structures are variable length, ensuring
> efficient space usage and performance.
>
> The downside to variable length structures and data is handling
> corruption. With a fixed-length, BRU-style format, if a file data block
> was corrupted, we could (and did!) just write the data section out to the
> file, and note the error. On a file header block, we could have just
> invented a file name, and wrote out the appropriate data section.
>
> But with variable length records, we can't even trust that the encoded
> length is correct, and we don't know for certain if there is another
> stream header structure in the block, and if so, where. We either have to
> write very complex (and error-prone) code to try and make sense of the
> corrupted data, or throw the whole block away and move on.
I'm not sure that there's much we can do about corruption in today's world
except report it, write the block somewhere in case a human needs to see
it, and continue. If we've trashed the archive somehow (via a defective
program or whatever), I certainly don't want to restore corrupted data!
> > 3. The tape format should not require doing a MT_TELL for every bloody
> > block written to tape, only for blocks that actually need it (i.e, blocks
> > that contain the beginning of a piece of data logged into the database).
> > This tends to indicate that blocks need tagging with a "type" field.
>
> Hmm, how do we handle the catalog rebuild (reading the archive) case? We
> don't know *if* we need to do an MT_TELL or not at the time we *need* to
> do the MT_TELL, before reading the block.
For rebuilding the catalog the MT_TELL thing is okay. It is buggy writers
that are a pain. We need some way to work around idiotic firmware or
idiotic device drivers that flush a tape drive's buffers upon write (thus
limiting performance to about 8K/sec!).
Now that I think about it, you're right, the MT_TELL is not a big deal on
restores or verifies. Writes are where it hinders performance. If we can
come up with some (optional) scheme to limit # of times we need to call
it, that'd be a big performance boost with some drives.
> > 4. The format should be able to handle two things other than raw
> > data blocks:
> > a) producing location information suitable for logging into the
> > central authority's location database for use in future restores,
> > and
> > b) holding any OS-specific data needed to fully restore the file.
>
> Right. And of course, we also want to be able to restore the data portion
> of a stream (file or otherwise) on a different OS.
That will require a filter of some sort in some cases, if only to convert
file names and permissions to a reasonable thing.
> > 5. The stream format will have to hold data about what kind of writer
> > produced the data in the file, so that the file logger can properly
> > account for the differences in display format and pass that data upstream
> > to the user interface. We don't want to force Unix filename format onto
> > Windows or Mac or etc.!
>
> Yep. Yet another lesson we learned! In the archive, paths should
> probably be encoded into some platform-independent form, so that it can be
> reconstructed for the platform we are restoring to. But we still want an
> indicator of the original platform, for catalog and display purposes.
> Oh, and let's not forget, the converters from the independent path to the
> native path format will need to check for and handle invalid characters in
> the path.
Yep :-)
> > Similarly, if we're backing up a database file
> > dump stream (one possible data source) we don't want to have to pretend
> > that it contains Unix-structured data, and we need to know it came from
> > a database stream dumper rather than from a filesystem dumper, so that
> > when we go to restore it we know what restorer to use!
>
> Yep. I'm finding it useful to think about streams as having 'names', not
> paths.
>
> > So each type of data stream creator will need a unique creator ID of
> > some sort to tell us what kind of widget created the data stream, and this
> > gets put into the header so that we can grab it and know what to restore
> > this data stream with.
>
> I've been thinking a lot about this, and am having difficulty. On the one
> hand, we want a general ID that indicates very generally what this stream
> is (directory, file, pipe data, database, etc), and a general way of
> accessing it for cross platform support.
> On the other hand, we also want to be able to identify a file as coming
> from an ext2 filesystem, so we can backup and restore the extended ext2
> bits. In other cases, platform and filesystem specific ACL's need to be
> handled.
>
> So it looks like we need at least 3 different indicator ID's for stream
> headers. The first (1 byte?) to indicate the type of stream (directory,
> file, pipe output, command output, etc), the second (1 byte?) to indicate
> the platform of the given type (Windows, Mac, Unix for directories and
> files; Oracle or MySQL for database streams, etc). The third (2 bytes?)
> further classifies the type of stream based upon its original writer
> object.
The platform and type of stream is indicated by the writer object. An NT
writer object will have a different object ID than a Linux EXT2 writer
object or a MacOS writer object, because they have different data formats.
Writer objects may have, e.g., a pathname translator function, associated
with them. We don't need to put more indicator ID's into the headers,
because these indicator ID's are associated with the object that is
creating this stream. Sort of, if I have a code 0x5324 that is a "Oracle
NT Database Dump Object", this object may have a pathname code of 0x53
associated with it ("SQL Database/Table Names"), an originating OS code of
0x05 (Win32), etc... but all we need in the records on tape is the object
id of the creating object.
This means we have a central repository of object information, but that's
easy enough to accomplish (we're going to have a SQL database, after
all!).
> This way, if we are running on an NT system where we don't have the Unix
> file object class, we can use more general file object class to process
I would suggest that we have a set of translation objects that can be
thrown into the pudding where necessary to do any kinds of translations
that we feel are needed. The NT restorer object thus doesn't have to know
about anything except NT file stuff. It doesn't have to know that up
higher, a "EXT2->NT" translator sat on the stream and re-wrote the header
data to make it reasonable.
> They also don't have to worry about processing data produced by other file
> objects, since they would only get data that they (or their subclasses)
> created.
The translator stuff works there too.
> Think about the archive API like a filesystem streams API. The file
> object just reads/writes data, and doesn't worry about how that gets
> stored on the media.
I think an I/O block size has to be passed to the low-level stream
creator, because it is in the best position to figure out the best way
of spanning (or padding) at I/O block boundaries. An after-the-fact
'chunker' is not as good there.
> > 8. Checksumming streams: We should probably only worry about checksumming
> > buffer-sized chunks of data, not individual blocks of structured data.
> We need some new terms. We are using 'streams' in the sense of archive
> streams and interplexing, and 'streams' in the sense of data streams to be
> archived. How about calling them something like d-streams and a-streams.
> A d-stream is a stream of data to be archived. An a-stream is an archive
> stream, and contains one or more d-streams. An archive is made up of one
> or more a-streams.
Can we come up with catchier names than this? I do agree we have a
terminology problem here. 'archive streams', 'backup streams'. No?
Tomorrow I guess I get to work on terminology and see if I can come up
with some definitions that make sense.
--
Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg
|
|
From: Richard F. <rj...@fi...> - 2001-07-05 00:15:13
|
On Mon, 2 Jul 2001, Eric Lee Green wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hmm, seems we have three different sizes to worry about:
>
> 1. Chunk size -- if we are chunking the data into, say, 128K chunks
> 2. Record size -- we have variable sized records within those chunks.
> 3. Tape I/O size -- this may be anything from 512 bytes all the way to
> 128K+ bytes. The only real criteria is that either chunk size is a
> multiple of this, or this is a multiple of chunk size.
>
> Things getting complicated. How to simplify?
We don't worry about Tape IO size. We only deal with Chunk size, which is
both our IO size and the base-level archive structure size. It is
configurable between 16k (?) and 512K (?).
The only thing about the IO size is that since a stream header cannot be
split across IO buffers, it must fit within a buffer. If we set the bottom
end of this to 512-bytes, for example, then a file with a long name, a
resource fork, and ACLs, would not be capable of being backed up, since
its stream header would not fit in an IO buffer. That's why I think a
lower limit of 16k on this guy is appropriate.
The IO size gets stored in the archive header block, so we know how much
data to read/write with each IO operation ( in case we didn't figure it
out while trying to read the header block! ).
We could separate the archive block size and IO buffer size, but I don't
see any real-world advantage to doing so. If we do this, then as you said,
we just need to ensure that either the IO size or the archive block size
is an even multiple of the other.
--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
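The size constraints in the message above can be written down as a small
Python check. The 16k and 512K bounds are the tentative figures from the
message; the function name and its parameters are assumptions for
illustration.

```python
MIN_CHUNK = 16 * 1024   # tentative lower bound from the message
MAX_CHUNK = 512 * 1024  # tentative upper bound from the message


def check_sizes(chunk_size, io_size=None, largest_stream_header=0):
    """Raise ValueError unless the sizes satisfy the constraints discussed above."""
    if io_size is None:
        io_size = chunk_size  # the simple case: chunk size is also the IO size
    if not MIN_CHUNK <= chunk_size <= MAX_CHUNK:
        raise ValueError("chunk size out of range")
    if chunk_size % io_size and io_size % chunk_size:
        raise ValueError("one size must be a multiple of the other")
    if largest_stream_header > io_size:
        raise ValueError("stream header cannot be split across IO buffers")
```

For example, check_sizes(65536, 16384) passes, while check_sizes(65536,
24576) fails because neither size divides the other.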
|
From: Richard F. <rj...@fi...> - 2001-07-05 00:05:00
|
On Mon, 2 Jul 2001, Eric Lee Green wrote:
> I've put the preliminary architecture documentation up at
> http://tapioca.sourceforge.net . That should give you a good idea of how
> this is structured. Aside from the 'pudding', the rest of the architecture
> should be somewhat familiar, except for the parts inflicted by Java (like
> the tape server being multi-threaded rather than multi-process).

Looks good. One question I have, though, is the implementation of the
tapicom protocol. I have just updated my Java reference library, and see
that Java has a built-in RPC mechanism called RMI (Remote Method
Invocation). It's simpler than CORBA or COM, although it only works
between Java objects. It appears that RMI can be easily integrated with
any type of socket, including an SSL socket. Maybe we should consider
using RMI instead?

The advantage here to me is that we don't have to write a lot of code to
format, write, read, and parse data for every operation we perform, while
also trying to figure out the error detection, reporting, and handling
mechanisms. If we want to add a user to the database, we just call
something like:

    result = tapica.exports.NewUser(authObj, userObj)

If the call fails, it raises an exception that can get handled by the
client. Anyway, I think it's worth considering, since it makes the data
and error handling much easier.

> Now it's time to decide on the archive format. We need to get this right
> because changing it after we've started making backups will be a major
> pain in the %$#@@#.

Agreed!!!!

> Some criteria:
>
> 1. Must be able to multiplex (and demultiplex) multiple streams into the
> same tape file. This indicates that the input needs to be blocked in a
> structured way, rather than just being a plain old stream of data, and
> that each block needs to be tagged with a stream ID.

Yep. Some proposed requirements:

1. Archive volumes are written in fixed blocks of a configurable size.
   Minimum IO block size is 16k.
2. Each volume begins with a volume header block that describes the
   volume and all still-active archive streams being multiplexed to that
   archive. Note that this means that additional streams cannot be added
   to an archive that is in progress. However, it doesn't guarantee that
   this specific volume actually contains any data for that archive
   stream. For example, we could say that this archive will consist of 8
   streams, but only 3 can be written concurrently. This would allow the
   other 5 streams to be written as previous streams are completed.
3. Each IO block contains data for exactly one archive stream.
4. Each IO block contains a header that specifies its archive stream ID,
   sequence number, checksum, etc.
5. The payload section of each IO block starts with a byte that indicates
   a structure type: data stream header, data stream resource, data
   stream data, etc. This is followed by the structure of the indicated
   type, and any data it carries.
6. Within each IO block the structures are variable length, ensuring
   efficient space usage and performance.

The downside to variable-length structures and data is handling
corruption. With a fixed-length, BRU-style format, if a file data block
was corrupted, we could (and did!) just write the data section out to the
file, and note the error. On a file header block, we could have just
invented a file name, and written out the appropriate data section. But
with variable-length records, we can't even trust that the encoded length
is correct, and we don't know for certain if there is another stream
header structure in the block, and if so, where. We either have to write
very complex (and error-prone) code to try and make sense of the
corrupted data, or throw the whole block away and move on.

One thing we might want to consider is some kind of ECC encoding for the
structure headers. I don't have any clear idea of how much ECC to do, or
even where it should go in the format. I also don't have any clear idea
of the performance impact of ECC.

> 3. The tape format should not require doing a MT_TELL for every bloody
> block written to tape, only for blocks that actually need it (i.e, blocks
> that contain the beginning of a piece of data logged into the database).
> This tends to indicate that blocks need tagging with a "type" field.

Hmm, how do we handle the catalog rebuild (reading the archive) case? We
don't know *if* we need to do an MT_TELL or not at the time we *need* to
do the MT_TELL, before reading the block. I suppose we could calculate
the tape block size when we open the volume (a=MT_TELL, read_block,
b=MT_TELL, tape_block_size = b-a). Then we could read the block, and if
it contained stream headers, fudge the QFA position.

> 4. The format should be able to handle two things other than raw
> data blocks:
> a) producing location information suitable for logging into the
> central authority's location database for use in future restores,
> and
> b) holding any OS-specific data needed to fully restore the file.

Right. And of course, we also want to be able to restore the data portion
of a stream (file or otherwise) on a different OS.

> 5. The stream format will have to hold data about what kind of writer
> produced the data in the file, so that the file logger can properly
> account for the differences in display format and pass that data upstream
> to the user interface. We don't want to force Unix filename format onto
> Windows or Mac or etc.!

Yep. Yet another lesson we learned! In the archive, paths should probably
be encoded into some platform-independent form, so that they can be
reconstructed for the platform we are restoring to. But we still want an
indicator of the original platform, for catalog and display purposes.

Oh, and let's not forget, the converters from the independent path to the
native path format will need to check for and handle invalid characters
in the path.
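As an illustration of the path handling described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function names, the dictionary layout, the choice of "_" as the replacement character, and the (incomplete) set of characters treated as invalid on Windows. It also drops the leading root, so it only shows the component-wise conversion, not a full path model.

```python
# Hypothetical sketch; none of these names come from the Tapioca sources.

# A subset of the characters Windows forbids in file names (assumption:
# control characters and reserved device names are ignored here).
WINDOWS_INVALID = set('<>:"/\\|?*')
UNIX_INVALID = {"/", "\0"}


def to_portable(native_path, platform):
    """Split a native path into components and remember the origin platform."""
    separator = "\\" if platform == "windows" else "/"
    parts = [part for part in native_path.split(separator) if part]
    return {"origin": platform, "parts": parts}


def to_native(portable, platform):
    """Rebuild a path for the target platform, replacing forbidden characters."""
    if platform == "windows":
        separator, invalid = "\\", WINDOWS_INVALID
    else:
        separator, invalid = "/", UNIX_INVALID
    safe_parts = [
        "".join("_" if char in invalid else char for char in part)
        for part in portable["parts"]
    ]
    return separator.join(safe_parts)


portable = to_portable("/home/rj/notes:draft.txt", "unix")
windows_path = to_native(portable, "windows")
```

The "origin" field is kept only for catalog and display purposes, as suggested in the message; the conversion itself depends solely on the target platform.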
> Similarly, if we're backing up a database file
> dump stream (one possible data source) we don't want to have to pretend
> that it contains Unix-structured data, and we need to know it came from
> a database stream dumper rather than from a filesystem dumper, so that
> when we go to restore it we know what restorer to use!

Yep. I'm finding it useful to think about streams as having 'names', not
paths.

> So each type of data stream creator will need a unique creator ID of
> some sort to tell us what kind of widget created the data stream, and this
> gets put into the header so that we can grab it and know what to restore
> this data stream with.

I've been thinking a lot about this, and am having difficulty. On the one
hand, we want a general ID that indicates very generally what this stream
is (directory, file, pipe data, database, etc), and a general way of
accessing it for cross-platform support. On the other hand, we also want
to be able to identify a file as coming from an ext2 filesystem, so we
can back up and restore the extended ext2 bits. In other cases, platform-
and filesystem-specific ACLs need to be handled.

So it looks like we need at least 3 different indicator IDs for stream
headers. The first (1 byte?) indicates the type of stream (directory,
file, pipe output, command output, etc). The second (1 byte?) indicates
the platform of the given type (Windows, Mac, Unix for directories and
files; Oracle or MySQL for database streams, etc). The third (2 bytes?)
further classifies the type of stream based upon its original writer
object.

This way, if we are running on an NT system where we don't have the Unix
file object class, we can use the more general file object class to
process the data. In other words, the low-level specific file object
readers and writers don't have to worry about whether the host platform
supports them or not. They will only be available on the platforms that
support them. They also don't have to worry about processing data
produced by other file objects, since they would only get data that they
(or their subclasses) created.

File Stream Object class hierarchy:

    GenericFile - Available everywhere. Processes all types of file data
    |-- NTFile - Available on NT. Processes file data for all NT filesystems
    |   |-NTFSFile - Available on NT, when backing up or restoring to
    |                NTFS filesystems
    |
    |-- UnixFile - ....

> 6. For volume changes, the full header information should be replicated
> on the new volume, along with what volume we're working on etc. so that
> if we have a tape that is a volume 2, we have more of a chance of
> associating it with the correct volume 1 if we have to do this by
> hand.
>
> 7. Fixed-size blocks, or variable-sized blocks? Fixed-sized blocks, like
> 'tar' uses, are easy to deal with, and can be easily packed into
> larger buffers (as long as said larger buffers are a multiple of the
> blocksize in length). However, each block adds overhead. If the
> block size is too small, overhead becomes too much of a percentage of
> the block. If the block size is too large, then we have too much
> wasted space at the end of the block.
>
> Variable-sized blocks could be used, but we could require that these
> be packed into a fixed-size buffer of some large size (perhaps
> 64K or 128K) such that each buffer begins with a block and no block
> spans buffers. This is a pain, but results in less wasted space and
> thus better performance in the end. Note that if we limit the
> variable-sized blocks to 32k in size, we can represent the size of the
> block with only 2 bytes in the block's header.

I think variable sized works best, as we don't worry about wasting space,
which is really the biggest overhead in fixed-size blocks. And 2 bytes
gives us a maximum 'chunk' size of 64k.

Note that we can pack variable-sized chunks into a fixed-size IO block.
For example, assume the following 128k fixed IO blocks (these sizes are
arbitrary):

    IO header - 32-bytes
    Stream header(1) - 196-bytes (encodes name length, resource length,
                       and data lengths, etc)
      - Stream name - 34-bytes
      - Stream resource - 132-bytes
    Stream data(1) - 2345-bytes
      - header 12-bytes
      - data 2333-bytes
    Stream header(2) - 96-bytes
      - name - 42-bytes
      - resource - 23-bytes
    Stream data(2) - 48401-bytes
    .....
    Stream data(4) - 836-bytes
      - header 12-bytes
      - data 824-bytes
    # END OF BLOCK 2 here, stream 4 not finished
    # Start of BLOCK 3,
    IO Header - 32 bytes
    Stream data(4) - 65504-bytes
      - header 12-bytes
      - data - 65492-bytes
    Stream data(4) - 1804-bytes   # last block of stream 4
      - header 12-bytes
      - data - 1792-bytes
    ....

Think about the archive API like a filesystem streams API. The file
object just reads/writes data, and doesn't worry about how that gets
stored on the media. This doesn't really cause any API problems. In fact,
it ensures that the only thing that can know about the archive format is
the archive object. We just need to figure out whether the archive object
processes stream objects, or if the stream objects process themselves via
the archive object.

> 8. Checksumming streams: We should probably only worry about checksumming
> buffer-sized chunks of data, not individual blocks of structured data.
> Setup time for the CRC calculations can thus be reduced, as can the
> overhead of the CRC checksum itself.

Yes. This should be done on an IO block, and stored in the IO block
header. It is useful to be able to tell the other objects whether or not
the block this data came from validated, so they know how much trust to
put into it, but we don't want to have several checksums floating around.

>
> 9. I think Mr. Fish mentioned that we probably want an "end of file" block
> in file streams so that we know we have reached the end of a file. This
> simplifies some programming, I guess. Did I misread the message?

I don't think that's required. We can flag a file EOF when we see a new
stream header. It also means we can pack the stream data all the way to
the end of the IO block. If we want, we can just add a flags byte to the
stream data structure that indicates that "this is the last data block".

> Okay, I think this is enough to think about. I am especially curious to
> know what you think about the notion of putting variable-sized blocks into
> bigger buffer-sized blocks. I think this solves many problems (we never
> really know how much OS-specific data is going to be in file headers, for
> example), but is somewhat more complex than fixed-size blocks like 'tar',
> and yes, there is still some overhead in some cases (if we don't have
> enough space at the end of a buffer for a block, that space is wasted).

I like it. We actually waste less space than the fixed-chunk case. BRU
wasted an average of 512-bytes per file for small files, and ~850 bytes
per file for larger files. Packing multiple things into a fixed IO block
prevents this.

There will be some space wasted: for example, we can't (more
specifically, don't want to!) split a stream header across an IO buffer
boundary. But if we are writing stream data, we can adjust/buffer the
data to exactly fill the buffer, and then just start the next IO buffer
with the leftover stream data (and a new stream data header indicating
the size).

> Comments?

We need some new terms. We are using 'streams' in the sense of archive
streams and multiplexing, and 'streams' in the sense of data streams to
be archived. How about calling them something like d-streams and
a-streams? A d-stream is a stream of data to be archived. An a-stream is
an archive stream, and contains one or more d-streams. An archive is made
up of one or more a-streams.

--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
|
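The packing scheme discussed in this message (variable-sized records inside a fixed-size IO block, one checksum per IO block, stream data split to fill the block, stream headers never split) can be sketched in Python as follows. The field layout is an assumption chosen for the example, not a proposed format: a 32-byte IO header holding a-stream ID, sequence number, CRC32 and bytes used, and a 3-byte record header holding a type byte and a 2-byte length. All names are hypothetical.

```python
import struct
import zlib

# Assumed layouts (illustrative only, not the project's format).
IO_HEADER = struct.Struct(">IIII16x")  # a-stream id, sequence, CRC32, bytes used
REC_HEADER = struct.Struct(">BH")      # structure type, payload length (<= 65535)
TYPE_STREAM_HEADER, TYPE_STREAM_DATA = 1, 2


def pack_block(stream_id, seq, records, block_size):
    """Fill one IO block from (type, bytes) records; return (block, leftover)."""
    room = block_size - IO_HEADER.size
    payload = bytearray()
    pending = list(records)
    while pending:
        rtype, data = pending[0]
        free = room - len(payload) - REC_HEADER.size
        if free <= 0:
            break
        if len(data) > free or len(data) > 0xFFFF:
            if rtype != TYPE_STREAM_DATA:
                break  # never split a stream header across IO blocks
            cut = min(free, 0xFFFF)
            pending[0] = (rtype, data[cut:])  # remainder goes to the next block
            data = data[:cut]
        else:
            pending.pop(0)
        payload += REC_HEADER.pack(rtype, len(data)) + data
    used = len(payload)
    payload += bytes(room - used)  # zero padding up to the fixed block size
    header = IO_HEADER.pack(stream_id, seq, zlib.crc32(bytes(payload)), used)
    return header + bytes(payload), pending


def unpack_block(block):
    """Return (stream_id, seq, checksum_ok, records) for one IO block."""
    stream_id, seq, crc, used = IO_HEADER.unpack_from(block)
    payload = block[IO_HEADER.size:]
    checksum_ok = zlib.crc32(payload) == crc
    records, pos = [], 0
    while pos < used:
        rtype, length = REC_HEADER.unpack_from(payload, pos)
        pos += REC_HEADER.size
        records.append((rtype, payload[pos:pos + length]))
        pos += length
    return stream_id, seq, checksum_ok, records
```

A caller would invoke pack_block repeatedly, passing the leftover records back in with the next sequence number. Returning checksum_ok alongside the records mirrors the idea of telling higher-level objects how much to trust the data rather than silently discarding the block.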
|
From: Eric L. G. <er...@ba...> - 2001-07-03 04:37:19
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hmm, seems we have three different sizes to worry about:
1. Chunk size -- if we are chunking the data into, say, 128K chunks
2. Record size -- we have variable sized records within those chunks.
3. Tape I/O size -- this may be anything from 512 bytes all the way to
128K+ bytes. The only real criteria is that either chunk size is a
multiple of this, or this is a multiple of chunk size.
Things getting complicated. How to simplify?
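The one constraint stated above, that either the chunk size is a multiple of the tape I/O size or the reverse, is easy to check mechanically. A minimal Python sketch, with a hypothetical function name:

```python
def sizes_compatible(chunk_size, tape_io_size):
    """True if one size is a whole multiple of the other (both in bytes)."""
    if chunk_size <= 0 or tape_io_size <= 0:
        raise ValueError("sizes must be positive")
    return chunk_size % tape_io_size == 0 or tape_io_size % chunk_size == 0


# 128K chunks on a drive doing 512-byte I/O: each chunk is 256 tape I/Os.
small_io_ok = sizes_compatible(128 * 1024, 512)
# 64K chunks on a drive doing 256K I/O: each tape I/O carries 4 chunks.
large_io_ok = sizes_compatible(64 * 1024, 256 * 1024)
# 128K chunks with 96K tape I/O: neither divides the other.
mismatch_ok = sizes_compatible(128 * 1024, 96 * 1024)
```

Record size does not enter the check, since records are packed inside chunks and never span them.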
- --
Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7QUmi3DrrK1kMA04RAqPXAJ4vsT44g9JTDaSZeHwB2BUpkMxyaACfRd48
SHVf0hAGMwX2pjopp/ZhbyM=
=fWMx
-----END PGP SIGNATURE-----
|
|
From: Eric L. G. <er...@ba...> - 2001-07-03 02:31:24
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I've put the preliminary architecture documentation up at
http://tapioca.sourceforge.net . That should give you a good idea of how
this is structured. Aside from the 'pudding', the rest of the architecture
should be somewhat familiar, except for the parts inflicted by Java (like
the tape server being multi-threaded rather than multi-process).

Now it's time to decide on the archive format. We need to get this right
because changing it after we've started making backups will be a major
pain in the %$#@@#.

Some criteria:

1. Must be able to multiplex (and demultiplex) multiple streams into the
same tape file. This indicates that the input needs to be blocked in a
structured way, rather than just being a plain old stream of data, and
that each block needs to be tagged with a stream ID.

2. Archives should have a header that includes the stream ID's (and
descriptive labels) of every stream multiplexed into them, otherwise
doing a manual restore of a multiplexed archive (e.g. to restore the
central authority if it crashes) will be horribly difficult. When
restoring by hand, a stream ID (or rather index -- e.g. we can tell it
"--stream=1" and it knows to only dump stuff labeled with stream
id=a00523134.342) can be used and it'll strip out everything except that
stream on the restore.

3. The tape format should not require doing a MT_TELL for every bloody
block written to tape, only for blocks that actually need it (i.e, blocks
that contain the beginning of a piece of data logged into the database).
This tends to indicate that blocks need tagging with a "type" field.

4. The format should be able to handle two things other than raw
data blocks:
a) producing location information suitable for logging into the
central authority's location database for use in future restores,
and
b) holding any OS-specific data needed to fully restore the file.

5. The stream format will have to hold data about what kind of writer
produced the data in the file, so that the file logger can properly
account for the differences in display format and pass that data upstream
to the user interface. We don't want to force Unix filename format onto
Windows or Mac or etc.! Similarly, if we're backing up a database file
dump stream (one possible data source) we don't want to have to pretend
that it contains Unix-structured data, and we need to know it came from
a database stream dumper rather than from a filesystem dumper, so that
when we go to restore it we know what restorer to use!
So each type of data stream creator will need a unique creator ID of
some sort to tell us what kind of widget created the data stream, and this
gets put into the header so that we can grab it and know what to restore
this data stream with.

6. For volume changes, the full header information should be replicated
on the new volume, along with what volume we're working on etc. so that
if we have a tape that is a volume 2, we have more of a chance of
associating it with the correct volume 1 if we have to do this by
hand.

7. Fixed-size blocks, or variable-sized blocks? Fixed-sized blocks, like
'tar' uses, are easy to deal with, and can be easily packed into
larger buffers (as long as said larger buffers are a multiple of the
blocksize in length). However, each block adds overhead. If the
block size is too small, overhead becomes too much of a percentage of
the block. If the block size is too large, then we have too much
wasted space at the end of the block.

Variable-sized blocks could be used, but we could require that these
be packed into a fixed-size buffer of some large size (perhaps
64K or 128K) such that each buffer begins with a block and no block
spans buffers. This is a pain, but results in less wasted space and
thus better performance in the end. Note that if we limit the
variable-sized blocks to 32k in size, we can represent the size of the
block with only 2 bytes in the block's header.

8. Checksumming streams: We should probably only worry about checksumming
buffer-sized chunks of data, not individual blocks of structured data.
Setup time for the CRC calculations can thus be reduced, as can the
overhead of the CRC checksum itself.

9. I think Mr. Fish mentioned that we probably want an "end of file" block
in file streams so that we know we have reached the end of a file. This
simplifies some programming, I guess. Did I misread the message?

Okay, I think this is enough to think about. I am especially curious to
know what you think about the notion of putting variable-sized blocks into
bigger buffer-sized blocks. I think this solves many problems (we never
really know how much OS-specific data is going to be in file headers, for
example), but is somewhat more complex than fixed-size blocks like 'tar',
and yes, there is still some overhead in some cases (if we don't have
enough space at the end of a buffer for a block, that space is wasted).

Comments? Once we have tossed around the criteria, we can come up with
some possible tape layouts, and then I'll be happy to write a spec for it
and put it into the CVS archive. I'm currently working on a spec/RFC type
template that we can use for that (it's not in the CVS archive yet
though).
- --
Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7QSwj3DrrK1kMA04RAvIvAJ0bz1uc+MWd8fHLd4BGJngMl8lA7QCeNb8L
NT6WVcEtcAOYFbasYOEFCcU=
=2S0P
-----END PGP SIGNATURE-----
|
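Criteria 1 and 2 in the message above (blocks tagged with a stream ID, plus an archive header listing every stream so that a manual restore can select one by index) can be sketched in Python as follows. The header layout, the function name, the labels, and the second stream ID are invented for the example; only the ID a00523134.342 and the "--stream=1" style of selection appear in the message.

```python
# Hypothetical sketch of a by-hand restore of one multiplexed stream.

def demultiplex(archive_header, blocks, stream_index):
    """Return the payloads of one stream, chosen by 1-based index."""
    stream_id = archive_header["streams"][stream_index - 1]["id"]
    return [payload for block_stream_id, payload in blocks
            if block_stream_id == stream_id]


# Archive header: stream IDs and descriptive labels (labels are made up).
archive_header = {
    "streams": [
        {"id": "a00523134.342", "label": "central authority database"},
        {"id": "a00523134.343", "label": "home directories"},
    ]
}

# Blocks as read from tape, each tagged with the ID of its stream.
blocks = [
    ("a00523134.342", b"db-1"),
    ("a00523134.343", b"home-1"),
    ("a00523134.342", b"db-2"),
]

# Equivalent of "--stream=1": keep only the first stream's data, in order.
restored = b"".join(demultiplex(archive_header, blocks, 1))
```

Because the index is resolved through the header, the person doing the restore never has to type the long stream ID, which seems to be the point of replicating the header on every volume.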