Re: [Queue-developers] new design details

Programming is as much an art form as a
science, and it's usually best to use the
modern techniques that everyone else is using. This
way, you can build on what other people in the
community are doing and develop synergy with other
projects. GQ was modern for its time, but a lot has
happened since it came out.

1.

In certain modern environments (Java or C#/Mono),
using SOAP/XML actually simplifies things. The
complexity gets shifted to the libraries.
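
As a rough illustration of the complexity moving into the library, here
is a minimal sketch in Python. It uses XML-RPC from the standard library
as a stand-in for SOAP (a simpler XML-based RPC format, not SOAP itself).
The method name gq.get_load, the host name, and the reply fields are all
hypothetical; GQ defines none of them.

```python
import xmlrpc.client

# Hypothetical method name and argument; GQ defines neither.
request_xml = xmlrpc.client.dumps(
    ("node17.example.org",), methodname="gq.get_load"
)

# The receiving side recovers the call without any hand-written parsing.
params, method = xmlrpc.client.loads(request_xml)

# A reply is encoded the same way; methodresponse=True wraps one value.
reply_xml = xmlrpc.client.dumps(
    ({"load_one": 0.42, "jobs": 3},), methodresponse=True
)
reply, _ = xmlrpc.client.loads(reply_xml)

print(method, params[0])
print(reply[0])
```

The application code never touches angle brackets; the wire format is
entirely the library's problem.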

Moreover, it lets GQ adopt a "standard" that works
together with other environments, which opens the
possibility of easily making GQ work in
heterogeneous environments, something that users
have often requested.

Check out Ganglia, another open source project,
originally developed at UCB.
http://ganglia.sourceforge.net/

This is becoming one of the standards for determining
load averages on remote machines. Meta-clustering
systems, such as APST, recognize this as one of the
protocols for querying load averages and other
information.

So, it would be good if GQ supported a standard
monitoring protocol like Ganglia. You'd have to run
Ganglia somewhere anyway if you wanted to use a meta
system like APST, so GQ could feed off that, rather
than, or in addition to, its own load monitoring code.
Or, GQ could continue to use its own monitoring code,
but also support the Ganglia protocol (it's open
source, after all), so there wouldn't be a need to run
two load monitoring daemons.

Ganglia exchanges information using XML, of
course.
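
To give a feel for what feeding off Ganglia might look like, here is a
minimal sketch in Python that pulls one-minute load averages out of a
Ganglia-style XML document. The sample document is hand-written and
heavily abridged, and the element and attribute names (HOST, METRIC,
NAME, VAL, load_one) are written from memory of gmond's output, so
treat them as assumptions to check against a real gmond dump.

```python
import xml.etree.ElementTree as ET

# Hand-written, abridged sample; not captured from a real gmond.
SAMPLE = """<GANGLIA_XML>
  <CLUSTER NAME="demo">
    <HOST NAME="node1">
      <METRIC NAME="load_one" VAL="0.25"/>
      <METRIC NAME="cpu_num" VAL="2"/>
    </HOST>
    <HOST NAME="node2">
      <METRIC NAME="load_one" VAL="1.75"/>
      <METRIC NAME="cpu_num" VAL="4"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def load_by_host(xml_text, metric="load_one"):
    """Return {host name: metric value} for every HOST in the document."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        for m in host.iter("METRIC"):
            if m.get("NAME") == metric:
                result[host.get("NAME")] = float(m.get("VAL"))
    return result


loads = load_by_host(SAMPLE)
least_loaded = min(loads, key=loads.get)  # candidate host for the next job
print(loads, least_loaded)
```

A scheduler could use a table like this in place of, or alongside, its
own load probes.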

2.

Regarding your concerns about SQL scalability: a lot
of work has gone into making SQL environments highly
scalable. It's a huge issue for corporations and
anyone trying to run a high-traffic, mission-critical
website (been there). Remember the old commercials for
a certain computer services firm about the startup
that didn't consult with them, and therefore its
website wasn't scalable? MySQL (which RMS wouldn't
want us to mention here because it is semi-commercial;
he wants us to say 'PostgreSQL' because that's
completely free) may not be one of them, but the
two major commercial SQL databases are designed to be
highly scalable in a cluster environment (this is why
one pays big bucks in support and fees for them
instead of MySQL).

The other solution, commonly used in the website world
(J2EE and .NET/Mono), involves so-called 3-tiered
architectures.

Basically, the idea is to have another set of threads,
which can potentially run on another CPU or another
machine, handle all the actual communication with
the client (web browser), cache some of the SQL queries
(pre-compilation), and, in some cases, even cache the
results when possible. This takes a significant load
off the back-end SQL database, which can now handle
thousands more clients.

This approach would work with a GQ manager by
creating an intermediary gm daemon that would reply to
a large number of clients by caching and periodically
refreshing the results of a small number of SQL
queries.
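
A minimal sketch of that caching idea, in Python, with an in-memory
SQLite database standing in for the real back end. The CachedQuery
class, the jobs table, and the five-second refresh interval are all
hypothetical, chosen only to show the mechanism.

```python
import sqlite3
import time


class CachedQuery:
    """Serve many readers from one periodically refreshed SQL result."""

    def __init__(self, conn, sql, ttl_seconds, clock=time.monotonic):
        self.conn = conn
        self.sql = sql
        self.ttl = ttl_seconds
        self.clock = clock
        self.rows = None
        self.fetched_at = None
        self.db_hits = 0  # how often the database was actually queried

    def get(self):
        now = self.clock()
        # Go back to the database only when the cached copy is stale.
        if self.rows is None or now - self.fetched_at >= self.ttl:
            self.rows = self.conn.execute(self.sql).fetchall()
            self.fetched_at = now
            self.db_hits += 1
        return self.rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO jobs (state) VALUES (?)",
    [("queued",), ("queued",), ("running",)],
)

pending = CachedQuery(
    conn, "SELECT COUNT(*) FROM jobs WHERE state = 'queued'", ttl_seconds=5.0
)

# A thousand client requests arrive; the database sees only the first.
answers = [pending.get()[0][0] for _ in range(1000)]
print(answers[0], pending.db_hits)
```

The trade-off is staleness: clients may see an answer up to one refresh
interval old, which is usually acceptable for load and queue-length
queries.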

SQL databases are at the leading edge of scalability
technology (although often in the commercial rather
than open source worlds) and have other benefits
(standardization, so other clients can interact, and
existing SQL database management tools can be used).

Still, I'll let you decide how you think it's best to
do this.

--- Koni <mh...@co...> wrote:

> On Tue, 2005-05-10 at 08:56 -0700, wernerkrebs
> wrote:
> > 
> > Two comments.
> > 
> > 1. Regarding the protocol, GQ's protocols largely
> > predated modern RPC standards, such as SOAP and
> XML. 
> > 
> 
> I'm not sure any of these things are worth their
> weight in a homogeneous
> system. The communication between the GQ system as I
> have envisioned it
> is pretty lightweight and there is very little
> structure to the
> information. In this case, I think using XML or SOAP
> for a communication
> layer adds complexity (in my mind) which is contrary
> to their purpose in
> general. 
> 
> [snip]
> 
> > I would think some of the current features of the
> GQ
> > TCP/IP protocol would be best done using some sort
> of
> > SOAP implementation. For example, aspects of the
> > initial authentication, and querying load
> information
> > would be best done using SOAP.
> > 
> 
> I don't think SOAP will do much for us regarding
> authentication. The
> authentication stuff here is really simple (to me).
> Perhaps for load
> information if a lot of detail is returned (like all
> the information ps
> would return say). As for authentication, it's
> already implemented as a
> simple challenge handshake (initial authentication):
> 
> qd                              qm
> 
> auth/register request
> (send nonce)          -------->
> 
>                                 sign nonce with system key,
>                       <-------  reply with our own nonce
> 
> verify response       -------->
> sign qm nonce
> 
>                       <--------  verify response,
>                                  send session key
> 
> 
> If either verification fails, the offended party
> stops the protocol.
> Receipt of the session key indicates to qd that the
> challenge handshake
> protocol completed successfully. After that, all
> communication between
> the qd and qm come with simple signatures using that
> key. The complexity
> of the generation of signatures and verification of
> them is already more
> or less isolated from the logic of handling the
> message payload. 
> 
> 
> 
> > Also, since GQ was written, standard protocols for
> > this type of thing have emerged. Look at
> Apst/Apstd
> > system at SDSC (where, ironically, I used to work,
> > although not on that project):
> > 
> > http://grail.sdsc.edu/projects/apst/
> > 
> > Apst is a meta demon for cluster demons. It
> doesn't
> > currently support starting jobs using GQ, but does
> > support starting other (commercial) systems. GQ
> > support would be fairly trivial for them to add,
> if
> > they wanted to. SDSC (part of UCSD) receives grant
> > money from a firm that makes a GQ-like commercial
> > product, so it's not clear if that's a direction
> they
> > want to go in. They do support the commercial
> product.
> > However, the source code is available, so the
> > community is free to add support for GQ as well.
> > 
> > Apst will query each cluster manager (this would
> > be similar to the qm program you are proposing) and
> > obtain load information via an XML file returned
> from
> > the cluster manager. It will then decide how many
> jobs
> > to start on that particular cluster (which it will
> > start using a crude ssh command-line protocol to
> > submit the jobs and scp to first transfer the
> relevant
> > files into place). It's up to the cluster manager
> to
> > then distribute the jobs to the cluster nodes.
> > 
> > Apst, which is C/C++ based (Apstd is available in
> > Java) is similar to Nimrod, which is Java-based.
> > Source code for all of these is available.
> 
> This sounds interesting. It would be great for GQ,
> whether GQ becomes my
> new proposed implementation, remains as is, or
> something else
> altogether, contributing a "driver" (so to speak) so
> that this meta
> system can work with it would be cool and perhaps
> broaden the market for
> us.
> 
> > 
> > 2. Regarding qm, a division of Texas
> Instruments
> > actually contributed a SQL-based qm in C++. (It
> would
> > require that an SQL database, preferably Open
> Source
> > and free such as Postgresql, be running on a
> server).
> > 
> 
> Cool. I was first thinking about job information
> being managed by a
> mysql (or postgres) backend, where the SQL engine
> would handle things
> like atomicity and persistent state information
> across failure. Would
> have been cake if I wrote qm in perl (I am very
> familiar with Perl-DBI).
> The only thing I don't like about this is the
> potential high-latency --
> one (or more) threads insert to the job table (qs)
> while another
> thread polls (qm) the table for new rows. Perhaps in
> postgres there is a
> way to install a trigger or something so polling is
> unnecessary. I don't
> think there is a way to do that in mysql. qm is
> actually unnecessary if
> qd's can talk to the SQL engine directly. SQL can
> handle authentication
> and atomicity and qd's can just compete for jobs.
> That's kind of nice.
> Not sure it will scale well though. 1000 qd's each
> with persistent TCP
> connection to mysql would create 1000 forked
> processes at the database
> server. 
> 
> 
> > This is part of the GQ distribution, but is
> optional
> > and not compiled by default (due to C++ autoconf
> > problems at the time since resolved. Also, users
> wrote
> > to me explaining their preference for a small,
> simple
> > package with peer-to-peer behavior, rather than a
> > centralized package with a manager that might
> crash,
> > so the original behavior of GQ remained the
> default.)
> > 
> > Before writing a manager from scratch, you might
> > want to look at the manager code and documentation
> > that TI's subsidiary contributed.
> 
> OK, I'll try to have a look. The manager is almost
> already all written
> though in my haste to flesh out ideas rolling around
> in my head. I shall
> post a tarball of the code shortly. I want to add at
> least a rudimentary
> support for actually submitting a job to the system
> and having it
> execute. While I'm doing that, we can get a better
> feel for who is out
> there reading this list and what interest there is. 
> 
> Thanks for your comments Werner, I appreciate your
> insights greatly.
> 
> Cheers,
> Koni
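
For reference, the challenge handshake in the quoted message (qd sends
a nonce, qm signs it and replies with its own nonce, qd signs that, qm
sends a session key) can be sketched in Python roughly as follows. This
is not the existing implementation. HMAC-SHA256 is an assumption
standing in for "sign", since the message does not name an algorithm,
and both sides run in one process, so the session key is simply
returned; how it is protected on the wire is not covered here. The
function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(key, message):
    # Assumption: HMAC-SHA256 plays the role of "sign with system key".
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, signature):
    return hmac.compare_digest(sign(key, message), signature)


def handshake(qd_key, qm_key):
    """Walk through the four messages; return a session key or None."""
    # 1. qd -> qm: auth/register request carrying qd's nonce.
    qd_nonce = secrets.token_bytes(16)

    # 2. qm -> qd: qd's nonce signed with the system key, plus qm's nonce.
    qm_sig = sign(qm_key, qd_nonce)
    qm_nonce = secrets.token_bytes(16)

    # 3. qd verifies the response, then signs qm's nonce.
    if not verify(qd_key, qd_nonce, qm_sig):
        return None  # qd stops the protocol
    qd_sig = sign(qd_key, qm_nonce)

    # 4. qm verifies the response, then sends the session key.
    if not verify(qm_key, qm_nonce, qd_sig):
        return None  # qm stops the protocol
    return secrets.token_bytes(32)


system_key = secrets.token_bytes(32)
session_key = handshake(system_key, system_key)

# After the handshake, each message carries a signature under the session key.
message = b"load 0.42"
tag = sign(session_key, message)
print(verify(session_key, message, tag))
```

If the two sides hold different system keys, the first verification
fails and handshake returns None, matching the "offended party stops
the protocol" rule.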
> 
> 
> 