[Openqueue-development] Re: OpenQueue

Paul, this is good feedback.  All my comments below are IMHO, of course.

-Matt Jensen

On Mon, 1 May 2000, Paul Lindsay Matthews wrote:

>I am also working on a queuing system (called gnuq). It's
>not on SourceForge yet as work is still occurring on the
>protocol and behavioural specifications, as well as the
>back end parts of the system.

Excellent, Paul.  I look forward to hearing more about it.  Similarly, let
me note that OpenQueue is not set in stone, and one reason we announced it
now is to encourage feedback like yours.

>Please don't take this the wrong way but parts of the
>technical document seem quite naive and repeatedly stress
>the use of inappropriate technology.

>If you wish, I can send you a copy of both "MQSeries:
>An Introduction to Messaging and Queuing" and "MQSeries:
>Application Programming Guide" as acrobat (.pdf) documents.
>These should put you on the right track.  Just drop me a
>line if you can't get them off the IBM web site yourself.

Thanks, I've used MQSeries, MSMQ, and several JMS implementations.  The
motivation of OpenQueue was to provide an Open Source solution of the same
caliber as these commercial products (the same general goal you have with
gnuq, I expect).

>A message and queuing system works mostly around guaranteed
>delivery of a request, and the guaranteed delivery of
>responses. You have got to think about banks and ATMs,
>not NNTP servers!

Agreed.  That's why OpenQueue uses a two-phase commit to guarantee
delivery.  After you get sent an "UPDATE" message, break your connection
to the server instead of sending an "ACK".  You'll find upon reconnecting
that those unacknowledged messages are still there, waiting to be resent
in a new transaction.
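
To make that redelivery behaviour concrete, here is a minimal Python
sketch. It is not the OpenQueue wire protocol or the Passamaquoddy code;
the class and method names (SketchQueue, update, ack, disconnect) are
invented for illustration, and the queue lives in memory only. It shows
just the one idea: a message is discarded when the ACK arrives, not when
it is sent.

```python
from collections import deque


class SketchQueue:
    """Toy in-memory queue: a message is dropped only after the client ACKs it."""

    def __init__(self):
        self.pending = deque()  # messages waiting to be delivered
        self.in_flight = {}     # connection id -> messages sent but not yet ACKed

    def send(self, message):
        self.pending.append(message)

    def update(self, conn_id, max_messages=10):
        """Deliver a batch and remember it as unacknowledged for this connection."""
        batch = []
        while self.pending and len(batch) < max_messages:
            batch.append(self.pending.popleft())
        self.in_flight.setdefault(conn_id, []).extend(batch)
        return batch

    def ack(self, conn_id):
        """Client confirmed receipt: only now are the messages discarded."""
        self.in_flight.pop(conn_id, None)

    def disconnect(self, conn_id):
        """Connection dropped before ACK: requeue the messages in their original order."""
        unacked = self.in_flight.pop(conn_id, [])
        self.pending.extendleft(reversed(unacked))


queue = SketchQueue()
queue.send("msg-1")
queue.send("msg-2")

first_try = queue.update("conn-A")   # server sends the batch
queue.disconnect("conn-A")           # client breaks the connection without an ACK
second_try = queue.update("conn-B")  # on reconnecting, the same messages are resent
queue.ack("conn-B")                  # this time the client acknowledges
print(first_try, second_try, list(queue.pending))
```

After the final ack() the queue is empty; before it, the two messages
were never lost, only moved between "pending" and "in flight".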

>Why limit the sending of data to MIME types? What is the
>MIME type of an EFPOS transaction, or a single line from
>your firewall log? Your message and queuing system should
>not care what is being sent through it, only that it arrives
>reliably, correctly and quickly. For example MQSeries sees
>all its data as opaque blobs.

You can send any type of data you want.  If it's a well-defined MIME type
such as a GIF file or XML, then the MIME type will describe that to
clients and make it easier for them to use.  But if it's your own data
format, why not make the MIME type "application/octet-stream"?  Then just
stream the bytes after the message headers.  That's what I'd do for your
EFPOS (point of sale) message, but a line from the firewall log might just
be sent as simple text.

In short, you can send opaque blobs in OpenQueue, but there's also a
method of describing the data type if desired.  And sending XML messages
offers even more benefits, if you choose to do so.
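
As a rough illustration of "headers, then bytes", here is a Python
sketch. The header names (Content-Type, Content-Length) and the CRLF
framing are borrowed from HTTP as an assumption; they are not taken from
the OpenQueue specification, and the helper names and sample payloads
are made up.

```python
def frame_message(body: bytes, content_type: str = "application/octet-stream") -> bytes:
    """Prefix an opaque body with text headers that describe it (hypothetical framing)."""
    headers = (
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body


def parse_message(raw: bytes):
    """Split a framed message back into a header dict and the untouched body bytes."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers["content-length"])
    return headers, rest[:length]


# An opaque blob (stand-in for an EFPOS record); the bytes are arbitrary.
efpos_blob = bytes([0x00, 0xFF, 0x10, 0x0D, 0x0A, 0x0D, 0x0A, 0x7F])
blob_headers, blob_body = parse_message(frame_message(efpos_blob))

# A made-up firewall log line, sent as simple text.
log_line = "DROP tcp 10.0.0.5:1234 -> 10.0.0.1:23".encode("ascii")
log_headers, log_body = parse_message(frame_message(log_line, "text/plain"))

print(blob_headers["content-type"], blob_body == efpos_blob)
print(log_headers["content-type"], log_body.decode("ascii"))
```

The queue never looks inside the body. The length header tells the
receiver where the body ends, so the blob may contain any byte values,
including CRLF sequences.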

>The designers of the HTTP protocol did not know what they
>were doing, and as such there is no reason to follow their
>bad example.
>
>One particular instance of this is the use of a text based
>protocol. The first problem with text based protocols is
>defining exactly what you mean by text. Is it 7bit US ASCII,
>or is it 7bit UK ASCII, is it 8bit ISOLatin1, what happens
>if I pass you a UTF-8 encoded string, if you handle UTF-8
>do you handle just 16 bit Unicode or do you handle 32 bit
>Unicode? Do you realise that some keyboards have neither
>a ! nor a # sign on them?

[First, if those are normal user keyboards, then maybe OpenQueue should
change the characters in use.  But if you're talking about ATMs or
pagers, I'm not concerned that a user can't write the protocol out
directly, because those devices will have client software to do that.
The point of a text-based protocol is more that a programmer can enter the
protocol from their own (normal) keyboard, as well as generate textual
tests and tools easily.]

The original HTTP got some things wrong, but one thing I strongly feel
they got *right* was that they made it a text-based protocol.  Or as I
prefer to call it, a human-readable protocol.  This is a primary lesson of
Internet protocols in use: those that are human-readable (HTTP, SMTP,
POP3, IMAP, NNTP, and formats such as XML) are more likely to be
implemented (with many independent implementations), more likely to have
supporting utilities and code, and are easier to debug, than binary
formats.

For example, the fact that XML is human-readable is one of its biggest
selling points.  People can very quickly write parsers, data generators,
wrappers to SQL record sets, etc.  If XML were a binary meta-format
instead of a human-readable meta-format, it would be going nowhere.

Character sets can be defined in the headers, and IETF requires new
protocols to make provisions to support internationalization, etc.  I
don't see this as something that is stopping development of new
human-readable protocols.
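
One way this can work in practice, sketched in Python: the sender names
the character set in a MIME-style Content-Type header, and the receiver
uses it to decode the body. The header layout is an assumption borrowed
from MIME, not something the OpenQueue documents specify, and
decode_text_body is a hypothetical helper. The parsing itself is done by
Python's standard email package.

```python
from email.message import Message


def decode_text_body(content_type: str, body: bytes, default: str = "us-ascii") -> str:
    """Decode body bytes using the charset parameter of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset parameter, or the default if absent.
    charset = msg.get_content_charset(default)
    return body.decode(charset)


utf8_text = decode_text_body("text/plain; charset=utf-8", "na\u00efve".encode("utf-8"))
latin1_text = decode_text_body('text/plain; charset="ISO-8859-1"', b"caf\xe9")
plain_text = decode_text_body("text/plain", b"hello")
print(utf8_text, latin1_text, plain_text)
```

The same bytes would decode differently under a different charset, which
is why the header, rather than a guess, should decide.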

There are levels at which you need the performance of binary formats, such
as with IP, TCP, and SSL.  But message queuing is an application-level
problem, at least in the IETF's view of things, and if you look at the
protocols in their Applications Area (
http://www.ietf.org/html.charters/wg-dir.html#Applications_Area ), you'll
find that of the 23 working groups there, only 2 are using binary formats
(Internet Fax, and Telnet TN3270 Enhancements), and that's because of
interoperability issues with older formats. The other 21 working groups
all use human-readable formats, as far as I can see.

>The second problem with text based protocols is that
>implementations are often sloppy. Look how many people have
>got the HTTP protocol wrong! You need to define your
>protocol to the byte and leave no room for ambiguity or
>error.

I submit to you that a human-readable protocol is easier to debug.  A
binary protocol *also* needs to be defined to the byte and be unambiguous,
and it's much harder to see the bad byte in a page full of hex than in a
page full of ( SMTP | HTTP | XML | ...).
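
A small Python sketch of the comparison. Both frames below are invented
for illustration: the text command syntax is not real OpenQueue syntax,
and the binary layout (1-byte opcode, 2-byte queue id, 4-byte length,
big-endian) is a hypothetical stand-in for what a binary protocol might
look like.

```python
import struct

payload = b"hello"

# Hypothetical text frame: a command line, one header, a blank line, the body.
text_frame = b"SEND orders\r\nContent-Length: 5\r\n\r\n" + payload

# Hypothetical binary frame: opcode=1, queue id=42, body length=5, then the body.
binary_frame = struct.pack(">BHI", 1, 42, len(payload)) + payload

print(text_frame.decode("ascii"))
print(binary_frame.hex(" "))  # 01 00 2a 00 00 00 05 68 65 6c 6c 6f

# Both frames carry the same information; decoding the binary one needs the layout.
opcode, queue_id, length = struct.unpack(">BHI", binary_frame[:7])
print(opcode, queue_id, length)
```

A wrong length in the text frame is visible at a glance, while in the
hex dump you first have to know which four bytes are the length field.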

>The third problem with a text based protocol is parsing the
>damn thing. Efficient and compact parsing and handling
>cannot be done on a text based protocol.

I completely agree that binary formats can be faster to process.  That's
their selling point.  But I think the experience of the 'net so far is
that processing those human-readable commands is pretty cheap when
compared to a) the network bandwidth bottleneck, or b) time spent building
and debugging your binary tools.

>You already have
>an efficient and easily usable marshalling system in Java
>called DataOutputStream. These streams can be read from C
>fairly trivially. (If you like I can send you the C code
>to do it. It even handles UTF-8 encoded strings. It's the
>next thing to go on to SourceForge.)

Now you're really talking about the data in the messages, not the protocol
that sends messages back and forth.  I could send you an OpenQueue message
(or an HTTP message, or an SMTP message....) that contained either plain
text, text-based XML, a serialized Java object, your EFPOS blob, or
anything else.  The content of the message should be independent of the
communications protocol.

[Side note: Are you saying gnuq is going to be based on serialized
Java(tm) objects?  Then is it an implementation of Java Message Service
(JMS)?  That's great, the world needs a good, Open Source JMS
implementation.  But if that's supposed to be the core open standard for
message queuing, it leaves it tied to Sun (notice the trademark above :-).
One solution is to define a wire-level protocol such as OpenQueue, and
then you can implement Java classes on top of it to provide a JMS
implementation; you get JMS compatibility, but your underlying system can
also have wrappers to other languages.  My apologies if I am
misunderstanding your point/intentions about Java.]

>Now on to authentication. I'm afraid that strong public key
>authentication will need to be part of the protocol.  Not
>only do you have to handle the secure delivery of messages,
>you need to ensure that the results get returned to the
>correct client. Even if that client has disconnected. And
>that cannot be done without strong public key
>authentication.

OpenQueue security needs a lot of work, absolutely.  I'd like to see it
leverage external tools as much as possible.  I'm not clear what you mean
by "Even if that client has disconnected."  Do you mean when the client
reconnects and re-authenticates?  Or do you mean sending data
asynchronously to a client which is not currently connected?

>The transaction support does not look correct to me at all.
>Read the MQ books. Remember transactions may need to be
>distributed across multiple machines in different parts
>of the world.

Currently in OpenQueue you can receive multiple messages in a transaction.
You can currently only send one message at a time.  (Each operation is
done with a two-phase commit.)  The plan is to add the ability to wrap
SEND commands in transactions, too. 

>Other notes, reliability is actually your only priority.
>You cannot acknowledge the receipt of a message until you have
>written it to non-volatile storage and synced it. Having a
>queue stored in memory would be quite exceptional and
>dangerous. (Losing $10 million transactions won't
>make you popular.)

Are you saying there's no use for queues that aren't reliably persistent?
Maybe not for you, but enough users of MSMQ, JMS providers, etc. want
them that those products support the option.  In cases where you need
more speed, and you're willing to bet (for this application) that the
hard drive won't crash, some people do it this way.  But if you only want
persistent queues, then create all of yours as persistent.
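
The trade-off can be sketched in a few lines of Python. This is a toy,
not how Passamaquoddy or any commercial product stores messages: the
ToyQueue class, its append-only file format (4-byte length prefix, then
the body), and the file name are all invented for illustration. A
persistent queue writes and syncs before it acknowledges; a memory-only
queue skips that step and is faster but loses its contents on a crash.

```python
import os
import tempfile


class ToyQueue:
    """Queue that is persistent if given a file path, memory-only otherwise."""

    def __init__(self, path=None):
        self.path = path      # None means a non-persistent queue
        self.messages = []

    def enqueue(self, body: bytes) -> str:
        if self.path is not None:
            # Persistent: append to the log and force it to disk before acknowledging.
            with open(self.path, "ab") as f:
                f.write(len(body).to_bytes(4, "big") + body)
                f.flush()
                os.fsync(f.fileno())
        self.messages.append(body)
        return "ACK"

    @classmethod
    def recover(cls, path):
        """Rebuild a persistent queue from its log, as after a restart."""
        q = cls(path)
        with open(path, "rb") as f:
            while header := f.read(4):
                q.messages.append(f.read(int.from_bytes(header, "big")))
        return q


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "orders.log")

    durable = ToyQueue(log_path)
    durable.enqueue(b"transfer 10")

    fast = ToyQueue()             # memory-only; nothing survives a restart
    fast.enqueue(b"cache hint")

    recovered = ToyQueue.recover(log_path).messages

print(recovered)
```

Only the message sent to the persistent queue can be recovered. Which
kind to create is a per-queue decision for whoever owns the data.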

>How were you planning on storing messages? In files, or
>in something like sleepycat db (which comes with glibc
>these days)? You need a data store which supports
>roll-forward recovery, the ability to roll back
>transactions, and has no window of opportunity for
>error.

Message storage is implementation-dependent.  The Passamaquoddy server for
OpenQueue ( download at
http://sourceforge.net/project/filelist.php?group_id=2345 ) uses JDBC to
store messages on any SQL database with a JDBC driver.  Whether it
properly rolls back operations that fail should be an issue for its code,
if the OpenQueue protocol itself offers proper transactional support.
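
To illustrate what "rolls back operations that fail" means for a
SQL-backed store, here is a minimal Python sketch using the standard
sqlite3 module as a stand-in. It is not the Passamaquoddy code, which is
Java and uses JDBC; the table layout and the store_batch helper are
assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY, queue TEXT NOT NULL, body BLOB NOT NULL)"
)
conn.commit()


def store_batch(conn, queue, bodies):
    """Insert all bodies in one transaction; if any insert fails, none are kept."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for body in bodies:
                conn.execute(
                    "INSERT INTO messages (queue, body) VALUES (?, ?)",
                    (queue, body),
                )
        return True
    except sqlite3.Error:
        return False


ok = store_batch(conn, "orders", [b"a", b"b"])
# The second batch violates NOT NULL on its last row, so the whole batch is undone.
failed = store_batch(conn, "orders", [b"c", None])
stored = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(ok, failed, stored)
```

The message b"c" is not left behind by the failed batch, which is the
property the storage layer has to provide whatever database sits
underneath.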

>--
>Paul Matthews