[Openqueue-development] Re: OpenQueue
From: <ma...@ne...> - 2000-05-01 19:44:51
Paul, this is good feedback. All my comments below are IMHO, of course.

-Matt Jensen

On Mon, 1 May 2000, Paul Lindsay Matthews wrote:
>I am also working on a queuing system (called gnuq). It's
>not on SourceForge yet as work is still occurring on the
>protocol and behavioural specifications, as well as the
>back end parts of the system.

Excellent, Paul. I look forward to hearing more about it. Similarly, let me note that OpenQueue is not set in stone, and one reason we announced it now is to encourage feedback like yours.

>Please don't take this the wrong way but parts of the
>technical document seem quite naive and repeatedly stress
>the use of inappropriate technology.
>If you wish, I can send you a copy of both "MQSeries:
>An Introduction to Messaging and Queuing" and "MQSeries:
>Application Programming Guide" as acrobat (.pdf) documents.
>These should put you on the right track. Just drop me a
>line if you can't get them off the IBM web site yourself.

Thanks, I've used MQSeries, MSMQ, and several JMS implementations. The motivation for OpenQueue was to provide an Open Source solution of the same caliber as these commercial products (the same general goal you have with gnuq, I expect).

>A message and queuing system works mostly around guaranteed
>delivery of a request, and the guaranteed delivery of
>responses. You've got to think about banks and ATMs,
>not NNTP servers!

Agreed. That's why OpenQueue uses a two-phase commit to guarantee delivery. Try it: after you are sent an "UPDATE" message, break your connection to the server instead of sending an "ACK". You'll find upon reconnecting that those unacknowledged messages are still there, waiting to be resent in a new transaction.

>Why limit the sending of data to MIME types? What is the
>MIME type of an EFPOS transaction, or a single line from
>your firewall log? Your message and queuing system should
>not care what is being sent through it, only that it arrives
>reliably, correctly and quickly.
>For example MQSeries sees
>all its data as opaque blobs.

You can send any type of data you want. If it's a well-defined MIME type such as a GIF file or XML, then the MIME type will describe that to clients and make it easier for them to use. But if it's your own data format, why not make the MIME type "application/octet-stream"? Then just stream the bytes after the message headers. That's what I'd do for your EFPOS (point of sale) message, but a line from the firewall log might just be sent as simple text.

In short, you can send opaque blobs in OpenQueue, but there's also a method of describing the data type if desired. And sending XML messages offers even more benefits, if you choose to do so.

>The designers of the HTTP protocol did not know what they
>were doing, and as such there is no reason to follow their
>bad example.
>
>One particular instance of this is the use of a text based
>protocol. The first problem with text based protocols is
>defining exactly what you mean by text. Is it 7bit US ASCII,
>or is it 7bit UK ASCII, is it 8bit ISOLatin1, what happens
>if I pass you a UTF-8 encoded string, if you handle UTF-8
>do you handle just 16 bit Unicode or do you handle 32 bit
>Unicode? Do you realise that some keyboards have neither
>a ! nor a # sign on them?

[First, if those are normal user keyboards, then maybe OpenQueue should change the characters in use. But if you're talking about ATM machines or pagers, I'm not concerned that a user can't write the protocol out directly, because those devices will have client software to do that. The point of a text-based protocol is more that a programmer can enter the protocol from their own (normal) keyboard, as well as generate textual tests and tools easily.]

The original HTTP got some things wrong, but one thing I strongly feel they got *right* was that they made it a text-based protocol. Or as I prefer to call it, a human-readable protocol.
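
To make the opaque-blob point concrete, here is a rough sketch in Python of a message with human-readable headers followed by raw bytes. The header names, the blank-line separator, and the function names are assumptions made up for illustration; this is not the actual OpenQueue wire format.

```python
from typing import Dict, Tuple

CRLF = "\r\n"


def frame_message(body: bytes,
                  content_type: str = "application/octet-stream") -> bytes:
    """Build readable text headers, a blank line, then the untouched body bytes.

    Hypothetical framing for illustration only, not the OpenQueue format.
    """
    headers = (
        f"Content-Type: {content_type}{CRLF}"
        f"Content-Length: {len(body)}{CRLF}"
        f"{CRLF}"
    )
    return headers.encode("ascii") + body


def parse_message(data: bytes) -> Tuple[Dict[str, str], bytes]:
    """Split a framed message back into a header dict and the raw body."""
    head, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line terminating the headers")
    headers: Dict[str, str] = {}
    for line in head.decode("ascii").split(CRLF):
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers["content-length"])
    if len(rest) < length:
        raise ValueError("body shorter than Content-Length")
    # The body is never interpreted; it is handed back byte for byte.
    return headers, rest[:length]


if __name__ == "__main__":
    blob = bytes([0x00, 0xFF, 0x0D, 0x0A, 0x07])  # arbitrary binary payload
    headers, body = parse_message(frame_message(blob))
    print(headers["content-type"], body == blob)
```

The headers can be read and typed by hand, while the payload is length-delimited and passed through unchanged, so the queue itself never needs to understand what it is carrying.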
A primary lesson of Internet protocols in use is that those that are human-readable (HTTP, SMTP, POP3, IMAP, NNTP, LDAP, XML) are more likely to be implemented (with many independent implementations), more likely to have supporting utilities and code, and are easier to debug, than binary formats.

For example, the fact that XML is human-readable is one of its biggest selling points. People can very quickly write parsers, data generators, wrappers to SQL record sets, etc. If XML were a binary meta-format instead of a human-readable meta-format, it would be going nowhere.

Character sets can be defined in the headers, and IETF requires new protocols to make provisions to support internationalization, etc. I don't see this as something that is stopping development of new human-readable protocols.

There are levels at which you need the performance of binary formats, such as with IP, TCP, and SSL. But message queuing is an application-level problem, at least in the IETF's view of things, and if you look at the protocols in their Applications Area ( http://www.ietf.org/html.charters/wg-dir.html#Applications_Area ), you'll find that of the 23 working groups there, only 2 are using binary formats (Internet Fax, and Telnet TN3270 Enhancements), and that's because of interoperability issues with older formats. The other 21 working groups all use human-readable formats, as far as I can see.

>The second problem with text based protocols is that
>implementations are often sloppy. Look how many people have
>got the HTTP protocol wrong! You need to define your
>protocol to the byte and leave no room for ambiguity or
>error.

I submit to you that a human-readable protocol is easier to debug. A binary protocol *also* needs to be defined to the byte and be unambiguous, and it's much harder to see the bad byte in a page full of hex than in a page full of ( SMTP | HTTP | XML | ...).

>The third problem with a text based protocol is parsing the
>damn thing.
>Efficient and compact parsing and handling
>cannot be done on a text based protocol.

I completely agree that binary formats can be faster to process. That's their selling point. But I think the experience of the 'net so far is that processing those human-readable commands is pretty cheap when compared to a) the network bandwidth bottleneck, or b) time spent building and debugging your binary tools.

>You already have
>an efficient and easily usable marshalling system in Java
>called DataOutputStream. These streams can be read from C
>fairly trivially. (If you like I can send you the C code
>to do it. It even handles UTF-8 encoded strings. It's the
>next thing to go on to SourceForge.)

Now you're really talking about the data in the messages, not the protocol that sends messages back and forth. I could send you an OpenQueue message (or an HTTP message, or an SMTP message....) that contained either plain text, text-based XML, a serialized Java object, your EFPOS blob, or anything else. The content of the message should be independent of the communications protocol.

[Side note: Are you saying gnuq is going to be based on serialized Java(tm) objects? Then is it an implementation of Java Message Service (JMS)? That's great, the world needs a good, Open Source JMS implementation. But if that's supposed to be the core open standard for message queuing, it leaves it tied to Sun (notice the trademark above :-). One solution is to define a wire-level protocol such as OpenQueue, and then you can implement Java classes on top of it to provide a JMS implementation; you get JMS compatibility, but your underlying system can also have wrappers to other languages. My apologies if I am misunderstanding your point/intentions about Java.]

>Now on to authentication. I'm afraid that strong public key
>authentication will need to be part of the protocol.
>Not only do you have to handle the secure delivery of messages,
>you need to ensure that the results get returned to the
>correct client. Even if that client has disconnected. And
>that cannot be done without strong public key
>authentication.

OpenQueue security needs a lot of work, absolutely. I'd like to see it leverage external tools as much as possible. I'm not clear what you mean by "Even if that client has disconnected." Do you mean when the client reconnects and re-authenticates? Or do you mean sending data asynchronously to a client which is not currently connected?

>The transaction support does not look correct to me at all.
>Read the MQ books. Remember transactions may need to be
>distributed across multiple machines in different parts
>of the world.

Currently in OpenQueue you can receive multiple messages in a transaction, but you can only send one message at a time. (Each operation is done with a two-phase commit.) The plan is to add the ability to wrap SEND commands in transactions, too.

>Other notes, reliability is actually your only priority.
>You cannot acknowledge the receipt of a message until you have
>written it to non-volatile storage and synced it. Having a
>queue stored in memory would be quite exceptional and
>dangerous. (Losing $10 million transactions won't
>make you popular.)

Are you saying there's no use for queues that aren't reliably persistent? Maybe not for you, but there is for enough users of MSMQ, JMS providers, etc. that those products support the option. In cases where you need more speed, and you're willing to bet (for this application) that the hard drive won't crash, some people do it this way. But if you only want persistent queues, then create all of yours as persistent.

>How were you planning on storing messages? In files, or
>in something like sleepycat db (which comes with glibc
>these days)?
>You need a data store which supports
>roll-forward recovery, the ability to roll back
>transactions, and has no window of opportunity for
>error.

Message storage is implementation-dependent. The Passamaquoddy server for OpenQueue ( download at http://sourceforge.net/project/filelist.php?group_id=2345 ) uses JDBC to store messages on any SQL database with a JDBC driver. Whether it properly rolls back operations that fail is an issue for its code, provided the OpenQueue protocol itself offers proper transactional support.

>--
>Paul Matthews