[Mod-pubsub-developer] Pushlets and Mod_pubsub and getting connections.

Just wrote:
> Mod_pubsub: yes (in fact the author of mod_pubsub credited Pushlets as
> the inventor of HTTP/JavaScript streaming!). I played with their demos
> and captured the HTTP conversation (with tcpflow) but the mechanism
> appears the same as the Pushlet (very much the same in the sense that
> mod_pubsub has almost identical JS callbacks like parent.push()!).
> Their protocol is more elaborate. As far as I can see mod_pubsub would
> have to deal with permanent HTTP connections as well. On the other hand
> Apache itself is very scalable.

Just, thanks for showing us the way with HTTP/JavaScript streaming.
You've done the world a great service by educating people as to how
this technique is accomplished, and we've always admired Pushlets
(the project and the person) from afar.

An interesting piece of history: We only discovered your earlier work
in Pushlets after we had already solidified our API and programming
model.  It's an amazing coincidence that the mechanisms Pushlets and
mod_pubsub use are so similar -- maybe proof that great (humble) minds
think alike.  :)

And yes, we do deal with permanent HTTP connections as well.
As a result we can get some pretty good scale (see below).

A key difference is that in mod_pubsub, a single push connection can
be shared by several open browser windows in the same browser process.
We also support a variety of non-JavaScript clients, from Python to
Perl to PHP, and Java and Ruby and C.  For these clients we have a
simpler data format that doesn't require JavaScript or HTML parsing.

On the scalability question, the mod_pubsub Python server is
theoretically quite scalable, since it doesn't use separate threads
for different connections.  However, this has not been tested beyond
a few dozen connections.
(We do note that http://www.mod-pubsub.org/blog/ now regularly has
between 3 and 30 sessions connected to it at a time, and the server
seems to handle that condition fairly well.)
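
To illustrate why a server that avoids per-connection threads can hold
many idle connections cheaply, here is a minimal sketch using Python's
standard selectors module.  It is not mod_pubsub's actual code: the
PushHub class, its method names, and the demo() helper are hypothetical,
and socket.socketpair() stands in for real browser connections.

```python
import selectors
import socket


class PushHub:
    """Hypothetical hub: one thread and one selector watch every connection."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.subscribers = []

    def add_subscriber(self, conn):
        # Register the socket with the selector; no thread is created for it.
        conn.setblocking(False)
        self.selector.register(conn, selectors.EVENT_READ)
        self.subscribers.append(conn)

    def publish(self, message):
        # Push the same event down every open connection.
        # Assumption: messages are small; a real server would buffer
        # writes that cannot complete immediately.
        data = message.encode("utf-8") + b"\n"
        for conn in self.subscribers:
            conn.sendall(data)

    def poll(self, timeout=0.1):
        # Service whichever connections are readable, all in this thread.
        received = []
        for key, _ in self.selector.select(timeout):
            chunk = key.fileobj.recv(4096)
            if chunk:
                received.append(chunk)
            else:
                # The peer closed the connection: release its resources.
                self.selector.unregister(key.fileobj)
                self.subscribers.remove(key.fileobj)
                key.fileobj.close()
        return received


def demo(n_clients=30):
    """Attach n_clients fake clients, publish once, return what each received."""
    hub = PushHub()
    clients = []
    for _ in range(n_clients):
        server_side, client_side = socket.socketpair()
        hub.add_subscriber(server_side)
        clients.append(client_side)
    hub.publish("hello")
    replies = [client.recv(4096) for client in clients]
    for sock in clients + hub.subscribers:
        sock.close()
    return replies
```

Calling demo(30) attaches thirty connections to a single hub without
starting any extra threads; the cost per idle connection is one socket
plus one selector registration.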

We would love to figure out if there's any way the Pushlets and
mod_pubsub projects could help each other.  Any suggestions?

   -- Adam

P.S. -- Based on experience with commercial HTTP push servers at

    http://www.knownow.com/

we have confidence that the techniques used in mod_pubsub can scale
to several thousand concurrent connections per server instance.

-----Original Message-----
From: pu...@ya... [mailto:pu...@ya...]
Sent: Tuesday, January 13, 2004 12:35 AM
To: pu...@ya...
Subject: [pushlet] Digest Number 266

There is 1 message in this issue.

Topics in this digest:

      1. Re: Digest Number 264
           From: Just van den Broecke
________________________________________________________________________
________________________________________________________________________

Message: 1
   Date: Mon, 12 Jan 2004 21:22:39 +0100
   From: Just van den Broecke
Subject: Re: Digest Number 264

Neil Benn wrote:
> Is there not a way to close the thread / socket on the server while
> 'duping' the client into listening for more data?  I'd guess that this
> would require coding at a lower level than the standard Servlet API.
Well this could be done by the Pullet mode I described. I think I also
mentioned Java NIO which could minimize the number of threads required,
but this is at a lower level.
> Does the HTTP 1.1
> mechanism include low-level client to server 'polling' to ensure that
> the server socket is open?
No. It is even the case that protocols like those used in RMI/CORBA/DCOM
all have a polling/ping protocol to check that the socket is open.
> Is it just wishful thinking to want servers to clear
> resources whilst clients happily wait for more data?  I took a look at
> mod-pubsub a while back, which is available as either an Apache module
> or a python program - I wasn't sure if this software manages to solve
> the scalability problems you mentioned - maybe you (Just) have looked
> at this too?
Mod_pubsub: yes (in fact the author of mod_pubsub credited Pushlets as
the inventor of HTTP/JavaScript streaming!). I played with their demos
and captured the HTTP conversation (with tcpflow) but the mechanism
appears the same as the Pushlet (very much the same in the sense that
mod_pubsub has almost identical JS callbacks like parent.push()!). Their
protocol is more elaborate. As far as I can see mod_pubsub would have to
deal with permanent HTTP connections as well. On the other hand Apache
itself is very scalable.
> ----- Original Message -----
> From: <pu...@ya...>
> To: <pu...@ya...>
> Sent: Friday, January 09, 2004 9:31 AM
> Subject: [pushlet] Digest Number 264
>
>>   Date: Thu, 08 Jan 2004 11:44:30 +0100
>>   From: Just van den Broecke
>>Subject: Re: Q about scalability
>>
>>Maybe "hog resources" has too negative a connotation. It is simply a
>>fact that _any_ push technology where clients are permanently
>>connected through a TCP connection will require resources like sockets
>>and threads, usually one pair per connection (except for
>>"select()"-type Sockets and in Java NIO).
>>
>>Pushlets run in a Java servlet engine (e.g. Tomcat). A servlet engine
>>is designed to handle short/stateless HTTP requests. Since most
>>browsers use HTTP/1.1, one or two permanent TCP socket connections are
>>maintained per browser over which multiple HTTP requests are done. The
>>servlet engine usually employs a "thread-per-request" strategy where a
>>single thread is allocated for each request. Usually (like in Tomcat)
>>a thread is fetched from a thread-pool and returned after the request.
>>Pushlets will maintain a permanent connection and thus will not return
>>that Thread to the pool as long as the client remains connected. The
>>number of threads in the pool is usually configurable (e.g. server.xml
>>in Tomcat) even per HTTP connector/engine (Tomcat allows multiple
>>engines). So a standard configuration (usually between 10-100 threads)
>>will quickly exhaust the pool when more Pushlet connections than that
>>amount are present (this refers to the "hog"). So if you are able to
>>configure a server with a thread pool of 30,000 threads you may be
>>able to reach that scale (if your server is also able to maintain
>>30,000 sockets!). This is probably not the way to go. IMO linear
>>scalability should be sought in deploying parallel servers. For
>>example by using an Apache front-end master-server that will redirect
>>requests to a random or round-robin pushlet server.
>>
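
The pool-exhaustion effect described above can be reproduced in
miniature.  The following is only a sketch under stated assumptions: it
uses Python's concurrent.futures thread pool as a stand-in for a servlet
engine's pool (it is not Tomcat or Pushlet code), and the function name
short_request_served and its parameters are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def short_request_served(pool_size, push_clients, timeout=0.5):
    """Return True if a short request gets a worker while push clients stay connected.

    Hypothetical model: each push client pins one pool thread until it
    disconnects, mimicking a thread-per-request engine serving a
    permanent connection.
    """
    disconnect = threading.Event()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # Every permanent connection blocks a worker until "disconnect" fires.
        for _ in range(push_clients):
            pool.submit(disconnect.wait)
        # An ordinary short request now has to compete for a free worker.
        short = pool.submit(lambda: "ok")
        try:
            served = short.result(timeout=timeout) == "ok"
        except FutureTimeout:
            served = False
        finally:
            # Let the simulated clients disconnect so the pool can shut down.
            disconnect.set()
    return served
```

With pool_size=3, two push clients still leave a free worker for the
short request, while three push clients occupy every worker and the
short request waits until one of them disconnects.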
>>Apart from this there are changes I plan to make to the Pushlet
>>framework, in particular the protocol. There is currently a separate
>>servlet called the Pullet which ends the HTTP request after fetching
>>events from the per-client queue. I have used a variant of the Pullet
>>in a large scale project (www.rabotreasuryweb.com) to reach a much
>>higher scalability than with Pushlets. Between requests clients wait a
>>random (configurable) amount of time. The wait-time is also dynamic,
>>dependent on the server-load. What I foresee is a merge between
>>Pushlet and Pullet where different protocol-modes can be selected
>>dependent on the application, the amount of events etc. and that
>>certain variables like the wait-time on the client and/or server can
>>be adapted even dynamically. Note that the Pullet is not the same as a
>>poll (refresh), the request is still held until an event becomes
>>available. More ambitious would be a dedicated Pushlet server whose
>>design is optimized for long-lived HTTP requests, possibly using Java
>>Non-blocking IO (NIO) to conserve threads.
>>
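
The Pullet behaviour described above (hold the request until an event
is available, return the queued events, then have the client wait a
random, load-dependent time before asking again) might look roughly
like the sketch below.  This is not the Pullet's real code or protocol:
the function names, the parameters, and in particular the formula that
stretches the wait as load rises are all assumptions made for
illustration.

```python
import queue
import random


def fetch_events(client_queue, hold_seconds):
    """Hold the request until an event arrives or hold_seconds pass, then drain.

    Hypothetical server-side helper: unlike a plain poll, it blocks on the
    per-client queue instead of returning immediately when nothing is queued.
    """
    events = []
    try:
        events.append(client_queue.get(timeout=hold_seconds))
    except queue.Empty:
        return events  # Nothing happened within the hold period.
    # Something arrived: also return whatever else is already queued.
    while True:
        try:
            events.append(client_queue.get_nowait())
        except queue.Empty:
            return events


def next_wait_seconds(min_wait, max_wait, server_load, rng=random):
    """Pick the client's pause before its next request.

    Assumption: server_load is a fraction in [0, 1] and the random wait is
    stretched linearly, up to double, as load rises. The original message
    does not specify a formula.
    """
    if not 0.0 <= server_load <= 1.0:
        raise ValueError("server_load must be between 0 and 1")
    return rng.uniform(min_wait, max_wait) * (1.0 + server_load)
```

Because each request ends after the events are fetched, the server
thread goes back to the pool between requests, and the randomized wait
spreads reconnecting clients out over time.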
>>Maybe not the answer you expected. Scalability depends on more
>>variables than the ones you gave. For example the size/frequency of
>>events.
>>
>>best,
>>
>>Just
>>
>>
>>
>>Bryan Martinez wrote:
>>
>>>happy new year to you too just!
>>>
>>>in the documentation you mentioned that when 100s of clients are
>>>connected through pushlets they hog resources... what rating of a
>>>single server can confidently handle 10,000/20,000/30,000 pushlet
>>>connections? like how many and what type of processor(s) and amount
>>>of RAM needed to support it?
>>>or is this simply impossible?