From: Adam R. <Ad...@Kn...> - 2004-01-14 03:19:22
Just wrote:

> Mod_pubsub: yes (in fact the author of mod_pubsub credited Pushlets as
> the inventor of HTTP/JavaScript streaming!). I played with their demos
> and captured the HTTP conversation (with tcpflow) but the mechanism
> appears the same as the Pushlet (very much the same in the sense that
> mod_pubsub has almost identical JS callbacks like parent.push()!). Their
> protocol is more elaborate. As far as I can see mod_pubsub would have to
> deal with permanent HTTP connections as well. On the other hand Apache
> itself is very scalable.

Just, thanks for showing us the way with HTTP/JavaScript streaming.
You've done the world a great service by educating people as to how
this technique is accomplished, and we've always admired Pushlets (the
project and the person) from afar.

An interesting piece of history: We only discovered your earlier work
in Pushlets after we had already solidified our API and programming
model. It's an amazing coincidence that the mechanisms Pushlets and
mod_pubsub use are so similar -- maybe proof that great (humble) minds
think alike. :)

And yes, we do deal with permanent HTTP connections as well. As a
result we can get some pretty good scale (see below). A key difference
is that in mod_pubsub, a single push connection can be shared by
several open browser windows in the same browser process.

We also support a variety of non-JavaScript clients, from Python to
Perl to PHP, and Java and Ruby and C. For these clients we have a
simpler data format that doesn't require JavaScript or HTML parsing.

On the scalability question, the mod_pubsub Python server is
theoretically quite scalable, since it doesn't use separate threads for
different connections. However, this has not been tested beyond a few
dozen connections. (We do note that http://www.mod-pubsub.org/blog/ now
regularly has between 3 and 30 sessions connected to it at a time, and
the server seems to handle that condition fairly well.)
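As a rough illustration of what "no separate threads for different
connections" can look like, here is a minimal sketch. It is not
mod_pubsub's actual code: it assumes Python's standard selectors module,
uses socket.socketpair() as a stand-in for accepted long-lived HTTP
connections, and the function name and message format are made up for
the example.

```python
import selectors
import socket


def serve_many_connections(n_clients=5):
    """Read one message from each of n_clients open connections on one thread."""
    sel = selectors.DefaultSelector()
    client_ends = []
    for client_id in range(n_clients):
        # socketpair() stands in for an accepted, long-lived HTTP connection
        # (an assumption for this sketch; a real server would accept() them).
        server_end, client_end = socket.socketpair()
        server_end.setblocking(False)
        sel.register(server_end, selectors.EVENT_READ, data=client_id)
        client_ends.append(client_end)

    # Every client sends a small message over its already-open connection.
    for client_id, client_end in enumerate(client_ends):
        client_end.sendall(f"hello from {client_id}".encode())

    received = {}
    # One loop, one thread: the selector reports which sockets are readable.
    for _ in range(100):  # bounded so the sketch cannot spin forever
        if len(received) == n_clients:
            break
        for key, _events in sel.select(timeout=1.0):
            received[key.data] = key.fileobj.recv(1024).decode()

    # Clean up both ends of every connection.
    for key in list(sel.get_map().values()):
        sel.unregister(key.fileobj)
        key.fileobj.close()
    sel.close()
    for client_end in client_ends:
        client_end.close()
    return received
```

Calling serve_many_connections(3) returns a dict with one entry per
connection. The point of the design is that an idle connection costs a
socket and a selector registration, but no thread.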
We would love to figure out if there's any way the Pushlets and
mod_pubsub projects could help each other. Any suggestions?

-- Adam

P.S. -- Based on experience with commercial HTTP push servers at
http://www.knownow.com/ we have confidence that the techniques used in
mod_pubsub can scale to several thousand concurrent connections per
server instance.

-----Original Message-----
From: pu...@ya... [mailto:pu...@ya...]
Sent: Tuesday, January 13, 2004 12:35 AM
To: pu...@ya...
Subject: [pushlet] Digest Number 266

There is 1 message in this issue.

Topics in this digest:

      1. Re: Digest Number 264
           From: Just van den Broecke

________________________________________________________________________

Message: 1
   Date: Mon, 12 Jan 2004 21:22:39 +0100
   From: Just van den Broecke
Subject: Re: Digest Number 264

Neil Benn wrote:

> Is there not a way to close the thread / socket on the server while
> 'duping' the client into listening for more data? I'd guess that this
> would require coding at a lower level than the standard Servlet API.

Well this could be done by the Pullet mode I described. I think I also
mentioned Java NIO which could minimize the number of threads required,
but this is at a lower level.

> Does the HTTP 1.1
> mechanism include low-level client to server 'polling' to ensure that
> the server socket is open?

No. It is even the case that protocols like used in RMI/CORBA/DCOM all
have a polling/ping protocol to check that socket is open.

> Is it just wishful thinking to want servers to clear
> resources whilst clients happily wait for more data? I took a look at
> mod-pubsub a while back, which is available as either an Apache module
> or a python program - I wasn't sure if this software manages to solve
> the scalability problems you mentioned - maybe you (Just) have looked
> at this too?
Mod_pubsub: yes (in fact the author of mod_pubsub credited Pushlets as
the inventor of HTTP/JavaScript streaming!). I played with their demos
and captured the HTTP conversation (with tcpflow) but the mechanism
appears the same as the Pushlet (very much the same in the sense that
mod_pubsub has almost identical JS callbacks like parent.push()!). Their
protocol is more elaborate. As far as I can see mod_pubsub would have to
deal with permanent HTTP connections as well. On the other hand Apache
itself is very scalable.

> ----- Original Message -----
> From: <pu...@ya...>
> To: <pu...@ya...>
> Sent: Friday, January 09, 2004 9:31 AM
> Subject: [pushlet] Digest Number 264
>
>> Date: Thu, 08 Jan 2004 11:44:30 +0100
>> From: Just van den Broecke
>> Subject: Re: Q about scalability
>>
>> Maybe "hog resources" is a too negative connotation. It is simply a
>> fact that _any_ push technology where clients are permanently
>> connected through a TCP connection will require resources like sockets
>> and threads, usually one pair per connection (except for
>> "select()"-type Sockets and in Java NIO).
>>
>> Pushlets run in a Java servlet engine (e.g. Tomcat). A servlet engine
>> is designed to handle short/stateless HTTP requests. Since most
>> browsers use HTTP/1.1, one or two permanent TCP socket connections are
>> maintained per browser over which multiple HTTP requests are done. The
>> servlet engine usually employs a "thread-per-request" strategy where a
>> single thread is allocated for each request. Usually (like in Tomcat)
>> a thread is fetched from a thread-pool and returned after the request.
>> Pushlets will maintain a permanent connection and thus will not return
>> that Thread to the pool as long as the client remains connected. The
>> amount of threads in the pool is usually configurable (e.g. server.xml
>> in Tomcat) even per HTTP connector/engine (Tomcat allows multiple
>> engines).
>> So a standard configuration (usually between 10-100 threads) will
>> quickly exhaust the pool when more Pushlet connections than that amount
>> are present (this refers to the "hog"). So if you are able to configure
>> a server with a thread pool of 30,000 threads you may be able to reach
>> that scale (if your server is also able to maintain 30,000 sockets!).
>> This is probably not the way to go. IMO linear scalability should be
>> sought in deploying parallel servers. For example by using an Apache
>> front-end master-server that will redirect requests to a random or
>> round-robin pushlet server.
>>
>> Apart from this there are changes I plan to make to the Pushlet
>> framework in particular the protocol. There is currently a separate
>> servlet called the Pullet which ends the HTTP request after fetching
>> events from the per-client queue. I have used a variant of the Pullet
>> in a large scale project (www.rabotreasuryweb.com) to reach a much
>> higher scalability than with Pushlets. Between requests clients wait a
>> random (configurable) amount of time. The wait-time is also dynamic
>> dependent on the server-load. What I foresee is a merge between Pushlet
>> and Pullet where different protocol-modes can be selected dependent on
>> the application, the amount of events etc and that certain variables
>> like the wait-time on the client and/or server can be adapted even
>> dynamically. Note that the Pullet is not the same as a poll (refresh),
>> the request is still held until an event becomes available. More
>> ambitious would be a dedicated Pushlet server whose design is optimized
>> for long-lived HTTP requests, possibly using Java Non-blocking IO (NIO)
>> to preserve thread-usage.
>>
>> Maybe not the answer you expected. Scalability depends on more
>> variables than the ones you gave. For example the size/frequency of
>> events.
>>
>> best,
>>
>> Just
>>
>>
>>
>> Bryan Martinez wrote:
>>
>>> happy new year to you too just!
>>>
>>> in the documentation you mentioned that when 100s of clients are
>>> connected through pushlets they hog resources... what rating of a
>>> single server can confidently handle 10,000/20,000/30,000 pushlet
>>> connections? like how many and what type of
>>> processor(s) and amount of RAM needed to support it?
>>> or is this simply impossible?
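
The thread-pool exhaustion described in the quoted message (permanent
connections never returning their thread to the pool) can be sketched
in a few lines. This is only an illustration under stated assumptions:
it uses Python's concurrent.futures.ThreadPoolExecutor as a stand-in
for a servlet engine's thread pool, and POOL_SIZE, push_connection and
short_request are hypothetical names, not part of Pushlets or Tomcat.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2  # hypothetical; stands in for the engine's configured pool size


def demo_pool_exhaustion():
    """Show that held-open connections starve ordinary short requests."""
    pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
    release = threading.Event()  # set when the push clients "disconnect"
    started = [threading.Event() for _ in range(POOL_SIZE)]

    def push_connection(i):
        started[i].set()
        release.wait()  # the worker thread is held while the client stays connected
        return "closed"

    def short_request():
        return "ok"

    # Fill the whole pool with permanent push connections.
    push_futures = [pool.submit(push_connection, i) for i in range(POOL_SIZE)]
    for event in started:
        event.wait()

    # A normal short request now has no free thread and sits in the queue.
    short = pool.submit(short_request)
    starved = not short.running() and not short.done()

    # Once the push clients go away, the queued request is finally served.
    release.set()
    result = short.result(timeout=5)
    for future in push_futures:
        future.result(timeout=5)
    pool.shutdown()
    return starved, result
```

With every worker held by a push connection, the short request only
runs after release is set. Raising POOL_SIZE just moves the limit,
which is why the message above suggests parallel servers or a
non-blocking design instead.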