From: Gustaf N. <ne...@wu...> - 2015-03-20 07:58:10
Am 20.03.15 um 07:48 schrieb Sep Ng:
>> what is hurting you?
> We have instances where we'd get a high number of concurrent users such that requests are getting queued, but when I look at the logs, there's a lot of static files being served for each login page, let alone other pages being served in aolserver. So I'm theorizing that being able to get those static file requests pushed into a single thread and free up the connection threads would help in scalability.

yes, there is a certain hope that removing this burden from the connection threads will improve the situation. Another option to reduce queuing time is to increase the number of connection threads. If the bottleneck is slow sql queries, then this pooling stuff will not help. Often the first task, determining what the bottleneck is, can already be difficult. NaviServer has several introspection means for monitoring. The following graph shows queuing, filter and run times (you won't get these numbers from aolserver). The graph (from OpenACS.org) shows that queuing time on that site is typically around 0.1 ms, with peaks in the range of 16 ms. This is for example quite useful for determining the right number of running connection threads. naviserver allows changing this number dynamically, without a restart.

[weekly graph]

> By the way, I've seen in previous posts of yours that you did switch from aolserver to naviserver. How big was the change? What things did you have to re-write/port to get them running in naviserver?

We did the move of our main site 4 years ago (now we have around 50 naviserver sites), but I do not have a detailed writeup of the changes. Most of our changes went into OpenACS (download OpenACS 5.8.1, search for NaviServer). What comes to my mind:

- NaviServer dropped the useless "$conn" argument from several commands (old: "ns_return $conn 200 text/plain ..." -> new: "ns_return 200 text/plain ...")
- different modules (e.g. for ssl), different config file
- more functionality built in which was a module under aolserver: crypto functions (sha, md5), cache, base64 encoding, gzip delivery (actually, the "ns_cache" command in naviserver uses a single-command style (ns_cache_eval) and aolserver a subcommand style, but we already added a compatibility layer to the naviserver source tree which is sufficient for OpenACS)
- no ns_share (use nsv instead)
- no "ns_set -persistent"

We did not use the latter two, but this comes up sometimes on the mailing lists. The move was quite easy for us, but ymmv.

-g
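The dropped "$conn" argument is mechanical enough that a small Tcl shim can paper over it during a migration. The following is a hypothetical sketch only (the shim name and the arity check are assumptions, and it covers just ns_return):

```tcl
# Hypothetical migration shim: let legacy AOLserver-style calls
# "ns_return $conn 200 text/plain $body" keep working on NaviServer,
# which expects "ns_return 200 text/plain $body".
rename ns_return _ns_return_real

proc ns_return {args} {
    # Legacy calls pass four arguments whose first element is the old
    # $conn handle rather than an integer HTTP status code; drop it.
    if {[llength $args] == 4 && ![string is integer -strict [lindex $args 0]]} {
        set args [lrange $args 1 end]
    }
    _ns_return_real {*}$args
}
```

The ns_cache compatibility layer mentioned above presumably follows the same wrapping idea at the source-tree level.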
From: Gustaf N. <ne...@wu...> - 2015-03-20 06:33:49
Am 20.03.15 um 05:47 schrieb Sep Ng:
> Hi Gustaf! Thank you for the informative response!
>
> I've been thinking of moving to NaviServer but I don't know enough about the transition to make that call yet. Right now, we're on aolserver and so I'm trying to see what I can do on this platform. I do not understand why the delivery doesn't work on https out of the box and requires a reverse proxy.

bgdelivery takes the socket (file descriptor) of the current connection, but it has no knowledge about SSL. When it hands the file descriptor to the background delivery thread, that thread can write back to the client just using plain Tcl I/O. So background delivery can certainly write to the file descriptor, but what it writes won't be accepted by a client trying to decrypt the channel.

> I suspect the varied client connection is part of the problem and them sitting on the connection threads is hurting us.

what is hurting you?

> However, we do not serve big files on our server so this has me wondering about the benefits of this change.

whatever "big" means. Connections can "hang" also when writing a few KBs.

> I'm not certain if aolserver has any facilities for asynchronous file writing and spooling.

the writer threads are an extension of naviserver over aolserver.

> It seems that I will have to build everything by hand. I had hoped that simply transferring the thread and having it ns_returnfile would be enough to get a simple form of background delivery going, but it doesn't look like that's the case.

If your site requires https, one can't use bgdelivery without a reverse proxy. Otherwise, everything is pre-packaged.

-g

[earlier quoted message and list footer trimmed; see Gustaf's post of 2015-03-20 04:03:41 below]
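The core mechanism behind background delivery — spooling data to a channel from the event loop instead of blocking a thread per transfer — can be shown with plain Tcl's fcopy. This is an illustrative standalone sketch, not the actual bgdelivery code (which hands over the connection's socket rather than a file channel):

```tcl
# Sketch of the idea behind bgdelivery: asynchronous spooling via fcopy.
# One thread running the event loop can drive many such copies at once.

proc spool_file {path out} {
    set in [open $path rb]
    fconfigure $out -translation binary -blocking 0
    # -command makes fcopy asynchronous: it returns immediately and the
    # event loop invokes spool_done when the copy completes.
    fcopy $in $out -command [list spool_done $in $out]
}

proc spool_done {in out bytes args} {
    close $in
    close $out
    set ::done $bytes
}

# Self-contained demo: copy a temp file to another temp file.
set src [file tempfile srcname]
puts -nonewline $src "hello world"
close $src
set dst [file tempfile dstname]

spool_file $srcname $dst
vwait ::done                  ;# run the event loop until the copy finishes
puts "copied $::done bytes"
```

In the real case the output channel is the client's socket, and a slow client simply causes the event loop to resume the copy whenever the socket becomes writable again.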
From: Gustaf N. <ne...@wu...> - 2015-03-20 04:03:41
Dear Sep,

The question whether it is worth using asynchronous delivery boils down to a question of usage pattern and desired scalability. The general problem with serving (large) resources via classical aolserver is that a connection thread is unable to handle other requests for the time span of the delivery. It is important to understand that the time span of the delivery is mostly determined by the client. A client with little processing power connecting over e.g. a mobile phone can block a connection for quite a long time. A special instance of this is the slow-read attack [2], a special kind of denial-of-service attack.

To serve e.g. 60 concurrent files, one would require 60 connection threads. Note that this can happen quite soon when serving content with several included resources (images, css, js) the first time to a client. When the server runs out of connection threads, the requests are queued, which means that the user-perceived runtime of a request is actually queueing time plus execution time.

Background delivery (as described in [1]) is fully integrated in OpenACS and addresses the problem by delegating output spooling (file deliveries) to a single thread, which can easily deliver several hundred concurrent downloads by using Tcl's asynchronous I/O operations. Note that this works not only for static resources, but also for dynamic requests (e.g. generating long HTML pages from a database). We have used this approach with very good success since 2006 in large OpenACS installations (with e.g. 2000 simultaneously active users; "simultaneously active" means here users who requested pages within a time interval of 5 secs).

In OpenACS, one can simply use ad_returnfile_background [3] instead of ad_returnfile to make use of background delivery.

The limitations of background delivery are that (a) it just works for plain http, and (b) it works for at most 1024 concurrently open file handles. We addressed (a) by using a reverse proxy in front of the server, which delivers the files from the backend via https. The limitation (b) is harder, since it depends on Tcl's usage of the select() system call, which allows waiting for events on at most 1024 file descriptors. Above this limit, it simply crashes. Lifting this limit on systems like Linux is possible, but requires a privately compiled libc and Linux kernel. You might think 1024 is much more than one needs, but we were actually running close to this limit for lecture casting (video streaming of university lectures).

A better approach is to use NaviServer's C-level support. NaviServer provides lightweight C-implemented writer threads using asynchronous I/O similar to bgdelivery, but not using select(). The writer threads work seamlessly with http and https. As with bgdelivery, a single writer thread can serve a multitude of concurrent deliveries. When several writer threads are defined, the load is split up between them. NaviServer can also serve streaming HTML (multiple ns_write commands) via writer threads. It also supports static and dynamic gzip deliveries, see e.g. [4].

When one uses OpenACS with NaviServer, it will automatically use writer threads when configured. In reference [5] one can see the difference in response time (actually the time spent in connection threads) in NaviServer. OpenACS.org has run on NaviServer since Sep 2014. A more detailed discussion of these properties is in [6]; all of this is part of NaviServer 4.99.6.

sorry for the longish reply,
-g

[1] http://openacs.org/xowiki/Boost_your_application_performance_to_serve_large_files
[2] http://en.wikipedia.org/wiki/Denial-of-service_attack#Slow_Read_attack
[3] http://openacs.org/api-doc/proc-view?proc=ad_returnfile_background&source_p=1
[4] http://www.qcode.co.uk/post/121
[5] http://openacs.org/forums/message-view?message_id=4111406
[6] https://next-scripting.org/xowiki/docs/misc/naviserver-connthreadqueue/index1

Am 19.03.15 um 07:09 schrieb Sep Ng:
> Hi all,
>
> I've been reading up on aolserver background delivery tricks on OpenACS and I've seen that the patches for the static TCL channel are already in 4.5.1. In the spirit of improving server performance, I've been wondering if such a facility is worth building into the custom app to increase concurrency and scalability.
>
> Most of the time, our aolserver also has to handle incoming requests for multiple jpegs, javascript libraries, and a lot of other things. Freeing up the connection thread sounds very useful in improving server scalability, so I wanted a little bit of help on getting this to work.
>
> It's been hard trying to wrap my head around using ns_conn channel and what I can actually do with this static TCL thread. It seems that I should be redefining ns_returnfile to use background delivery. Could I use it to push a TCL proc that generates, given the parameters, the dynamic page to this TCL channel to free up my connections?
>
> Sep
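The writer threads described above are configured per network driver in the NaviServer config file. A minimal sketch follows; the server name and values are illustrative, so check the parameter defaults against your NaviServer version's nssock documentation:

```tcl
# Illustrative NaviServer config fragment enabling writer threads for
# the nssock driver ("server1" and the values are examples).
ns_section ns/server/server1/module/nssock
    ns_param writerthreads 2      ;# number of writer threads for this driver
    ns_param writersize    4096   ;# spool replies at least this large (bytes) via writer threads
```

With this in place, qualifying deliveries are handed off automatically; no application-level code change is needed.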
From: Alex H. <ah...@am...> - 2015-03-20 02:27:17
Hi, Sep.

If you don't need to run any tcl code when serving these requests and you still want to serve them using AOLserver (as opposed to some other web server or a CDN), you can take advantage of pools to segregate the threads serving these static resources into their own pool. If those threads never allocate a tcl interpreter, they will be an order of magnitude smaller in footprint than your normal threads, and so you can just have a lot of them. We have a module that you might find useful that helps with this: http://aolserver.am.net/code/modules/ampools.adpx

-Alex Hisen

From: Sep Ng [mailto:the...@gm...]
Sent: Thursday, March 19, 2015 5:52 PM

Generally, what I hope to achieve is that all of these static files will be offloaded into a single connection thread(?). So, when a request for a static file comes in, I can push it to this sleeping thread and then serve another request, while this sleeping thread will look up the image and do ns_returnfile (I guess). At least, this is how I'm envisioning it right now. I don't know if I'm looking at it right or not. Judging from my readings on Gustaf's work, this is how it would operate. Feel free to correct me if I'm looking at this totally wrong.

[earlier quoted messages and list footer trimmed; see the posts from Tony Bennett below]
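Alex's pool idea can be sketched as an AOLserver config fragment. This is a hypothetical sketch: the pool name, URL patterns, and values are invented, and the exact section paths and map syntax should be checked against your AOLserver version's documentation (or the ampools module above):

```tcl
# Hypothetical AOLserver config sketch: route static URLs to a dedicated
# connection-thread pool so they never occupy the default (Tcl-heavy) pool.
ns_section "ns/server/server1/pools"
    ns_param static "Thread pool for static resources"

ns_section "ns/server/server1/pool/static"
    ns_param minthreads 10
    ns_param maxthreads 50          ;# many cheap threads, per Alex's suggestion
    ns_param map "GET /images/*"    ;# requests matched here use this pool
    ns_param map "GET /js/*"
    ns_param map "GET /css/*"
```

The payoff Alex describes comes from these threads never sourcing the Tcl library, keeping their per-thread footprint small.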
From: Tony B. (B. P. Tickets) <to...@br...> - 2015-03-20 00:41:59
Scheduling isn't needed. I read your question again and I understand what you're looking for. You're asking for all the javascript and images on a page to be sent in one request, correct? You'll need to find a way to buffer the output and then parse and change the buffer before it's sent. It would be nice to have this be part of ns_register_filter postauth.

On 3/19/15 5:12 PM, Sep Ng wrote:
> Thanks for the reply. I am perhaps confused with all of this. It seems that if I use the scheduling proc, I can start a thread that runs perpetually and does nothing. Then, I can use the tclthread API to transfer control into this and issue some proc that would perform mutex and serve the file to the current ns_conn details and quit. Am I thinking this right or am I being stupid? :-)

[earlier quoted messages and list footer trimmed]
From: Tony B. (B. P. Tickets) <to...@br...> - 2015-03-19 21:09:08
Look at the scheduling commands at http://panoptic.com/wiki/aolserver/Tcl_API. You could make an image processing queue that runs in its own thread and it won't take up any connections.

Tony

On 3/18/15 11:09 PM, Sep Ng wrote:
> Hi all,
>
> I've been reading up on aolserver background delivery tricks on OpenACS and I've seen that the patches for the static TCL channel are already in 4.5.1. In the spirit of improving server performance, I've been wondering if such a facility is worth building into the custom app to increase concurrency and scalability.
>
> Most of the time, our aolserver also has to handle incoming requests for multiple jpegs, javascript libraries, and a lot of other things. Freeing up the connection thread sounds very useful in improving server scalability, so I wanted a little bit of help on getting this to work.
>
> It's been hard trying to wrap my head around using ns_conn channel and what I can actually do with this static TCL thread. It seems that I should be redefining ns_returnfile to use background delivery. Could I use it to push a TCL proc that generates, given the parameters, the dynamic page to this TCL channel to free up my connections?
>
> Sep
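Tony's suggestion of a queue in its own thread can be sketched with the scheduling API. Everything below except ns_schedule_proc, nsv_get/nsv_set, and ns_log is invented for illustration, and the get-then-reset on the queue is racy (a real version would guard it, e.g. with ns_mutex):

```tcl
# Hypothetical sketch: drain an image-processing queue once per second
# from a scheduled-proc thread, so no connection thread is occupied.
nsv_set imgqueue jobs {}

proc drain_image_queue {} {
    set jobs [nsv_get imgqueue jobs]
    nsv_set imgqueue jobs {}          ;# note: get-then-reset is racy; see above
    foreach job $jobs {
        ns_log notice "processing image job: $job"
        # ... resize/convert the image here ...
    }
}

ns_schedule_proc 1 drain_image_queue  ;# run every second in a scheduler thread
```

Connection threads would then just `nsv_lappend imgqueue jobs $job` and return immediately, leaving the heavy work to the scheduler thread.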
From: Andrew P. <at...@pi...> - 2014-10-04 07:10:58
On Fri, Oct 03, 2014 at 05:20:48PM +0200, Maurizio Martignano wrote:
> Subject: Re: Naviserver hangs on Windows
> The Windows-OpenACS distribution which I make available here (http://www.spazioit.com/pages_en/sol_inf_en/windows-openacs_en/) is based on AOLServer 4.5.2, contains the sources, is compiled with Visual Studio 2013 and runs on Windows 64.

Indeed, the first thing I noticed is that Maurizio's Windows build seems to use nsconfig.tcl in place of the Unix configure/autoconf, which Jim Davidson added back in 2005. Naviserver folks, can you comment on what you think of that approach? Does Naviserver not include it solely because it wasn't included in the AOLserver 4.0.10 branch that Naviserver forked from?

https://bitbucket.org/aolserver/aolserver/src/2aa0f24395ae0a42fd590f3635e4fdf2c7eefdfd/nsconfig.tcl?at=default

andy@milo:/home/nobackup/co/nsd-aol/aolserver-head-hg$ hg log nsconfig.tcl configure.tcl
changeset:   1370:76895d8bf843
user:        Jim Davidson <jgd...@ao...>
date:        Thu Aug 18 21:48:21 2005 +0100
summary:     Renamed configure script configure.tcl

changeset:   1367:9c88ea73dcad
user:        Jim Davidson <jgd...@ao...>
date:        Wed Aug 17 23:55:57 2005 +0100
summary:     Updates to new build tools to support Unix

changeset:   1360:697679717350
user:        Jim Davidson <jgd...@ao...>
date:        Wed Aug 17 22:18:46 2005 +0100
summary:     New platform-indepedent build support

-- Andrew Piskorski <at...@pi...>
From: Brad C. <br...@ch...> - 2014-09-05 21:04:56
Brilliant. Thanks much

On 9/5/14, 3:41 PM, Mat Kovach wrote:
>> On the old AOLserver, under modules, there is nssession. But I can't seem to get it. cvs seems not to be working: [quoted message trimmed; see Brad's post of 2014-09-05 18:49:01 below]
>
> $ cvs -z3 -d:pserver:ano...@ao...:/cvsroot/aolserver co -P nssession
> cvs checkout: Updating nssession
> U nssession/Makefile
> U nssession/Makefile.nssession
> U nssession/Makefile.nssessionmanager_file
> U nssession/README
> U nssession/config.tcl
> U nssession/nshash.c
> U nssession/nshash.h
> U nssession/nssession.c
> U nssession/nssession.h
> U nssession/nssessionmanager_file.c
> cvs checkout: Updating nssession/examples
> U nssession/examples/counter.adp
> U nssession/examples/guess.adp
>
> http://sourceforge.net/projects/aolserver/
>
> At least, that is where I still tend to yank my code from.
>
> /mek

--
BRAD CHICK
Brad@ChickCentral.com
734.662.1701 (h) 734.646.9372 (m)
"Make Some Time for Wasting!"
From: Jeff R. <dv...@di...> - 2014-09-05 19:37:04
Brad Chick wrote:
> On the old AOLserver, under modules, there is nssession. But I can't seem to get it. cvs seems not to be working:
>
> [brad@repos ~]$ cvs -d :pserver:ano...@cv...:/home/cvs co -d nssession aolserver/nssession
> cvs [checkout aborted]: connect to [cvs.panoptic.com]:2401 failed: Protocol not available
>
> Does anyone have that module lying around that you could send along?

Try aolserver.cvs.sourceforge.net rather than cvs.panoptic.com - Dossy might have finally gotten rid of his copies of old cvs stuff.

-J
From: Brad C. <br...@ch...> - 2014-09-05 18:49:01
On the old AOLserver, under modules, there is nssession. But I can't seem to get it. cvs seems not to be working:

[brad@repos ~]$ cvs -d :pserver:ano...@cv...:/home/cvs co -d nssession aolserver/nssession
cvs [checkout aborted]: connect to [cvs.panoptic.com]:2401 failed: Protocol not available

Does anyone have that module lying around that you could send along?

Thanks

--
BRAD CHICK
Brad@ChickCentral.com
734.662.1701 (h) 734.646.9372 (m)
"Make Some Time for Wasting!"
From: Jeff R. <dv...@di...> - 2014-08-28 21:41:39
Do you happen to have a custom error page for 503 errors (service unavailable)?

I'm looking over the code for places where a connection might be dropped without sending anything, and although this path doesn't seem to do that, it does look possible for it to do other bad things (i.e., crash). It could correlate with a quick burst of requests from a single user (loading all the images for a page, for example). I'd expect an outright crash if this was the case, which would probably be more noticeable. However, crashes can be funny things and there's a chance that it could do something entirely unexpected.

-J

Cyan Ogilvie wrote:
> On Wed, Aug 27, 2014 at 9:34 PM, Torben Brosten <to...@de...> wrote:
> There is usually a cluster of up to about 6 requests that fail in this way for a given server in the space of around 2 seconds, with long intervals between the clusters where no failures happen.
From: Gustaf N. <ne...@wu...> - 2014-08-28 14:30:23
Dear Cyan,

you seem to have investigated the problem in detail. We have been using aolserver and naviserver for many years with heavy traffic (the last two years just naviserver), but we have not seen this problem. Some questions pop up:

- you say you have a legacy application: did the problem show up just recently, or did you become aware of it just recently?
- can you reproduce the problem on a bare-metal machine?
- do you have the option to replace haproxy by nginx in front of aolserver? nginx can use persistent connections for the backend traffic (we could reduce the number of connections to naviserver by orders of magnitude), and nginx has various retry options. Maybe haproxy has the same; I have not worked with it.

-gustaf neumann

Am 28.08.14 13:51, schrieb Cyan Ogilvie:
> On Wed, Aug 27, 2014 at 9:34 PM, Torben Brosten <to...@de...> wrote:
>> On 08/27/2014 12:02 PM, Cyan Ogilvie wrote:
>>> There doesn't seem to be a pattern to the failing requests, sometimes it's small static files like favicon.ico, but mostly not (although in our case we're not using fastpath for that - different favicons are served based on the request context). At the moment I'm leaning towards some sort of corrupted connection thread state - the failures tend to cluster by time, server, user - so that, although the failures are exceedingly rare overall (220 yesterday), it's often the case that a given user will have to reload a page several times before they get a successful response. The servers are fronted by haproxy which will tend to send a given session back to the same server.
>>
>> Have you ruled out a router issue, such as from ipv4 exhaustion or localized network flooding?
>
> I'm pretty sure it's not network related at this stage. To test this I built a man-in-the-middle relay listening on port 8008 running on the same server as nsd, which forwards all connection traffic to 127.0.0.1:80 and records the events and data it sees flowing in both directions, and a packet trace using tcpdump of both the requests arriving on eth0 and the relayed traffic on lo. An iptables nat prerouting rule DNATs connections coming in on eth0 port 80 to the relay's port 8008. Another process watches the haproxy logs for indications of a failed request and retrieves a dump of the ring buffers from the relay and saves them for later analysis.
>
> Here is the packet trace of the relay -> nsd traffic on the lo interface for a typical event (as captured by tcpdump, which indicated that no packets had been dropped):
>
> Time       Prot  Len   Info
> 34.750440  TCP   76    47576 > http [SYN] Seq=0 Win=32792 Len=0 MSS=16396 SACK_PERM=1 WS=32
> 34.750465  TCP   76    http > 47576 [SYN, ACK] Seq=0 Ack=1 Win=32768 Len=0 MSS=16396 SACK_PERM=1 WS=32
> 34.750479  TCP   68    47576 > http [ACK] Seq=1 Ack=1 Win=32800 Len=0
> 34.750720  TCP   4412  [TCP segment of a reassembled PDU]
> 34.750756  TCP   68    http > 47576 [ACK] Seq=1 Ack=4345 Win=32768 Len=0
> 34.751274  HTTP  439   GET /item/531138-RR-1012/Elgin-Art-Deco-Dial-Pocket-Watch HTTP/1.1
> 34.751295  TCP   68    http > 47576 [ACK] Seq=1 Ack=4716 Win=32768 Len=0
> 34.751377  TCP   68    http > 47576 [FIN, ACK] Seq=1 Ack=4716 Win=32768 Len=0
> 34.751492  TCP   68    47576 > http [FIN, ACK] Seq=4716 Ack=2 Win=32800 Len=0
> 34.751515  TCP   68    http > 47576 [ACK] Seq=2 Ack=4717 Win=32768 Len=0
>
> The connection reaches the ESTABLISHED state and the HTTP request data is sent to nsd, which is acked. Then 0.8 ms later the connection is closed by nsd. The first Tcl code that should execute for this request is a preauth filter, which starts by writing a log containing [ns_conn request]. That log message isn't reached in cases like this. In this example the request is quite large (around 4.7 KB) because of some large cookies, but the same pattern happens for requests of around 400 bytes.
>
> There appears to be a less common failure mode where the request processing happens normally and the Tcl code generates a normal response (HTTP code 200), but the response data never hits the network. The network trace looks the same as the example I gave above, except that the time between the ACK of the GET request and nsd's FIN, ACK is longer - around 20 - 70 ms, which is in line with the normal times for a successful request. I haven't yet caught the packet trace between the relay and nsd for this case (only the trace on eth0 between haproxy and the relay), so I'm not 100% certain of my interpretation of this failure mode yet.
>
> There is usually a cluster of up to about 6 requests that fail in this way for a given server in the space of around 2 seconds, with long intervals between the clusters where no failures happen.
>
> Cyan

--
Univ.Prof. Dr. Gustaf Neumann
WU Vienna
Institute of Information Systems and New Media
Welthandelsplatz 1, A-1020 Vienna, Austria
From: Cyan o. <cy...@ru...> - 2014-08-28 11:51:18
|
On Wed, Aug 27, 2014 at 9:34 PM, Torben Brosten <to...@de...> wrote: > > On 08/27/2014 12:02 PM, Cyan ogilvie wrote:> .. > > There doesn't seem to be a pattern to the failing requests, > > sometimes it's small static files like favicon.ico, but mostly not > > (although in> our case we're not using fastpath for that - different > favicons are > > served based on the request context). At the moment I'm leaning > > towards some sort of corrupted connection thread state - the failures > > tend to cluster by time, server, user - so that, although the failures > > are exceedingly rare overall (220 yesterday), it's often the case that > > a given user will have to reload a page several times before they get > > a successful response. The servers are fronted by haproxy which will > > tend to send a given session back to the same server. > > .. > > Have you ruled out a router issue, such as from ipv4 exhaustion or > localized network flooding? I'm pretty sure it's not network related at this stage. To test this I built a man-in-the-middle relay, listening on port 8008 on the same server as nsd, which forwards all connection traffic to 127.0.0.1:80 and records the events and data it sees flowing in both directions; I also capture a packet trace with tcpdump of both the requests arriving on eth0 and the relayed traffic on lo. An iptables nat PREROUTING rule DNATs connections coming in on eth0 port 80 to the relay's port 8008. Another process watches the haproxy logs for indications of a failed request, then retrieves a dump of the ring buffers from the relay and saves them for later analysis. 
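The capture setup described above can be sketched with commands along these lines. These are illustrative, not the exact commands used: the interface names and ports are taken from the message, and note that DNAT from an external interface to 127.0.0.1 additionally requires route_localnet on recent Linux kernels.

```shell
# Divert inbound port-80 traffic on eth0 to the relay on 8008
# (route_localnet is needed before DNAT can target loopback)
sysctl -w net.ipv4.conf.eth0.route_localnet=1
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 \
         -j DNAT --to-destination 127.0.0.1:8008

# Capture both sides: requests arriving on eth0, relayed traffic on lo
tcpdump -i eth0 -w eth0-port80.pcap 'tcp port 80' &
tcpdump -i lo   -w lo-relay.pcap    'tcp port 80' &
```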
Here is the packet trace of the relay -> nsd traffic on the lo interface for a typical event (as captured by tcpdump, which indicated that no packets had been dropped):

Time       Prot  Len   Info
34.750440  TCP     76  47576 > http [SYN] Seq=0 Win=32792 Len=0 MSS=16396 SACK_PERM=1 WS=32
34.750465  TCP     76  http > 47576 [SYN, ACK] Seq=0 Ack=1 Win=32768 Len=0 MSS=16396 SACK_PERM=1 WS=32
34.750479  TCP     68  47576 > http [ACK] Seq=1 Ack=1 Win=32800 Len=0
34.750720  TCP   4412  [TCP segment of a reassembled PDU]
34.750756  TCP     68  http > 47576 [ACK] Seq=1 Ack=4345 Win=32768 Len=0
34.751274  HTTP   439  GET /item/531138-RR-1012/Elgin-Art-Deco-Dial-Pocket-Watch HTTP/1.1
34.751295  TCP     68  http > 47576 [ACK] Seq=1 Ack=4716 Win=32768 Len=0
34.751377  TCP     68  http > 47576 [FIN, ACK] Seq=1 Ack=4716 Win=32768 Len=0
34.751492  TCP     68  47576 > http [FIN, ACK] Seq=4716 Ack=2 Win=32800 Len=0
34.751515  TCP     68  http > 47576 [ACK] Seq=2 Ack=4717 Win=32768 Len=0

The connection reaches the ESTABLISHED state and the HTTP request data is sent to nsd, which is acked. Then 0.8 ms later the connection is closed by nsd. The first Tcl code that should execute for this request is a preauth filter, which starts by writing a log containing [ns_conn request]. That log message isn't reached in cases like this. In this example the request is quite large (around 4.7 KB) because of some large cookies, but the same pattern happens for requests of around 400 bytes.

There appears to be a less common failure mode where the request processing happens normally and the Tcl code generates a normal response (HTTP code 200), but the response data never hits the network. The network trace looks the same as the example I gave above, except that the time between the ACK of the GET request and nsd's FIN, ACK is longer - around 20-70 ms, which is in line with the normal times for a successful request. 
I haven't yet caught the packet trace between the relay and nsd for this case (only the trace on eth0 between haproxy and the relay), so I'm not 100% certain of my interpretation of this failure mode yet. There is usually a cluster of up to about 6 requests that fail in this way for a given server in the space of around 2 seconds, with long intervals between the clusters where no failures happen. Cyan |
From: Torben B. <to...@de...> - 2014-08-27 19:54:16
|
On 08/27/2014 12:02 PM, Cyan ogilvie wrote:> .. > There doesn't seem to be a pattern to the failing requests, > sometimes it's small static files like favicon.ico, but mostly not > (although in> our case we're not using fastpath for that - different favicons are > served based on the request context). At the moment I'm leaning > towards some sort of corrupted connection thread state - the failures > tend to cluster by time, server, user - so that, although the failures > are exceedingly rare overall (220 yesterday), it's often the case that > a given user will have to reload a page several times before they get > a successful response. The servers are fronted by haproxy which will > tend to send a given session back to the same server. > .. Have you ruled out a router issue, such as from ipv4 exhaustion or localized network flooding? |
From: Cyan o. <cy...@ru...> - 2014-08-27 19:30:54
|
On Tue, Aug 26, 2014 at 11:05 PM, Tony Bennett (Brown Paper Tickets) <to...@br...> wrote: > Does this only happen under heavy load? You might be able to adjust your > init.tcl to handle more connections. I'm not sure what you would need to > change so I pulled out any config that might help. Thanks, the current config has been minimally modified from what was being used for 3.4, so there are a few settings in your example that we're not explicitly setting and that I'm not familiar with; they're good leads to follow. Cyan |
From: Cyan o. <cy...@ru...> - 2014-08-27 19:02:15
|
On Wed, Aug 27, 2014 at 12:29 AM, Jeff Rogers <dv...@di...> wrote: > > Do you know if this connection dropping happens mostly when there is a lot of activity or more frequently when there is very low activity? > > I recall a few edge cases in the thread pooling where a thread would in some circumstances wait until another connection came in before running, and there might have been a related case where a connection could get dropped. IIRC, these both happened generally when there was low traffic (or more specifically low concurrent traffic). Playing with maxconns might diminish the problem in this case.

There doesn't seem to be a correlation between the site load and the connection drops. We never really see very low activity - the design of the site means there are hundreds of thousands of pages, so we're constantly being crawled by every bot ever spawned (we serve about 8GB daily to googlebot alone). But we use fairly small Amazon EC2 instances that are primarily memory limited, so the config is tuned to reach a memory high-water mark of around 700 - 800 MB, which, based on the load testing I did, works out to maxthreads of 6 and maxconnections of 50 (modules/tcl/pools.tcl seems to take the maxconnections ns_param as the ns_pools -maxconns parameter). The project was most recently moved from version 3.4 / Tcl 7.6 to 4.5 / Tcl 8.6, so there are a lot of legacy bits and shims in place. I recall the maxconnections ns_param changing meaning between these versions, but it seems to be working as I would expect.

Server load tends to vary between a loadavg of 0.4 and 2.0, typically around 0.6 (2 cores). Concurrency is pretty low: it seldom reaches 6, and usually sits at around 1 - 3, based on the monitoring we've currently got in place. I'm currently trying to reconstruct the exact thread / concurrency / request context for the connection drop events by parsing the log files; I'm hoping that might reveal some pattern in the failures. 
But so far they don't seem to correspond with connection lifecycle events. > You also mention favicon.ico; is it mostly or always that? It's notable for being a small static file, which could point to other causes, like a corrupt interpreter state as Peter suggested. Or there might be some weirdness with mmap if you have that enabled. There doesn't seem to be a pattern to the failing requests, sometimes it's small static files like favicon.ico, but mostly not (although in our case we're not using fastpath for that - different favicons are served based on the request context). At the moment I'm leaning towards some sort of corrupted connection thread state - the failures tend to cluster by time, server, user - so that, although the failures are exceedingly rare overall (220 yesterday), it's often the case that a given user will have to reload a page several times before they get a successful response. The servers are fronted by haproxy which will tend to send a given session back to the same server. > One other thought, can you switch to naviserver? The connection handling there has evolved somewhat differently (not to mention more recently) than aolserver, but programming-wise there are not a lot of differences. It's probably not out of the question if there is a strong argument to be made that it would fix the problem; we're taking quite a reputation hit at the moment. I initially attempted to make the site work on naviserver since that seemed to be more active, but I ran into problems with the nsdb / nsdbi change and segfaults when I tried to get nsdb working on it. It's also a 15-year-old code base that seemed to be quite sensitive to the small config and api changes in naviserver, and the port from 3.4 / Tcl 7.6 was tricky enough as it was (encoding issues, list parsing differences, regexp syntax, etc.) that the call was made to go with AOLserver 4.5 instead to minimize the changes required. 
But the site has been running well on the new version since March / April, so porting to naviserver should be feasible, but I'd need to make a very strong case. Cyan |
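For reference, the pool limits Cyan describes (maxthreads 6, maxconnections 50, wired through modules/tcl/pools.tcl) would be set along these lines. This is a sketch from memory of the AOLserver 4.5 config and ns_pools API, so check the parameter and option names against your tree:

```tcl
# Hedged sketch of the pool sizing described above (AOLserver 4.5).
ns_section "ns/server/${server_name}"
ns_param   maxthreads      6    ;# chosen via load testing for a ~700-800 MB RSS ceiling
ns_param   maxconnections 50    ;# per the message, mapped by modules/tcl/pools.tcl
                                ;# onto the ns_pools -maxconns option

# Roughly the runtime call pools.tcl ends up making:
ns_pools set default -maxthreads 6 -maxconns 50
```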
From: Jeff R. <dv...@di...> - 2014-08-26 22:51:31
|
Hi Cyan, Yes, there are still a few subscribers. Do you know if this connection dropping happens mostly when there is a lot of activity or more frequently when there is very low activity? I recall a few edge cases in the thread pooling where a thread would in some circumstances wait until another connection came in before running, and there might have been a related case where a connection could get dropped. IIRC, these both happened generally when there was low traffic (or more specifically low concurrent traffic). Playing with maxconns might diminish the problem in this case. You also mention favicon.ico; is it mostly or always that? It's notable for being a small static file, which could point to other causes, like a corrupt interpreter state as Peter suggested. Or there might be some weirdness with mmap if you have that enabled. One other thought, can you switch to naviserver? The connection handling there has evolved somewhat differently (not to mention more recently) than aolserver, but programming-wise there are not a lot of differences. -J Cyan ogilvie wrote: > Hi > > I'm hoping there are still some subscribers to this list ;) > > I'm trying to debug a strange condition we're seeing on a small > percentage of our connections: connections are being closed by the > server without any response being sent back on the connection (verified > by looking at network packet traces and inserting a logging transparent > proxy between the client and server). The network packet pattern we see is: > > <normal TCP setup - SYN, SYN/ACK, ACK> > > Request data (in a single frame, or multiple), ACK > > Then the connection is closed by the server after 10 - 70 ms, without > any data being sent, with a FIN/ACK (still getting confirmation on this > - these logs are from the other side of the man-in-the-middle proxy I'm > using to get debugging info). 
> > For some of the failed requests, the server processing never gets as far > as the start of our Tcl code (a preauth filter that starts with an > ns_log that doesn't show up in the server log). > > For others the request is processed normally and an access.log message > written indicating that a response was generated with HTTP code 200, but > no packet shows up on the network. > > There is no pattern to the failed requests (sometimes requests for > favicon.ico fail), and retrying the exact request shortly afterwards > often succeeds. > > Has anyone seen anything like this before, or have any advice on how to > narrow down the cause further? > > We're running a slightly patched version of the last 4.5.2 rc, on Ubuntu > 12.04.5 64bit on Amazon EC2 instances, with Tcl 8.6.1 > > Thanks > > Cyan > > > ------------------------------------------------------------------------------ > Slashdot TV. > Video for Nerds. Stuff that matters. > http://tv.slashdot.org/ > > > > _______________________________________________ > aolserver-talk mailing list > aol...@li... > https://lists.sourceforge.net/lists/listinfo/aolserver-talk > |
From: Tony B. (B. P. Tickets) <to...@br...> - 2014-08-26 21:29:06
|
Does this only happen under heavy load? You might be able to adjust your init.tcl to handle more connections. I'm not sure what you would need to change so I pulled out any config that might help. -Tony

ns_section "ns/parameters"
ns_param listenbacklog 32 ;# Max length of pending conn queue
# keepalivetimeout
# In prior versions of AOLserver this may have been under
# ns/servers/$server. Set to 0 to disable Keep-alive support, any other
# value will enable Keep-alive but won't actually configure the timeout,
# which is now configured under the driver.
# (http://aolserver.am.net/docs/tuning.adpx)
ns_param keepalivetimeout 30 ;# Max time conn is kept alive (keepalive)
# maxkeepalive
# In prior versions of AOLserver this may have been under
# ns/servers/$server. In AOLserver 4.0 and newer, this setting exists
# but appears to be non-operational
# (http://aolserver.am.net/docs/tuning.adpx)
ns_param maxkeepalive 100 ;# Max no. of conns in keepalive state

ns_section "ns/server/${server_name}"
ns_limits set default -maxrun 500 ;# the maximum number of connections which can be running simultaneously
### Tuning Options ###
ns_param connsperthread 0 ;# Normally there's one conn per thread
#ns_param flushcontent false ;# Flush all data before returning
ns_param maxconnections 100 ;# Max connections per connection thread before it is shut down
#ns_param maxdropped 0 ;# Shut down if dropping too many conns
ns_param maxthreads 20 ;# Tune this to scale your server
ns_param minthreads 0 ;# Tune this to scale your server
ns_param threadtimeout 120 ;# Idle timeout for connection threads
ns_param spread 20 ;# Variance factor for threadtimeout and maxconnections to prevent mass mortality of threads (e.g. +-20%)

ns_section "ns/server/${server_name}/module/nssock"
ns_param Address $address
ns_param Hostname $host
ns_param Port $port
ns_param maxsock 500
ns_param keepwait 30 ;# number of seconds to hang-up on clients while waiting for connection
ns_param socktimeout 30 ;# number of seconds to wait for a client request. increase for file upload
ns_param maxinput 1MB ;# maximum size of data sent from browser
ns_param maxline 16k ;# maximum number of bytes for http request or header line
ns_param maxheader 64k ;# maximum number of bytes for all HTTP header lines in a request

########################################################################
# Thread library (nsthread) parameters
#
# If the server is crashing with no explanation you may have corrupted
# data due to a stack overflow. Calculate locally declared data types
# and function parameters to get the stack size needed.
########################################################################
ns_section "ns/threads"
#ns_param mutexmeter true ;# measure lock contention
#ns_param stacksize [expr 128*1024] ;# stack size per thread (in bytes)

On 8/26/14, 9:05 AM, Cyan ogilvie wrote: > Hi > > I'm hoping there are still some subscribers to this list ;) > > I'm trying to debug a strange condition we're seeing on a small > percentage of our connections: connections are being closed by the > server without any response being sent back on the connection > (verified by looking at network packet traces and inserting a logging > transparent proxy between the client and server). The network packet > pattern we see is: > > <normal TCP setup - SYN, SYN/ACK, ACK> > > Request data (in a single frame, or multiple), ACK > > Then the connection is closed by the server after 10 - 70 ms, without > any data being sent, with a FIN/ACK (still getting confirmation on > this - these logs are from the other side of the man-in-the-middle > proxy I'm using to get debugging info). 
> > For some of the failed requests, the server processing never gets as > far as the start of our Tcl code (a preauth filter that starts with an > ns_log that doesn't show up in the server log). > > For others the request is processed normally and an access.log message > written indicating that a response was generated with HTTP code 200, > but no packet shows up on the network. > > There is no pattern to the failed requests (sometimes requests for > favicon.ico fail), and retrying the exact request shortly afterwards > often succeeds. > > Has anyone seen anything like this before, or have any advice on how > to narrow down the cause further? > > We're running a slightly patched version of the last 4.5.2 rc, on > Ubuntu 12.04.5 64bit on Amazon EC2 instances, with Tcl 8.6.1 > > Thanks > > Cyan > > > ------------------------------------------------------------------------------ > Slashdot TV. > Video for Nerds. Stuff that matters. > http://tv.slashdot.org/ > > > _______________________________________________ > aolserver-talk mailing list > aol...@li... > https://lists.sourceforge.net/lists/listinfo/aolserver-talk |
From: Peter S. <f_p...@ho...> - 2014-08-26 19:32:55
|
Not sure if this will help, but.... If you have a preauth filter (possibly other filters too) which does an ns_adp_abort (and/or possibly ns_adp_break) within the filter, or within a proc called from the filter, then the next time that thread is used the state will be "abort" and will end the connection. You can put these lines at the start of your pre_auth filter:

ns_adp_exception state
ns_log notice $state

If you find states of "abort" in your log, do this as a simple hack to reset the state:

ns_adp_exception state
if {$state=="abort" || $state=="break"} {
    catch {ns_adp_return}
    set foobar [ns_adp_parse "Hello"]
}

It may be a place to start looking, but if you have an ns_log in your preauth which is not logging then I would guess the issue is somewhat different. But anyone who may use an ns_adp_abort or ns_adp_break from within a filter should be aware of this. I believe using ns_adp_abort in a filter before version 4 worked fine; since 4.0 or 4.5 it causes this abort-state issue. I don't remember exactly - it was a long time ago that I upgraded and had to figure that out.

_Peter

Date: Tue, 26 Aug 2014 18:05:31 +0200 From: cy...@ru... To: aol...@li... Subject: [AOLSERVER] Null responses Hi I'm hoping there are still some subscribers to this list ;) I'm trying to debug a strange condition we're seeing on a small percentage of our connections: connections are being closed by the server without any response being sent back on the connection (verified by looking at network packet traces and inserting a logging transparent proxy between the client and server). The network packet pattern we see is: <normal TCP setup - SYN, SYN/ACK, ACK> Request data (in a single frame, or multiple), ACK Then the connection is closed by the server after 10 - 70 ms, without any data being sent, with a FIN/ACK (still getting confirmation on this - these logs are from the other side of the man-in-the-middle proxy I'm using to get debugging info). 
For some of the failed requests, the server processing never gets as far as the start of our Tcl code (a preauth filter that starts with an ns_log that doesn't show up in the server log). For others the request is processed normally and an access.log message written indicating that a response was generated with HTTP code 200, but no packet shows up on the network. There is no pattern to the failed requests (sometimes requests for favicon.ico fail), and retrying the exact request shortly afterwards often succeeds. Has anyone seen anything like this before, or have any advice on how to narrow down the cause further? We're running a slightly patched version of the last 4.5.2 rc, on Ubuntu 12.04.5 64bit on Amazon EC2 instances, with Tcl 8.6.1 Thanks Cyan ------------------------------------------------------------------------------ Slashdot TV. Video for Nerds. Stuff that matters. http://tv.slashdot.org/ _______________________________________________ aolserver-talk mailing list aol...@li... https://lists.sourceforge.net/lists/listinfo/aolserver-talk |
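Peter's state-reset hack, assembled into a complete preauth filter for illustration. The proc name and URL pattern are invented; the ns_adp_parse of a dummy string is the trick that clears the lingering abort state, per the message above:

```tcl
# Illustrative sketch only: register a preauth filter that detects and
# clears a lingering "abort"/"break" ADP state left by a previous request.
ns_register_filter preauth GET /* reset_adp_state

proc reset_adp_state {why} {
    ns_adp_exception state
    ns_log notice "adp state on entry: $state"
    if {$state eq "abort" || $state eq "break"} {
        catch {ns_adp_return}
        set dummy [ns_adp_parse "Hello"]   ;# parsing anything resets the state
    }
    return filter_ok
}
```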
From: Cyan o. <cy...@ru...> - 2014-08-26 16:21:50
|
Hi I'm hoping there are still some subscribers to this list ;) I'm trying to debug a strange condition we're seeing on a small percentage of our connections: connections are being closed by the server without any response being sent back on the connection (verified by looking at network packet traces and inserting a logging transparent proxy between the client and server). The network packet pattern we see is: <normal TCP setup - SYN, SYN/ACK, ACK> Request data (in a single frame, or multiple), ACK Then the connection is closed by the server after 10 - 70 ms, without any data being sent, with a FIN/ACK (still getting confirmation on this - these logs are from the other side of the man-in-the-middle proxy I'm using to get debugging info). For some of the failed requests, the server processing never gets as far as the start of our Tcl code (a preauth filter that starts with an ns_log that doesn't show up in the server log). For others the request is processed normally and an access.log message written indicating that a response was generated with HTTP code 200, but no packet shows up on the network. There is no pattern to the failed requests (sometimes requests for favicon.ico fail), and retrying the exact request shortly afterwards often succeeds. Has anyone seen anything like this before, or have any advice on how to narrow down the cause further? We're running a slightly patched version of the last 4.5.2 rc, on Ubuntu 12.04.5 64bit on Amazon EC2 instances, with Tcl 8.6.1 Thanks Cyan |
From: Jeff R. <dv...@di...> - 2014-06-24 08:51:39
|
Ok, I've been sitting on this response for far too long now, I may as well just send it out, perfection be damned. That was probably my thread on "4.6 and beyond" you mentioned. Lots of energy ... then life happened, and time vanished. I'd still love to make things happen, I'm just short on energy right now. I don't think the existing userbase has particularly held anyone back from making changes, bold or otherwise. After all, people can always not upgrade. (According to the reports, a significant portion of aolserver users are still using version 4.0 or even 3). So if you want to make changes - make them! The wonderful thing about source control is that you get to keep your old stuff around. About source control - this isn't the first time someone has suggested using something else. Frankly, the strongest arguments for moving away from CVS and to something else came from Chris Tsai at SF support. To paraphrase, "cvs hosting is absolutely awful to support for a lot of reasons, and needs regular maintenance windows. Pretty much anything else is better." The SCM system in use is not going to attract developers that would not have been otherwise interested, IMHO. That's not to say a change would hurt, but my strongest inclination at this point would be to move to svn or hg. (or fossil, but that seems to be even less mainstream than cvs, plus SF doesn't support it). If there's an overwhelming outcry of voices demanding a move, that increases the motivation. "Overwhelming" here would mean 2 or 3 :) Until then (or after) if you have any patches to submit, by all means send them and I'll be happy to take a look, and most likely merge it in - I've mellowed somewhat since the last time a patch was shared (sorry John) All this said, NaviServer *IS* much more active these days, and lacks the "smirk factor" that the name "AOLserver" carries. And a lot of the goals you mention (in particular, code cleanup) are regularly undertaken by Gustaf. WRT Maurizio's comments: > 1. 
social media are making legacy CMS and standard web sites less and less > important > 2. web and mobile application are moving towards architectures with rich > clients (e.g. html5 based like SensaTouch, Oracle ADF, SAP SUP) and these > architectures are moving away from the legacy web application and > development model offered by Aolserver/Naviserver - OpenACS. Legacy CMS and standard web sites are by no means going away. Even the html5-iest site needs a server behind it, and even if most of the display logic is on the client in the javascript framework of the week if you want to be able to persist and share your updates the best way is still to have a database backing it up. What is getting less important is the ADP programming style of interleaving logic and layout. acs-templating is a great approach to this tho, and there will continue to be a need to take data from a database and put it into some deliverable form. Which raises a few questions with suggestions of projects around them. The data format of choice these days seems to be json, with xml still being a significant player. So, how is our support for those things? XML - tdom works very well with aolserver, and is better than most xml handling anywhere, regardless of programming language. (It doesn't do XQuery, but I can live with that). My only gripe is that it's a standard tcl package rather than an aolserver module, but that's a difference hardly worth quibbling over. JSON - not so much. At least, not that people have talked about. There is a good json library available in yajl-tcl (aside from the array-list mapping that plagues all the tcl-json libraries), which I have every expectation would work well in aolserver, but I haven't tested. Plus it could probably benefit from integration with the native database operators, rather than only working with postgres handles. Anyone interested in putting together the pieces here? 
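As a concrete illustration of the tdom point above, a minimal sketch of parsing XML and walking it with XPath from Tcl (the document contents and element names are invented for illustration):

```tcl
package require tdom

# Parse a small XML document and walk it with an XPath query.
set doc  [dom parse {<catalog><item id="1">Pocket Watch</item></catalog>}]
set root [$doc documentElement]
foreach node [$root selectNodes //item] {
    puts "[$node @id]: [$node text]"   ;# attribute access and text content
}
$doc delete   ;# release the DOM tree
```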
Speaking of databases, one of aolserver's biggest strengths has always been its database connectivity. Granted, in the early days that was because Oracle took 5 minutes and 5Mb for a connection so pooling was a huge win when no one else was doing it, but it's still a good thing. But that's about relational databases; what about these fancy newfangled no-ess-kyoo-ell databases that I've been hearing so much about (in between yelling at kids to stay off my lawn)? Some of them have purely REST apis (e.g., couchdb) while others have C apis that might benefit from ns_db style pooling, or might just benefit from having a nice interface (mongodb, redis). They may or may not have a query language, but the bdb driver shows a way with that. Anyone interested in writing drivers for these? (BTW: I *WILL* fix the sqlite driver soon - I promise! Although, since no one has complained about it, I doubt anyone is actually using it, which is a shame.) There's other cool stuff that could be integrated too. There's lots of interest in making CSS not suck so much, so there are things like LESS and SASS. Having tight integration with either of those would be nice. Or a similar flavor, based around a high-performance C-coded css parser, perhaps. Any interest in taking this on? Then there's programming languages. Don't get me wrong, I love tcl. But there are other things out there, and more importantly, applications written in other stuff out there. So why not take spidermonkey and write a module for that so we could run javascript on the server? (hey, why not run node.js on aolserver and be able to use multiple cores?) SPDY and HTTP/1.1 are good goals too. So ... All of the above are (I think) interesting projects. But they don't mean a thing unless someone is using them. OpenACS is great in that regard: it is a more or less complete application. But there are other things that can be done also, and I think it's restrictive to think of aolserver as just the engine to power OpenACS. 
I guess what this all comes down to is a few key questions: 1: what do you want to build? 2: what is needed to help you build that better/faster/stronger? 3: why aren't we doing it? We need an answer to #1, get #3 out of the way, and then #2 is the way forward. -J Ayan George wrote: > On 05/11/2014 10:50 AM, Dossy Shiobara wrote: >> On 5/10/14 9:38 PM, Ayan George wrote: >>> aolserver-talk has been quiet for a while. Has the discussion >>> moved somewhere else? >> Not that I'm aware of ... but, it's possible there's some >> NaviServer-focused list that has much more activity than this one. >> > > First, sorry for the rambly email. > > I remember a really encouraging thread titled "Roadmap - 4.6 and beyond" > but nothing seemed to come of it. > > Perhaps now that usage is fairly low it is time to start making bold > changes to improve and modernize the code AND to attract new developers > and users. > > I don't think inertia or alienating users would be a valid argument > against changes at this point. > > This is a good time to set some hard goals, delegate tasks, etc. This > time, however, maybe Dossy can suss out what seems worthwhile and make > it happen? > > Personally, I would like to see the following: > > * Define the scope and goals of AOLserver. Personally, I'd like > AOLserver to be a high performance, programmable, TCL based web server > for Unix-like operating systems. > > * Identify subsystems and assign a maintainer to each. Initially this > may be the same person or small group but the goal would be to delegate > ownership. > > * Completely commit to a traditional git workflow. Accept patches on > the -talk submitted using list for review. Have subsystem maintainers > apply patches. Dossy then pulls from each maintainer's tree to cut a > release. > > * Drop Windows support unless it is exceedingly easy to do. Right now > there are 86 #ifdef _WIN32 instances in the code -- IMHO, there should > be 0. 
> Why worry about alienating an incredibly small (maybe
> non-existent) sliver of an already tiny community?
>
> * Allow, encourage, maybe insist upon more modern programming
> techniques. There is a lot of classic C89 or pre-C89 code in AOLserver.
> Many open source projects balk at using C99 features citing compiler
> support and developer familiarity, but C99 is about 15 years old now and
> well supported. I think AOLserver would be more attractive if we
> explicitly required C99 features that can improve the quality of the
> code (inline declarations, declarations in for() loops, restrict
> pointers, etc.).
>
> A simple goal like converting for() loops to C99 in-loop declarations or
> adding restrict keywords where useful would give developers an
> opportunity to get familiar with the code without doing any heavy lifting.
>
> * Modernize the main event loop. Perhaps use libevent for socket
> multiplexing and thread dispatch. This will allow it to take advantage
> of superior multiplexing techniques like kqueue() and epoll().
>
> * SPDY, HTTP 1.1, HTTP 2.0?
>
> * Support for deferred accepts (FreeBSD AcceptFilters, Linux
> TCP_DEFER_ACCEPT).
>
> * Ongoing code clean-up, feature addition, optimization, and bug fixes.
>
> Bottom line though is that I'm not sure if we should be afraid to break
> anything. I don't think anyone will notice. :^)
>
> -ayan
|
From: Maurizio M. <Mau...@sp...> - 2014-05-11 21:42:23
|
Dear Ayan,

It has been quite some time since the AOLserver and NaviServer communities joined their efforts, and all new development nowadays occurs mostly on NaviServer. I believe you should extend your list of "desiderata" to this list: https://lists.sourceforge.net/lists/listinfo/naviserver-devel.

As for the future of AOLserver/NaviServer, I see their evolution as very connected to that of OpenACS. I think that the union AOLserver/NaviServer - OpenACS historically was two different things:

1. a feature-rich platform for CMS and e-learning sites
2. an interesting and very productive/powerful platform to develop web applications

But things have changed and keep changing:

1. social media are making legacy CMS and standard web sites less and less important
2. web and mobile applications are moving towards architectures with rich clients (e.g. HTML5-based, like Sencha Touch, Oracle ADF, SAP SUP), and these architectures are moving away from the legacy web application and development model offered by AOLserver/NaviServer - OpenACS.

Hope it helps,
Maurizio

PS: About deleting Windows support in AOLserver, I would like to note that perhaps the biggest user of AOLserver/OpenACS is still ]project-open[ and it might be interesting for you to look at these numbers: http://sourceforge.net/projects/project-open/files/project-open/V4.0/

-----Original Message-----
From: Ayan George [mailto:ay...@ay...]
Sent: 11 May 2014 19:38
To: aol...@li...
Subject: Re: [AOLSERVER] Roadmap Revisited (was Re: Is there anyone out there?)

On 05/11/2014 01:35 PM, Ayan George wrote:
> * Completely commit to a traditional git workflow. Accept patches on
> the -talk submitted using list for review. Have subsystem maintainers
> apply patches. Dossy then pulls from each maintainer's tree to cut a
> release.
>

Eh -- this bullet point is mangled. I meant accept patches submitted to -talk using git format-patch.
|
From: Ayan G. <ay...@ay...> - 2014-05-11 17:38:39
|
On 05/11/2014 01:35 PM, Ayan George wrote:
> * Completely commit to a traditional git workflow. Accept patches
> on the -talk submitted using list for review. Have subsystem
> maintainers apply patches. Dossy then pulls from each maintainer's
> tree to cut a release.
>

Eh -- this bullet point is mangled. I meant accept patches submitted to -talk using git format-patch.
|
From: Ayan G. <ay...@ay...> - 2014-05-11 17:35:48
|
On 05/11/2014 10:50 AM, Dossy Shiobara wrote:
> On 5/10/14 9:38 PM, Ayan George wrote:
>> aolserver-talk has been quiet for a while. Has the discussion
>> moved somewhere else?
> Not that I'm aware of ... but, it's possible there's some
> NaviServer-focused list that has much more activity than this one.
>

First, sorry for the rambly email.

I remember a really encouraging thread titled "Roadmap - 4.6 and beyond" but nothing seemed to come of it.

Perhaps now that usage is fairly low it is time to start making bold changes to improve and modernize the code AND to attract new developers and users.

I don't think inertia or alienating users would be a valid argument against changes at this point.

This is a good time to set some hard goals, delegate tasks, etc. This time, however, maybe Dossy can suss out what seems worthwhile and make it happen?

Personally, I would like to see the following:

* Define the scope and goals of AOLserver. Personally, I'd like AOLserver to be a high-performance, programmable, Tcl-based web server for Unix-like operating systems.

* Identify subsystems and assign a maintainer to each. Initially this may be the same person or small group, but the goal would be to delegate ownership.

* Completely commit to a traditional git workflow. Accept patches on the -talk submitted using list for review. Have subsystem maintainers apply patches. Dossy then pulls from each maintainer's tree to cut a release.

* Drop Windows support unless it is exceedingly easy to do. Right now there are 86 #ifdef _WIN32 instances in the code -- IMHO, there should be 0. Why worry about alienating an incredibly small (maybe non-existent) sliver of an already tiny community?

* Allow, encourage, maybe insist upon more modern programming techniques. There is a lot of classic C89 or pre-C89 code in AOLserver. Many open source projects balk at using C99 features citing compiler support and developer familiarity, but C99 is about 15 years old now and well supported. I think AOLserver would be more attractive if we explicitly required C99 features that can improve the quality of the code (inline declarations, declarations in for() loops, restrict pointers, etc.).

A simple goal like converting for() loops to C99 in-loop declarations or adding restrict keywords where useful would give developers an opportunity to get familiar with the code without doing any heavy lifting.

* Modernize the main event loop. Perhaps use libevent for socket multiplexing and thread dispatch. This will allow it to take advantage of superior multiplexing techniques like kqueue() and epoll().

* SPDY, HTTP 1.1, HTTP 2.0?

* Support for deferred accepts (FreeBSD AcceptFilters, Linux TCP_DEFER_ACCEPT).

* Ongoing code clean-up, feature addition, optimization, and bug fixes.

Bottom line though is that I'm not sure if we should be afraid to break anything. I don't think anyone will notice. :^)

-ayan
|
From: Dossy S. <do...@pa...> - 2014-05-11 15:17:02
|
On 5/10/14 9:38 PM, Ayan George wrote:
> aolserver-talk has been quiet for a while. Has the discussion moved
> somewhere else?

Not that I'm aware of ... but, it's possible there's some NaviServer-focused list that has much more activity than this one.

--
Dossy Shiobara | "He realized the fastest way to change
do...@pa... | is to laugh at your own folly -- then you
http://panoptic.com/ | can let go and quickly move on." (p. 70)

* WordPress * jQuery * MySQL * Security * Business Continuity *
|