Thread: [Cppcms-users] Front-end http server without upload buffering
From: Julian P. <ju...@wh...> - 2010-11-23 18:00:30
|
Hello,

I want to use a cppcms application in a production setting. Since using cppcms's HTTP API directly is discouraged, I need a FastCGI/SCGI-capable, lightweight web server that will handle the HTTP requests, serve static files and forward the requests to cppcms.

The problem is that two of the most prominent web servers, lighttpd and nginx, buffer POST uploads completely before passing them to the FastCGI backend, and neither project wants to implement a direct pass-through for POST bodies.

For me this is a problem, because the cppcms application in question is a firmware upload server. It has to process the firmware file during the upload, because the flash memory on the device this server runs on is very limited. If I understand it right, an upload would take at least 2x the size of the firmware file being uploaded, because lighttpd/nginx will store it and then pass it to the cppcms FastCGI backend, where cppcms stores it in a temporary file again. So for a file of e.g. 120M we would need at least 240M, but the drive only has the 120M for one copy of the firmware file.

Do you know of any lightweight web server that can pass the body through to cppcms without buffering? Or is the only solution to run cppcms in standalone mode as its own web server? What would the risks of doing so be?
Thanks for your answers,
Julian
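[Editor's sketch] The storage estimate in the message above can be written out as a few lines of C++. This is only the arithmetic from the message, not a measurement of lighttpd, nginx or cppcms; the function names are hypothetical and all sizes are in megabytes.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; sizes are in megabytes.

// Front end stores the whole POST body first, then the backend writes
// its own temporary copy: two copies exist at the same time.
std::uint64_t peak_fully_buffered(std::uint64_t file_mb)
{
    return 2 * file_mb;
}

// Front end forwards the body without storing it: only the backend's
// copy exists.
std::uint64_t peak_pass_through(std::uint64_t file_mb)
{
    return file_mb;
}

// True if the peak requirement fits on a drive of the given capacity.
bool fits(std::uint64_t capacity_mb, std::uint64_t needed_mb)
{
    return needed_mb <= capacity_mb;
}
```

With the numbers from the message, fits(120, peak_fully_buffered(120)) is false while fits(120, peak_pass_through(120)) is true, which is why a non-buffering front end matters on this device.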
From: Artyom <art...@ya...> - 2010-11-23 19:02:15
|
> Do you know of any lightweight webserver that will allow to passthrough
> the body to cppcms without buffering?

Have you tried Cherokee or Apache? (I don't know whether they do this or not.)

> Or is the only solution to run cppcms in standalone mode as its own webserver?
> Where'd be the risks to do so?

If your proxy sanitizes the HTTP input and handles timeouts it is fine. To be honest, even if it only handles timeouts I think there shouldn't be any problems.

Generally I don't think there should be any security holes in the internal HTTP server, with the exception of:

1. It does not handle timeouts, so it is very vulnerable to DoS attacks.
2. It supports only basic HTTP/1.0, and I'm not sure that its internal file server has good enough security checks.
3. It lacks SSL/HTTPS support.

So generally, if:

1. You run it behind a proxy that handles DoS and timeouts for you
2. You **do not** serve files (i.e. do not use the internal file server)
3. You do not require SSL, HTTP authentication or any other advanced feature
4. You do not require strict handling of various HTTP headers (i.e. combining several headers of the same type into one, as required by the HTTP specification)

Then it should be fine, as I don't think that other web servers do many more checks on the protocol itself than I do. However, I haven't done any stress testing of the HTTP server, and it is not a server that has been tested by a wide audience for security etc.

Bottom line:

1. If you use the HTTP protocol in a **trusted** environment it is OK. I mean, if your user has physical access to the device, I'm not sure how much protection you can provide at all, especially when you **do** upload code that will run on the device.
2. If you use it behind an HTTP proxy that fully sanitizes the input, and you don't serve files from the internal file server, it is OK.
3. If you use it behind a proxy with great care (make sure that timeouts are handled properly and you don't serve files), then it should be fine as well.

And finally, if you are really paranoid about security, define a chroot to some directory where you know the user can't do too much harm (CppCMS supports chrooting, but on the other hand you upload firmware, the code that will run, so it is meaningless).

Additional notice:
------------------

I do plan at some point in the future to make the HTTP server much more secure and to introduce HTTP/1.1, better file serving and probably even SSL (mostly as part of creating WebSockets support, which can't be used with the current FastCGI or SCGI protocols). But it is a lot of work that I have no time to do at this point.

Generally there is not a lot of work needed to make it production safe, but it mostly requires studying what has to be implemented in an HTTP web server to make it secure, i.e. what to handle and how.

Artyom
From: Julian P. <ju...@wh...> - 2010-11-23 19:46:23
|
OK, thanks for your very detailed reply.

The environment in which the HTTP server will be available is to be considered a trusted environment, so this would be OK. I wrote my own implementation of a static file server as a cppcms application, which should be secure enough (it does realpath expansion and checks the result against a configured web root to make sure that users don't access any files they shouldn't; the only thing is that it doesn't check for symlinks, but as I need them anyway, it would not make sense to check for them). The missing SSL support is a pity, but as it's a trusted environment this is OK for now.

I wrote to the Cherokee mailing list to ask whether uploads are buffered, because this isn't covered anywhere in their documentation and hasn't been a topic on the mailing list so far. I expect the answer to be yes, because Cherokee is implemented asynchronously, as lighttpd and nginx are, and there seems to be a need for that buffering to prevent the event loop from blocking.

Are my calculations concerning the disk usage correct? I think lighttpd stores uploads in chunks of a certain size (AFAIK 1M by default, but you can change it), so the question is whether it deletes the chunk files one by one after sending each of them to the backend application, or whether it deletes all of them after the transmission to the backend application has completed. In the first case I could use lighttpd, because I could spare a few megabytes for intermediate caching. I just don't know how well this behaviour would be tolerated by the flash memory, which has a limited number of write cycles for each of its blocks, so I see problems with too frequent (and in this case unnecessary) file I/O on it, as every firmware would have to be written to flash twice.

In fact, for lack of time and of any alternative, I have used the HTTP API of cppcms up to now, but wanted to switch to a proper web server because of the remarks in your configuration file overview. After having read your second email, I think I will stay with cppcms HTTP for now. HTTP/1.1 is not needed, because I won't use keep-alive anyway, while SSL support would be nice, even if it's not desperately needed now.

Thanks,
Julian
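[Editor's sketch] The web-root check described in the message above (realpath expansion, then comparison against a configured root) could look roughly like the following. This is not Julian's code and not the CppCMS file server; resolve() and is_inside_root() are hypothetical names, and a POSIX system is assumed. Note that realpath() itself resolves symlinks, so a symlink inside the root that points outside of it is rejected by this check.

```cpp
#include <stdlib.h>   // realpath(), free() (POSIX)
#include <string>

// Resolve `path` to an absolute path with ".", ".." and symlinks expanded.
// Returns an empty string if the path does not exist or cannot be resolved.
std::string resolve(const std::string &path)
{
    // A null buffer makes realpath() allocate the result itself
    // (POSIX.1-2008), so no fixed-size buffer is involved.
    char *resolved = ::realpath(path.c_str(), nullptr);
    if (!resolved)
        return std::string();
    std::string result(resolved);
    ::free(resolved);
    return result;
}

// True if `requested` resolves to the web root itself or to something below it.
bool is_inside_root(const std::string &web_root, const std::string &requested)
{
    std::string root = resolve(web_root);
    std::string target = resolve(requested);
    if (root.empty() || target.empty())
        return false;
    if (target == root)
        return true;
    // Require a '/' after the root, so that "/srv/www-private" is not
    // accepted for the root "/srv/www".
    if (root[root.size() - 1] != '/')
        root += '/';
    return target.compare(0, root.size(), root) == 0;
}
```

A request handler would call is_inside_root(configured_root, configured_root + url_path) before opening the file and answer 404 or 403 when it returns false.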
From: Artyom <art...@ya...> - 2010-11-24 18:53:40
|
> I wrote an own implementation of a static file server as cppcms
> application, which should be secure enough (does realpath expansion and
> checks it with a configured web-root to make sure that users don't
> access any files they shouldn't, only thing is it doesn't check for
> symlinks, but as I need them anyway, it would not make sense to check
> for them).

Actually CppCMS has a quite good file server that does everything you mentioned, but under Linux I use canonicalize_file_name, which does the same as realpath but allocates the memory itself, as there are some cases where realpath can be used to exploit buffer overflows (but that is generally very hard to do).

BTW, realpath and canonicalize_file_name do expand symlinks.

As I told you, I wouldn't expose the CppCMS HTTP server to the wild internet, but it is not so bad. :-)

> Are my calculations concerning the disk usage correct?

Looks right, but do not forget to use the cppcms::http::file::save_to function, as it moves the temporary file rather than copying it if it is located on the same device.

Regards,
Artyom
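[Editor's sketch] The "move rather than copy when on the same device" behaviour mentioned for save_to matters here because a move needs no second copy on the flash drive. The general pattern can be sketched as below. This is not the CppCMS implementation; move_or_copy() is a hypothetical helper, and it assumes a POSIX system where rename() fails with EXDEV across file systems.

```cpp
#include <cerrno>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper, not part of CppCMS. Returns true on success.
bool move_or_copy(const std::string &from, const std::string &to)
{
    // Within one file system, rename() only changes the directory entry:
    // no data is rewritten and no extra space is needed.
    if (std::rename(from.c_str(), to.c_str()) == 0)
        return true;
    if (errno != EXDEV)
        return false;               // a real error, not "different device"

    // Different device: fall back to copying. Source and destination
    // coexist until the source is removed.
    {
        std::ifstream in(from.c_str(), std::ios::binary);
        std::ofstream out(to.c_str(), std::ios::binary | std::ios::trunc);
        if (!in || !out)
            return false;
        char buf[4096];
        while (in.read(buf, sizeof buf) || in.gcount() > 0)
            out.write(buf, in.gcount());
        out.flush();
        if (in.bad() || !out)
            return false;
    }
    return std::remove(from.c_str()) == 0;
}
```

For the firmware server this means the temporary upload directory and the final destination should be on the same file system, otherwise the fallback path briefly needs space for two copies again.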
From: Julian P. <ju...@wh...> - 2010-11-23 22:24:16
|
Just to inform you: According to the guys from the cherokee mailing list, cherokee does NO caching but passes the received POST data immediately to the scgi backend.

Julian
From: Artyom <art...@ya...> - 2010-11-24 18:57:49
|
> Just to inform you: According to the guys from the cherokee mailing
> list, cherokee does NO caching but passes the received POST data
> immediately to the scgi backend.
>
> Julian

OK, this is very good. I'd just recommend that you run some basic tests with Cherokee, especially of HTTP redirections and HTTP status codes, because some web servers' SCGI implementations do not handle them right (Nginx and the IIS plugin).

I must admit I haven't tested CppCMS with Cherokee (who makes a web server GUI-configured??? It makes it very hard to test with the cppcms_run script), but I don't expect problems, as CppCMS works well with Apache, Lighttpd and Nginx.

Regards,
Artyom
From: Julian P. <ju...@wh...> - 2010-11-24 19:45:17
|
On 24.11.2010 19:57, Artyom wrote:
> Ok this is very good. I'd just recommend you to make some
> basic tests with cherokee - especially HTTP redirections
> and HTTP status.

I have had only minor problems so far (using SCGI), because Cherokee seems to set a different SCRIPT_NAME or PATH_INFO than lighttpd does (URL rewriting works in a quite different way). Is there a way to see in cppcms which SCRIPT_NAME/PATH_INFO is sent, so that I could adjust the paths properly?

I could solve the problem temporarily by mounting my application with the mount path ^(.*)$ and passing the first match to the application. To me it doesn't seem to make any difference, but mounting the application without any mount path results in 404 errors from cppcms. Apart from this one, no problems have occurred so far, but I haven't tested it extensively yet.

> I must admit I hand't tested CppCMS with Cherooke
> (who makes Web server GUI configured??? It makes it very hard to test
> it with cppcms_run script)

The configuration GUI is very good indeed. Cherokee has a very sophisticated way of chaining virtual servers and handlers, which IMHO can be configured much better through a web interface/GUI than through a configuration file. However, especially for script-based approaches, you may edit cherokee.conf directly. There are guides on their project site, http://www.cherokee-project.com, on how to do this.

Regards,
Julian
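[Editor's sketch] The workaround in the message above, matching the path against ^(.*)$ and handing the first capture group to the application, can be illustrated with std::regex. This only shows the regular-expression semantics (ECMAScript flavour); it is not CppCMS's own matcher, and select_url() is a hypothetical name.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: if `pattern` matches the whole of `path`, store
// capture group number `group` in `selected` and return true.
bool select_url(const std::string &pattern, int group,
                const std::string &path, std::string &selected)
{
    std::regex re(pattern);
    std::smatch m;
    if (!std::regex_match(path, m, re))
        return false;               // no match: the request is not dispatched
    if (group < 0 || static_cast<std::size_t>(group) >= m.size())
        return false;               // the pattern has no such group
    selected = m[group].str();      // an unmatched optional group yields ""
    return true;
}
```

With the pattern ^(.*)$ and group 1, every path matches and is passed on unchanged, which is why it behaves like "mount on everything". A pattern such as ^/app(/.*)?$ with group 1 would instead strip a fixed prefix before the application sees the URL.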
From: Artyom <art...@ya...> - 2010-11-24 20:14:19
|
> I had only little problems so far (using scgi), because cherokee seems
> to set another SCRIPT_NAME or PATH_INFO than lighttpd does (URL rewrite
> works in a quite different way). Is there a way to see in cppcms which
> SCRIPT_NAME/PATH_INFO is sent?

request().script_name() and request().path_info() are the answer, if I understand it right.

> So that I could adjust the paths properly? I could sove the problem
> temporarily by mounting my application with the mount path ^(.*)$ and
> passing the first match to the application. To me, it doesn't seem to
> make any difference, but mounting the application without any mount
> path results in 404 errors by cppcms.

You can provide a quite complex and flexible mount point for a CppCMS application. See the cppcms::mount_point class docs:

http://art-blog.no-ip.info/cppcms_ref_v0_99/classcppcms_1_1mount__point.html
http://art-blog.no-ip.info/cppcms_ref_v0_99/classcppcms_1_1applications__pool.html

If you have any doubts about what happens, create a simple class that "mounts" to everything (the default) and prints script_name and path_info in its main.

> you may edit the cherokee.conf directly. There are guides on their
> project site http://www.cherokee-project.com how to do this.

I know, I just found it too cryptic and not well documented in comparison to other configuration files designed for human beings.

Artyom
From: Julian P. <ju...@wh...> - 2010-11-24 20:37:17
|
On 24.11.2010 21:14, Artyom wrote:
> request().script_name() and request().path_info() is the answer
> if I understand it right.

The problem is that the request is not dispatched to my application, see below.

> You can provide quite complex and flexible mount point
> for CppCMS application. See cppcms::mount_point class docs.

I know that, and in other servers I use it quite extensively ;).

> If you have any dougbts about what happens create a simple
> class that "mounts" to everithing (default) and print in it's main
> script_name and path_info.

That was the problem: I mounted the application without a mount point and added a main implementation to the application, which never got called. It was called, however, as soon as I used ^(.*)$ as the mount point, which, at least as I see it, means exactly the same thing: mount on everything. So I'd need a way to print the URL that is used for dispatching if no application is mounted and cppcms therefore returns a 404.

> I know I just hand found it too cryptic and not well documented
> in comparison to other confugration files designed for human
> being.

I had just a short look at it after I configured a Cherokee server to perform further tests with my services, and while it is not that beautifully structured (no line breaks etc.), it's very logical indeed. If you're interested in adding Cherokee to the cppcms_run utility, I'd suggest you do a basic setup with the web GUI and then look at the configuration file to see what you'd have to change from your script. Because it's written entirely by a machine, its structure is quite logical, though.

Regards,
Julian