From: Manuel E. S. <ra...@bi...> - 2002-08-08 11:39:01
On Tue, Aug 06, 2002 at 05:27:57AM -0000, Craig Foster wrote:
> Okay, I've had a couple more evenings to sit down and play with
> apt-proxy-v2. I figured out a couple of things (such as the existence
> of /etc/init.d/apt-proxy-v2 (-; ), and I have also come across a few
> oddities. Some of these I think I mentioned in my last email to the list,
> but I'll elaborate more here:
>
> 1) apt-proxy-v2 does not seem to serve files from its own cache to clients.
> This happens for Packages.gz and Release files as well as packages.

Those files are mutable (a file with the same name may change over time,
whereas versioned files such as .deb's never change) and receive different
treatment.

> Lines 15-20:
> 06/08/2002 00:14 [AptProxy,0,127.0.0.1] [debug:9]CHECKING_CACHED
> 06/08/2002 00:14 [-] [verify:9]Process Status: -1
> 06/08/2002 00:14 [-] [verify:9]unknown file: not verified
> 06/08/2002 00:14 [-]
> 06/08/2002 00:14 [-] [verify:9]verication failed
> 06/08/2002 00:14 [-] [debug:9]NOT_CACHED

Integrity verification seems to have failed, which makes the cached file
irrelevant. But the file verifier considers the file unknown, so the
validation should not fail. There is something wrong with the FileVerifier
class.

> 2) apt-proxy-v2 gives 403 Forbidden responses for certain files. For
> instance, trying to do an "apt-get install gs" always yields:
>
> Err http://localhost woody/main gs 6.53-3
> 403 Forbidden
> Failed to fetch http://localhost:8000/debian/pool/main/g/gs/gs_6.53-3_i386.deb 403 Forbidden
>
> This behavior is the same for both sid and woody on multiple backend
> servers. The file in question does exist and is accessible via HTTP on the
> backend servers.
> The apt-proxy-v2 log shows the following:
>
> 06/08/2002 00:39 [AptProxy,2,127.0.0.1] [debug:9]Connection: keep-alive
> 06/08/2002 00:39 [AptProxy,2,127.0.0.1] [debug:9]User-Agent: Debian APT-HTTP/1.3
> 06/08/2002 00:39 [AptProxy,2,127.0.0.1] [debug:9]
> 06/08/2002 00:39 [AptProxy,2,127.0.0.1] [debug:9]/../ in simplified uri

It is refusing to serve the file for security reasons: after trying to
simplify all '..' occurrences, some were left, and that is a security
problem. simplify_path and a complicated URI are probably to blame.

> 4) twistd eats a lot of CPU cycles for a rather long time after downloads
> are complete and the client has disconnected. What is it doing?

It is generating Packages.gz from Packages or vice versa, depending on the
backend involved. It shouldn't happen when downloading .deb files.

> 5) I see mention of memory leaks in the TODO file, so I assume everyone
> knows about this, but: Memory usage by twistd gradually grows over time as
> apt-proxy-v2 serves more and more requests. My system finally OOM'ed and the
> vm killed twistd after a couple of "apt-get install -d kde"'s on my firewall
> machine (which does not have any of X installed).

I haven't been very careful with memory leaks, mainly because I don't
really know how to control them in Python. This confirms my fears :(

> 6) Does apt-proxy-v2 really NEED to be a daemon that runs constantly (as
> opposed to a daemon controlled by inet.d). I understand that there is
> maintenance to be done on the cache, but it would seem that all of that could
> be handled at the end of requests and/or by cron scripts. Likewise, I
> understand that there may a slight performance degradation if the process
> needs to be started by inet.d, but for small sites, having the large daemon
> in memory all the time is overkill. Perhaps there could be an option to
> either run apt-proxy-v2 as a daemon or from inet.d (as is done in the samba
> and apache packages).
> Running apt-proxy-v2 from inet.d would also help
> lessen the effects of the apparent memory leak in apt-proxy-v2 since a new
> daemon would be started with each request and ended afterwards. This is
> similar in principle to the way apache kills off child processes after they
> have served so many pages in order to keep memory leaks under control. What
> am I missing here that necessitates having twistd running all the time?

The lack of an inetd mode was, AFAIR, a limitation of twisted:
apt-proxy-v2 is based on it, and twisted didn't support inetd-based
daemons when development started.

apt-proxy-v2 is not so big when you start it; the big problem is the memory
leaks. The daemon does a very lightweight cache walk to make sure that
there are no files there which it is not keeping track of, and doing that
outside of the main daemon would raise locking issues. There are other
locking issues which are greatly simplified by running a permanent daemon.

That said, it would be nice to have a cut-down inetd mode, but it will
probably not happen soon.

And from your previous email:

apt-proxy-v2 should happily work with an apt-proxy-v1 cache directory, and
I believe that apt-proxy-v1 will do the same with an apt-proxy-v2 cache
directory, but Chris should confirm the second. What will fail in strange
ways is running both at the same time on the same cache directory, because
they don't do locking with one another and will certainly step on each
other's toes.

Hope that helps

	ranty

PS: I am not willing to work on apt-proxy-v2 right now because working
remotely was not getting me anywhere. I will be back with my development
machine in mid-August. You are very welcome to try to debug apt-proxy-v2;
just keep in mind that I will be more helpful when I come back with my
development machine and permanent internet access.
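
For anyone who wants to look into the 403 case: below is a minimal sketch of
the kind of check that the "/../ in simplified uri" message describes. It is
not apt-proxy-v2's actual simplify_path; the name simplify_uri_sketch and the
choice to decode percent-escapes before checking are assumptions made only for
illustration. The idea is to resolve every '..' against the path built so far
and refuse the request if one of them would climb above the root.

```python
from urllib.parse import unquote


def simplify_uri_sketch(uri):
    """Hypothetical helper, NOT apt-proxy-v2's simplify_path.

    Returns a normalised absolute path, or None when the request would
    climb above the root and so should be refused.
    """
    parts = []
    # Assumption: percent-escapes are decoded first, so "%2e%2e" counts as "..".
    for part in unquote(uri).split("/"):
        if part in ("", "."):
            continue  # empty segments and "." do not change the location
        if part == "..":
            if not parts:
                return None  # nothing left to pop: the path escapes the root
            parts.pop()
        else:
            parts.append(part)
    return "/" + "/".join(parts)
```

With a check of this shape, an ordinary pool path such as
/debian/pool/main/g/gs/gs_6.53-3_i386.deb comes back unchanged, so a 403 on
that request would point at the simplification step rather than at the URI.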
-- 
--- Manuel Estrada Sainz <ra...@de...>
                         <ra...@bi...>
                         <ra...@us...>
------------------------ <man...@hi...> -------------------
Let us have the serenity to accept the things we cannot change, courage to
change the things we can, and wisdom to know the difference.