From: Chris H. <ha...@de...> - 2005-02-23 00:14:04

On Tuesday 22 Feb 2005 17:26, Andrew Malcolmson wrote:
> The profile module was pulled after it was discovered earlier this month
> that it was released under a non-free license.

Ah, thanks, I didn't know that.

> My ipython broke too, which makes me wonder if a notification of this
> change was sent out to maintainers of Python packages so they could
> prepare patches. I really don't think it's acceptable that packages in
> Sarge should be suddenly broken like this.

I didn't get any notice about this either; I guess there are too many
maintainers to be able to tell them all. I've uploaded a new package version
which depends on either the older python version or the newer, fixed twisted
version.

Chris
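
[Editorial note: an either/or dependency like the one Chris describes is
expressed in debian/control with the | alternation syntax. The line below is
purely illustrative - the real package names and version bounds of his
upload are not shown in this thread:

    Depends: python2.3 (<< 2.3.5) | python2.3-twisted (>= 1.3.0)

The left alternative keeps the package installable on the older python that
still shipped the profile module; the right one accepts the fixed twisted.]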
From: David B. <da...@ba...> - 2005-02-22 22:55:56

Hi,

I'm trying to get apt-proxy running on my home network, but am having
problems. I'm using Debian/Sarge and apt-proxy 1.9.25. Thanks to Chris
Halls' README [0], I've got to the point where I want to import my existing
package cache into the a-p cache, but apt-proxy-import fails, as follows:

    blue:/var/cache# apt-proxy-import -i /var/cache/apt/archives/
    Updating twisted's process module. No updating required.
    Traceback (most recent call last):
      File "/usr/sbin/apt-proxy-import", line 69, in ?
        factoryConfig(factory)
      File "/usr/lib/python2.3/site-packages/apt_proxy/apt_proxy_conf.py", line 141, in factoryConfig
        factory.addBackend(backend)
    AttributeError: DummyFactory instance has no attribute 'addBackend'

What would be the best way of getting a-p running?

1) Some fix for a-p 1.9.x (I don't know any Python but should be able to
   have a stab at it)
2) Manually copy my existing cache (would this work?)
3) Revert to v1.3.0
4) Other?

Any suggestions would be appreciated.

Cheers, Dave.

[0] http://cvs.sourceforge.net/viewcvs.py/apt-proxy/apt-proxy/README?rev=1.8.2.3&only_with_tag=apt-proxy-v1&content-type=text/vnd.viewcvs-markup
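
[Editorial note: the traceback shows factoryConfig() calling
factory.addBackend() unconditionally on whatever factory object it is
handed, while apt-proxy-import evidently passes a DummyFactory that lacks
the method. A minimal defensive patch would look something like this - a
sketch built only from the names in the traceback, not the actual fix:

    # apt_proxy_conf.py, around the failing line -- hypothetical guard:
    if hasattr(factory, 'addBackend'):
        factory.addBackend(backend)
    # else: a DummyFactory (e.g. from apt-proxy-import) takes no backends
]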
From: Andrew M. <an...@im...> - 2005-02-22 17:27:21

On Mon, 21 Feb 2005 09:08:26 +1300, "Toby Allsopp" <to...@mi...> said:
> >>>>> "Micha" == Micha <ear...@gm...> writes:
>
> Micha> What's going on ?

The profile module was pulled after it was discovered earlier this month
that it was released under a non-free license.

My ipython broke too, which makes me wonder if a notification of this change
was sent out to maintainers of Python packages so they could prepare
patches. I really don't think it's acceptable that packages in Sarge should
be suddenly broken like this.

> Micha> After tonight's update of debian sarge, starting the ap2
> initscript:
>
> Micha> Starting apt-proxy
>
> Micha> Traceback (most recent call last):
> Micha>   File "/usr/bin/twistd2.3", line 34, in ?
> Micha>     from twisted.scripts.twistd import run
> Micha>   File "/usr/lib/python2.3/site-packages/twisted/scripts/twistd.py", line 26, in ?
> Micha>     from twisted.application import app, service
> Micha>   File "/usr/lib/python2.3/site-packages/twisted/application/app.py", line 21, in ?
> Micha>     import sys, os, pdb, profile, getpass, traceback, signal
>
> Micha> ImportError: No module named profile
>
> This is fixed in the version of twisted in unstable, so you'll need to
> upgrade python-twisted, python2.3-twisted and python2.3-twisted-bin
> (from memory, I'm not at my machine right now) to their most recent
> versions.
>
> Toby.

-------------------
Andrew Malcolmson
From: Micha <ear...@gm...> - 2005-02-20 22:42:58

Thanks, that did it.

° /\/
From: Toby A. <to...@mi...> - 2005-02-20 20:07:56

>>>>> "Micha" == Micha <ear...@gm...> writes:

Micha> What's going on ?

Micha> After tonight's update of debian sarge, starting the ap2 initscript:

Micha> Starting apt-proxy

Micha> Traceback (most recent call last):
Micha>   File "/usr/bin/twistd2.3", line 34, in ?
Micha>     from twisted.scripts.twistd import run
Micha>   File "/usr/lib/python2.3/site-packages/twisted/scripts/twistd.py", line 26, in ?
Micha>     from twisted.application import app, service
Micha>   File "/usr/lib/python2.3/site-packages/twisted/application/app.py", line 21, in ?
Micha>     import sys, os, pdb, profile, getpass, traceback, signal

Micha> ImportError: No module named profile

This is fixed in the version of twisted in unstable, so you'll need to
upgrade python-twisted, python2.3-twisted and python2.3-twisted-bin (from
memory, I'm not at my machine right now) to their most recent versions.

Toby.
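
[Editorial note: the underlying failure is the unconditional `import
profile` at the top of twisted's app.py. One plausible shape for the fix in
unstable is a guarded import - a sketch of the approach, not the actual
patch:

    try:
        import profile        # non-free module, pulled from Debian
    except ImportError:
        profile = None        # run without profiling support
]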
From: Micha <ear...@gm...> - 2005-02-20 19:31:57

What's going on? After tonight's update of debian sarge, starting the ap2
initscript:

    Starting apt-proxy
    Traceback (most recent call last):
      File "/usr/bin/twistd2.3", line 34, in ?
        from twisted.scripts.twistd import run
      File "/usr/lib/python2.3/site-packages/twisted/scripts/twistd.py", line 26, in ?
        from twisted.application import app, service
      File "/usr/lib/python2.3/site-packages/twisted/application/app.py", line 21, in ?
        import sys, os, pdb, profile, getpass, traceback, signal
    ImportError: No module named profile

There are no missing (broken) dependencies, according to apt. Because I
don't have the time to track this further, I tried to install some more
stuff to see if it would solve anything:

    python2.3-4suite (0.99cvs20041008-3) ...
    python2.3-dev (2.3.5-1) ...

However, that didn't work. I didn't really expect it to, though. What else
can I do?

ps. Some other packages that are installed:

    python2.3-libxml2 (2.6.16-2) ...
    libxml2-python2.3 (2.6.16-2) ...
    python2.3-libxslt1 (1.1.12-5) ...
    libxslt1-python2.3 (1.1.12-5) ...
    python2.3-adns (1.0.0-6) ...
    python2.3-dhm (0.5-2.1) ...
    python2.3-medusa (0.5.4-2) ...
    python2.3-nevow (0.3.0-1) ...
    python2.3-crack (0.5-1.1) ...

° /\/
From: Chris H. <ha...@de...> - 2005-02-03 12:31:18

On Thursday 03 Feb 2005 02:51, Micha wrote:
> Chris Halls <ha...@de...>:
> > The mtime is not really all that useful. There are lots of packages that
> > may be quite old, e.g. all of Woody, where it is more important to know
> > whether clients are actively downloading the package rather than the
> > time since it was created.
>
> I must admit I don't really understand that feature. If a package wasn't
> requested for a month, that doesn't mean it couldn't happen just today.
> And with a huge amount of disk space, my decision would be to have Woody,
> or have not. I wouldn't delete old files automatically. And then, usually
> the reason for Woody is stability and security, so I still would like to
> have the last security fix. There are quite a lot of packages that ap had
> to refresh, anyway. But I assume that you simply know of users who want it
> that way, and for good reasons.

Yes, there are users who don't have enough disk space to keep the whole of
Woody around, and need the cleaning algorithms to remove packages they
haven't downloaded for a while.

> > My guess would be a problem with the database permissions bugs that took
> > several attempts to fix. The uploads were only to unstable, and life in
> > unstable is not guaranteed to be perfect I'm afraid.
>
> Well, in two weeks I'll see if it happens with testing too.

It's not in testing because it isn't stable enough yet.

> > I don't want to have to involve the sysadmin. They could be not
> > available
>
> Yes, that's a good point generally. But... think of the jobs :)

Heh :) Aren't they all busy recycling their ap databases every restart?
*duck*

Chris
From: Micha <ear...@gm...> - 2005-02-03 03:34:23

Chris Halls <ha...@de...>:
> Anyway, the ideal solution is probably to implement both methods :) Any
> volunteers?

We'd need a kind of interface to attach to the daemon and request a list of
packages. Like 'apt-proxy -u foo bar baz qux quux corge grault garply waldo
fred plugh xyzzy thud'.

I can't do that, sorry; I was too busy with odd things in Brussels to learn
programming. :)

° /\/
From: Micha <ear...@gm...> - 2005-02-03 03:34:19

Chris Halls <ha...@de...>:
> The mtime is not really all that useful. There are lots of packages that
> may be quite old, e.g. all of Woody, where it is more important to know
> whether clients are actively downloading the package rather than the time
> since it was created.

I must admit I don't really understand that feature. If a package wasn't
requested for a month, that doesn't mean it couldn't happen just today. And
with a huge amount of disk space, my decision would be to have Woody, or
have not. I wouldn't delete old files automatically. And then, usually the
reason for Woody is stability and security, so I still would like to have
the last security fix. There are quite a lot of packages that ap had to
refresh, anyway. But I assume that you simply know of users who want it that
way, and for good reasons.

> Or perhaps you were thinking of setting the mtime to the time
> when ap wrote it, instead of the timestamp of the file itself?

Yes, that's what I meant with 'touching' it.

> My guess would be a problem with the database permissions bugs that took
> several attempts to fix. The uploads were only to unstable, and life in
> unstable is not guaranteed to be perfect I'm afraid.

Well, in two weeks I'll see if it happens with testing too.

> when large numbers of files are copied into the cache. However, since the
> recycling is just taking a note of the time the file entered the cache, it
> doesn't matter if it takes a while before apt-proxy notices. The database
> entry is not needed when serving the file, only when doing cache cleaning.

You mean it's supposed to run in the background? Then why did Jonathan get a
'service unavailable' (or whatever; sorry, the original posting's already
gone)?

btw. your first reply wasn't quoted right; I hope it's clear that the
problem occurred on Jonathan's machine, not mine.

> (There are wishlist bugs open asking for download resume, which makes a
> lot of sense for large packages)

Oh well, I second that! :) I often download kernel trees.

> > Are you proposing to compare a checksum every time a file is served?
> > I think that's a lot of work for something that just isn't going to
> > happen.

Well, I can't imagine it's that much work. Other package software does it,
too. I think it could also be an additional security measure. But as I said,
I'm no programmer.

> I don't want to have to involve the sysadmin. They could be not available

Yes, that's a good point generally. But... think of the jobs :)
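
[Editorial note: the 'touching' Micha describes is a one-liner in Python -
update the cached file's timestamps at serve time so the filesystem itself
records last use. A hypothetical helper, not apt-proxy code:

    import os, time

    def mark_served(path):
        # Set both atime and mtime to "now" when the file is sent to a
        # client, so cache cleaning can key off plain file timestamps.
        now = time.time()
        os.utime(path, (now, now))
]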
From: Micha <ear...@gm...> - 2005-02-03 03:34:17

Jonathan Koren <jk...@cs...>:
> Sysadmins have their reasons for doing things like this. It doesn't
> matter if it actually makes sense to someone else or not.

I think you're completely right. But I have nothing else to do, and so :)
I'd like to add that to me it makes sense to put an ap cache on its own
partition anyway, with an adequate filesystem and its features set
accordingly. And then it wouldn't be that hard to mount it with atime. But
making atime mandatory and suggesting a separate partition would mean no
automatic install, and no matter how cool it is, it means 10 minus points in
usability. So I accept that it's just asking too much.

> By changing the mtime whenever a file is downloaded would cause that file
> to be backed up even though it hasn't been changed.

Yes. I didn't think of that. But then, who backs up an apt cache? It's
perfectly redundant, and given that you always want the last security fix,
rather useless. The only thing I do is occasionally burn the complete cache
in case I need to install somewhere with small bandwidth.
From: Chris H. <ha...@de...> - 2005-02-02 22:14:52

On Wednesday 02 Feb 2005 21:46, Luis Matos wrote:
> Chris Halls wrote:
> > within a certain time (e.g. 2 months, settable by a parameter) could be
>
> I think you should do that through the conf file: defining the time, and
> whether it has a mirror or not.

Sorry, that's what I meant by 'settable by a parameter'.

> > It's not quite as flexible as your method, but would perhaps be easier
> > from the user point of view, because there would be only 1 parameter to
> > tweak instead of having to maintain a list of packages.
>
> For this case there is no need to have the packages installed. Just make a
> text list with dpkg -l > list.txt and then apt-get install `cat list.txt`
> -d. I use that to have semi-automated desktop installations, and it's
> quite popular with the newbies.

I was thinking that needs more effort than setting a config file parameter.
What happens when package names change, e.g. a library package number
increases, or a new kernel package is introduced?

Anyway, the ideal solution is probably to implement both methods :) Any
volunteers?

Chris
From: Chris H. <ha...@de...> - 2005-02-02 22:06:11

On Monday 31 Jan 2005 22:10, Jonathan Koren wrote:
> Then don't use atime at all. Use a served-time that only apt-proxy knows
> about. When it sends the file it updates it. When it goes to clean up
> the cache, it checks for it, and fills it with the mtime if it doesn't
> exist. If it can't get an mtime, it uses the current time.

The mtime is not really all that useful. There are lots of packages that may
be quite old, e.g. all of Woody, where it is more important to know whether
clients are actively downloading the package rather than the time since it
was created. Or perhaps you were thinking of setting the mtime to the time
when ap wrote it, instead of the timestamp of the file itself?

> For whatever reason, recycling worked on one system, and not on the other.
> If it worked on both systems, I wouldn't have even noticed it. But I did
> notice it, and what I saw, I didn't like.

My guess would be a problem with the database permissions bugs that took
several attempts to fix. The uploads were only to unstable, and life in
unstable is not guaranteed to be perfect, I'm afraid.

> The way I understand it, the database needs to be updated so that the
> cleanup will work. This means you have three processes running: A server
> that fetches files, stores files in the cache, and sends files to clients
> while updating the file's last access time in a database. A cleaner that
> periodically removes old files from the cache by comparing the current
> time to each file's last access time that is stored in a database.
> Finally, a recycler that periodically looks through the cache and makes
> sure all files in the cache are in the database.
>
> There are two processes that search through the cache and compare each
> file to the database. That's redundant. Furthermore, the recycler runs
> constantly, and if all goes well finds nothing the vast majority of the
> time. Whereas the cleaner runs only periodically (or at least it
> should). Finally, the recycler runs excruciatingly slowly. It shouldn't
> take 16 hours to update that small a database, or any database for that
> matter.

There aren't multiple processes - it is effectively co-operative
multitasking using the twisted framework. I've not looked at that code
closely enough to know all the details; you need to actually look at the
code to find out what is going on. For example, the recycler reads just a
little part of the cache and then sleeps for a second before moving on. I
think that was supposed to help spread the load (since it is not easy to
'nice' IO in the same way as you can CPU time), but as you have noted it
doesn't really notice very fast when large numbers of files are copied into
the cache. However, since the recycling is just taking a note of the time
the file entered the cache, it doesn't matter if it takes a while before
apt-proxy notices. The database entry is not needed when serving the file,
only when doing cache cleaning.

> Now what I REALLY don't like is that an arcane command is needed to update
> a database that already has an automatic process to keep the database
> updated. That is beyond dumb. It implies that the automatic process
> doesn't work, or at least not well enough to be trusted; and if that's the
> case, what's the point of having the automatic process in the first place?
> Not only does apt-proxy-import redo the recycler's job, but it also tries
> to do the server's job by checking if the file to be copied is actually
> new or not. And when it tries to do that job, it fails miserably because
> for some reason it can't find backends that the server finds just fine.

apt-proxy-import was written to do a different job, and we're ending up
trying to fit a square peg into a round hole. a-p-i is designed to
automatically find the right places for files in the mirror without assuming
that the files to import are in the right hierarchy already. Say you have
been running apt updates from a machine that does not use ap. You will end
up with a directory full of .debs in /var/cache/apt/archives, and a-p-i was
written to be able to take these files and import them into the right
backend and subdirectory in the cache. This means a-p-i either needs to find
the package listed in one of the Packages files, or it needs to guess based
on pattern matching. It sounds like some of this went wrong in your case. I
was not aware of it not working, and indeed it works well enough for me
here. Perhaps you could provide some examples of the problem you are
experiencing and detailed logs of a-p-i's behaviour, thanks.

> If apt-proxy-import was simply cp, then the recycler would eventually find
> the new file and update the database. When the file is requested, the
> server would check if it's new or not, and do the right thing accordingly.
> apt-proxy-import is solving a problem that doesn't exist.

Well, a problem you do not have in this case, because you have the correct
directory layout already.

> So yeah, I don't like apt-proxy-import at all.

Fine, just don't use it and use cp.

> Shouldn't the server process handle that automatically? Doesn't the
> server process already handle that automatically? If it didn't download a
> complete file (which it can detect by comparing the number of bytes
> received with the number of bytes expected), then delete the partial file
> and try again.

(There are wishlist bugs open asking for download resume, which makes a lot
of sense for large packages)

> (This can trivially be extended to check that the server
> didn't just receive any old bytes, but the correct bytes.) After n tries,
> fail so the client can request the next file. Next time the client
> requests the file, it won't be in the database, so the process starts all
> over again. You should never cache a broken file.

It would be helpful, though, to cache a partially downloaded file. But it's
orthogonal to the database - the file can be downloaded to <filename>.partial
and then renamed if the d/l is successful.

> If you're talking about files getting damaged that are already on the
> disk, I don't think that's going to happen very often, and if so, the disk
> damage is probably far more extensive than just a couple of files limited
> to the apt-proxy cache. If the damage is extensive, that's a job for a
> disk recovery tool.

I do agree that it's not likely to happen very often. The problem that needs
to be solved is what happens if the file does happen to be damaged and
clients request the file. With normal http sources, apt re-requests the file
and you get a new copy. But with ap, it just serves the file again without
realising that there is a problem, and the clients never get a new copy. So
it is important for ap to have a way to know that it needs to re-get a file
from the backend, otherwise you get complaints from users of the cache who
cannot correct the problem.

> Are you proposing to compare a checksum every time a file is served?
> I think that's a lot of work for something that just isn't going to
> happen. If on the off chance a file in the cache did manage to become
> corrupted, the sysadmin would check the logs, see the error report (e.g.
> "unexpected end of file") and simply delete the broken file from the
> cache, which would cause apt-proxy to get a new copy of the file. If that
> file is also corrupted, then something much more serious is wrong, and
> apt-proxy couldn't possibly fix it. (i.e. the remote source file is
> corrupted, or the disk is dying)

I don't want to have to involve the sysadmin. They could be unavailable, and
ap has enough information to work out the problem for itself. Oh, and that
assumes there is a competent sysadmin available who realises that this is a
problem with the local ap and doesn't assume it is a problem with the
backend server itself.

Chris
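
[Editorial note: the <filename>.partial scheme from the mail above is easy
to sketch - download under a temporary name and only publish into the cache
once the transfer is verifiably complete. Illustrative code, not the
apt-proxy implementation; the size check stands in for whatever completeness
test the server really uses:

    import os, urllib

    def fetch(url, dest, expected_size):
        tmp = dest + '.partial'
        urllib.urlretrieve(url, tmp)       # Python 2.3-era fetch
        if os.path.getsize(tmp) == expected_size:
            os.rename(tmp, dest)           # publish only a complete file
        else:
            os.remove(tmp)                 # never cache a broken file
            raise IOError('incomplete download: %s' % url)
]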
From: Luis M. <ga...@ot...> - 2005-02-02 21:46:46

Chris Halls wrote:
> I was thinking of a similar feature, that is not quite as flexible but
> would do the same thing but without needing to maintain a config file.
>
> The idea is to use the Packages files and file databases to decide when to
> download new versions. Any packages that had been downloaded by a client
> within a certain time (e.g. 2 months, settable by a parameter) could be
> automatically updated if a new version appeared in the Packages file for
> the backend. If no client requested the package anymore after a certain
> period, ap would stop updating that package automatically.

I think you should do that through the conf file: defining the time, and
whether it has a mirror or not.

> It's not quite as flexible as your method, but would perhaps be easier
> from the user point of view, because there would be only 1 parameter to
> tweak instead of having to maintain a list of packages.

For this case there is no need to have the packages installed. Just make a
text list with dpkg -l > list.txt and then apt-get install `cat list.txt`
-d. I use that to have semi-automated desktop installations, and it's quite
popular with the newbies.

> But whichever version you prefer, if you have time to implement it please
> go ahead and I'd gladly add it.
>
> Chris
From: Chris H. <ha...@de...> - 2005-02-02 21:41:01

On Tuesday 01 Feb 2005 02:20, Micha wrote:
> I have another issue that might be my very own wish, but maybe others find
> it useful too. Perhaps you can give short feedback if this is the case.
>
> It's the fact that my base station serves a dozen roaming customer laptops
> which I update when the owners are here for service or maintenance or
> whatever. I am aware that is not the standard situation ap2 is designed
> for, but it's still a situation where ap2 proves most useful. It's
> important to note that my bandwidth has limits, and there are good and bad
> times to pull down a lot of stuff.
>
> I try to keep my ap cache completely up to date, but to my knowledge this
> means I have to keep the packages installed on the main box, where most of
> them naturally don't belong. There also are some packages I just like to
> have quickly available in case I decide (or need) to install them, like
> kernel sources, but again I don't need them installed yet.
>
> There's the 'apt' option to only download packages without actually
> installing them, and it's also a feature of aptitude (I don't know about
> the other GUIs). But it's still a chore to always maintain the 'order'
> list and do the aptitude cache cleaning.
>
> I imagine there could be an optional configuration file with a list of
> packages that ap2 should always update at a specified frequency. When
> there is a user request, these 'additional stock' downloads should get a
> lower priority than the user-requested download. For more diverse needs,
> this list could of course be autogenerated by a custom script, inserting
> effectively all installed packages that are not on 'hold', or whatever
> selection you'd prefer. This 'additional stock' could be downloaded e.g.
> by night (with a cronjob) or whenever the best time would be, in terms of
> bandwidth costs.
>
> This would be a way to manage the ap cache content independent of user
> requests, to the extent of a real mirror.

I was thinking of a similar feature that is not quite as flexible, but would
do the same thing without needing to maintain a config file.

The idea is to use the Packages files and file databases to decide when to
download new versions. Any packages that had been downloaded by a client
within a certain time (e.g. 2 months, settable by a parameter) could be
automatically updated if a new version appeared in the Packages file for the
backend. If no client requested the package anymore after a certain period,
ap would stop updating that package automatically.

It's not quite as flexible as your method, but would perhaps be easier from
the user point of view, because there would be only 1 parameter to tweak
instead of having to maintain a list of packages.

But whichever version you prefer, if you have time to implement it please go
ahead and I'd gladly add it.

Chris
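
[Editorial note: the rule Chris sketches reduces to a small predicate
evaluated per cached package whenever a Packages file changes. Hypothetical
names; the idle threshold is the one configurable parameter he mentions:

    import time

    MAX_IDLE = 60 * 24 * 3600  # e.g. two months, settable by a parameter

    def should_auto_update(last_client_request, newer_version_available):
        # Refresh only packages that clients still actually download.
        recently_used = (time.time() - last_client_request) < MAX_IDLE
        return recently_used and newer_version_available
]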
From: Jonathan K. <jk...@cs...> - 2005-02-01 20:53:02

On Tue, 1 Feb 2005, Micha wrote:
> Why noatime mounts ? I guess (but I don't really know) it's an attempt to
> speed up access. But do you really think on a modern HD this could be a
> bottleneck, even when serving 50 machines ? So why not make atime
> mandatory, and suggest the standard atime mount at installation time;
> there could additionally be a check if no atimes can be found, and an
> appropriate error message.

Sysadmins have their reasons for doing things like this. It doesn't matter
if it actually makes sense to someone else or not. People really get mad
when some package says, "Oh, by the way. You need to change how you're doing
your job." I think it's better ap just supports the users' whims without
question.

[stuff about ditching atime in favor of mtime with touching to indicate last
access]

It's an interesting idea. I do like how it gets away from replicating fs
stuff. Apt and the like rely on version numbers rather than mtimes, so
that's not a problem. The main problem I see would be with backup scripts.
They rely on mtime almost exclusively. Changing the mtime whenever a file is
downloaded would cause that file to be backed up even though it hasn't been
changed. For a large and well-utilized cache, that could result in lots of
gigs needlessly being backed up daily. That's not a good situation either.

Basically, I think the db is an okay solution to a problem where there
aren't any really good solutions.

--
Jonathan Koren            World domination?  I'll leave that to the
jk...@cs...               religious nuts and Republicans, thank you.
http://www.cs.siu.edu/~jkoren/      -- The Monarch, "Venture Brothers"
From: Micha <ear...@gm...> - 2005-02-01 02:55:01

Hi,

I have another issue that might be my very own wish, but maybe others find
it useful too. Perhaps you can give short feedback if this is the case.

It's the fact that my base station serves a dozen roaming customer laptops
which I update when the owners are here for service or maintenance or
whatever. I am aware that is not the standard situation ap2 is designed for,
but it's still a situation where ap2 proves most useful. It's important to
note that my bandwidth has limits, and there are good and bad times to pull
down a lot of stuff.

I try to keep my ap cache completely up to date, but to my knowledge this
means I have to keep the packages installed on the main box, where most of
them naturally don't belong. There also are some packages I just like to
have quickly available in case I decide (or need) to install them, like
kernel sources, but again I don't need them installed yet.

There's the 'apt' option to only download packages without actually
installing them, and it's also a feature of aptitude (I don't know about the
other GUIs). But it's still a chore to always maintain the 'order' list and
do the aptitude cache cleaning.

I imagine there could be an optional configuration file with a list of
packages that ap2 should always update at a specified frequency. When there
is a user request, these 'additional stock' downloads should get a lower
priority than the user-requested download. For more diverse needs, this list
could of course be autogenerated by a custom script, inserting effectively
all installed packages that are not on 'hold', or whatever selection you'd
prefer. This 'additional stock' could be downloaded e.g. by night (with a
cronjob) or whenever the best time would be, in terms of bandwidth costs.

This would be a way to manage the ap cache content independent of user
requests, to the extent of a real mirror.

° /\/
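
[Editorial note: until ap2 grows such a feature, the effect can be
approximated from outside - a nightly cron job that requests each listed
package path through the proxy warms the cache on the proxy's own schedule.
A sketch: the list file and the example path are invented, and 9999 is
assumed to be the proxy's listening port:

    # prefetch.py -- run from cron at night.  /etc/apt-proxy/prefetch.list
    # holds one backend-relative path per line, e.g.
    #   /debian/pool/main/a/apt/apt_0.5.28_i386.deb
    import urllib

    for line in open('/etc/apt-proxy/prefetch.list'):
        path = line.strip()
        if path:
            # Fetching through the proxy stores the package in its cache.
            urllib.urlretrieve('http://localhost:9999' + path)
]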
From: Micha <ear...@gm...> - 2005-02-01 02:54:58

I'm sorry my question led to some judging expressions here. OTOH I find the
thread really constructive, and I'm glad to see no one's flaming. I'm no
programmer, but as my boxes' design somehow depends on apt-proxy, I'd like
to make some notes, too.

> On Mon, 31 Jan 2005, Chris Halls wrote:
> > The ideas are good but they do ignore the two use cases I mentioned that
> > were the primary reasons for the existence of the database (noatime
> > mounts and unwanted updating of atime by other programs).

Why noatime mounts? I guess (but I don't really know) it's an attempt to
speed up access. But do you really think that on a modern HD this could be a
bottleneck, even when serving 50 machines? So why not make atime mandatory,
and suggest the standard atime mount at installation time; there could
additionally be a check if no atimes can be found, and an appropriate error
message. I believe access time is something a filesystem is designed for, so
why not leave the job to where it belongs? It should be faster than all the
database operations, anyway.

Then there still would be the problem with 'foreign atime updating'. I
imagine something like backup or indexing, for example the locate db. As far
as I can see, they only modify atime, never mtime. So perhaps the ideal
solution would be to ditch atime checking.

My first idea is: use mtime where you used atime, and introduce a slight
file modification for the cases where ap needs mtime. ext3 and reiser
support user-specific attributes, or access control lists, for example. But
the problem of handling many different FS types would clearly be too much,
and still there's ext2. So you can only rely on a low-level common unix
feature. I suggest working with ctime, the inode modification time.

I've found some notes in the manpage of 'ls', but I'm no specialist here. It
said permissions, or renaming, modify ctime but not the others. So perhaps
you can (touch and) use that to determine when a file was downloaded, and
touch mtime when a file is requested?

Again, I'm no programmer, and know little of filesystems. It only appears to
me that managing file time stamps at all would be redundant, as the
filesystem itself should be the best 'database'. And _they_ do a lot of
effort to check and ensure integrity. So why not leave the job to them?
Perhaps it means you've got more time for long-term things...

btw. I didn't mean to abandon the whole database concept. But perhaps you
can reserve the db for information that never can be managed by the FS, and
needs to be updated or requested rather extensively. I don't know your
roadmap, of course.

Jonathan Koren <jk...@cs...>:
> Are you proposing to compare a checksum every time a file is served ?

Maybe they're thinking of security checks against corrupted or hacked files?

greetings, mi.

° /\/
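
[Editorial note: for reference, the three per-file timestamps being debated
are all visible through os.stat. Illustrative snippet; the path is made up:

    import os

    st = os.stat('/var/cache/apt-proxy/debian/some.deb')
    st.st_atime  # last access: frozen on noatime mounts, bumped by readers
    st.st_mtime  # last content change: what backup tools key on
    st.st_ctime  # last inode change: permissions, link count, rename, ...
]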
From: Jonathan K. <jk...@cs...> - 2005-01-31 22:14:04

On Mon, 31 Jan 2005, Chris Halls wrote:
> The ideas are good but they do ignore the two use cases I mentioned that
> were the primary reasons for the existence of the database (noatime mounts
> and unwanted updating of atime by other programs).

Then don't use atime at all. Use a served-time that only apt-proxy knows
about. When it sends the file it updates it. When it goes to clean up the
cache, it checks for it, and fills it with the mtime if it doesn't exist. If
it can't get an mtime, it uses the current time.

> implementation than the concept itself. If the recycling actually worked
> properly then I'm not convinced there would still be a need to change the
> concept. Or do you think that even with the recycling mechanism working,

For whatever reason, recycling worked on one system, and not on the other.
If it worked on both systems, I wouldn't have even noticed it. But I did
notice it, and what I saw, I didn't like.

The way I understand it, the database needs to be updated so that the
cleanup will work. This means you have three processes running: A server
that fetches files, stores files in the cache, and sends files to clients
while updating the file's last access time in a database. A cleaner that
periodically removes old files from the cache by comparing the current time
to each file's last access time that is stored in a database. Finally, a
recycler that periodically looks through the cache and makes sure all files
in the cache are in the database.

There are two processes that search through the cache and compare each file
to the database. That's redundant. Furthermore, the recycler runs
constantly, and if all goes well finds nothing the vast majority of the
time. Whereas the cleaner runs only periodically (or at least it should).
Finally, the recycler runs excruciatingly slowly. It shouldn't take 16 hours
to update that small a database, or any database for that matter.

Now what I REALLY don't like is that an arcane command is needed to update a
database that already has an automatic process to keep the database updated.
That is beyond dumb. It implies that the automatic process doesn't work, or
at least not well enough to be trusted; and if that's the case, what's the
point of having the automatic process in the first place? Not only does
apt-proxy-import redo the recycler's job, but it also tries to do the
server's job by checking if the file to be copied is actually new or not.
And when it tries to do that job, it fails miserably, because for some
reason it can't find backends that the server finds just fine.

If apt-proxy-import were simply cp, then the recycler would eventually find
the new file and update the database. When the file is requested, the server
would check if it's new or not, and do the right thing accordingly.
apt-proxy-import is solving a problem that doesn't exist.

So yeah, I don't like apt-proxy-import at all.

> that the database concept is wrong?

I think the database is a necessary hack to get around not having atimes.
Being a hack, if there is ever a chance to kill it, then it should be killed
at the earliest opportune moment.

> (The idea was to add more information to those databases in the future,
> such as a better way ensuring that files in the cache are intact)

Shouldn't the server process handle that automatically? Doesn't the server
process already handle that automatically? If it didn't download a complete
file (which it can detect by comparing the number of bytes received with the
number of bytes expected), then delete the partial file and try again. (This
can trivially be extended to check that the server didn't just receive any
old bytes, but the correct bytes.) After n tries, fail so the client can
request the next file. Next time the client requests the file, it won't be
in the database, so the process starts all over again. You should never
cache a broken file.

If you're talking about files getting damaged that are already on the disk,
I don't think that's going to happen very often, and if so, the disk damage
is probably far more extensive than just a couple of files limited to the
apt-proxy cache. If the damage is extensive, that's a job for a disk
recovery tool.

Are you proposing to compare a checksum every time a file is served? I think
that's a lot of work for something that just isn't going to happen. If on
the off chance a file in the cache did manage to become corrupted, the
sysadmin would check the logs, see the error report (e.g. "unexpected end of
file"), and simply delete the broken file from the cache, which would cause
apt-proxy to get a new copy of the file. If that file is also corrupted,
then something much more serious is wrong, and apt-proxy couldn't possibly
fix it (i.e. the remote source file is corrupted, or the disk is dying).

So unless you can be more specific than "more information", I'm opposed to
using the database any more than as a crutch to get around noatime.

--
Jonathan Koren            World domination?  I'll leave that to the
jk...@cs...               religious nuts and Republicans, thank you.
http://www.cs.siu.edu/~jkoren/      -- The Monarch, "Venture Brothers"
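
[Editorial note: for scale, the per-serve checksum being debated is only a
few lines with the Python 2.3-era md5 module; whether its I/O cost on every
request is worth it is the real question. A sketch - looking up the expected
sum from the Packages entry is omitted:

    import md5

    def file_md5(path):
        h = md5.new()
        f = open(path, 'rb')
        for chunk in iter(lambda: f.read(65536), ''):
            h.update(chunk)
        f.close()
        return h.hexdigest()

    # serve-time check: if file_md5(cached) != expected_sum, refetch it.
]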
From: Chris H. <ha...@de...> - 2005-01-31 20:33:25

On Monday 31 Jan 2005 19:55, Jonathan Koren wrote:
> I don't think I was rude. If you took offense, I'm sorry.

Thanks

> I'm not saying
> "You suck!". I am saying "It appears the design you have come up with is
> unnecessarily complicated in this respect.

It isn't the design of those who are working on the project. We just
inherited it.

> Furthermore, here is an
> alternate design that fulfils what little understanding of the
> requirements I have, and eliminates the complications I am critical of,
> which results in what appears to be a more robust design."

The ideas are good but they do ignore the two use cases I mentioned that
were the primary reasons for the existence of the database (noatime mounts
and unwanted updating of atime by other programs).

> I didn't direct
> criticism at anyone. There's a big difference between calling an idea
> dumb, and calling someone dumb. They are two very separate things.

It was the tone of your reply to my mail that I didn't appreciate. The
implementation of the idea is not mine, and IMO it is more the problem of
the implementation than the concept itself. If the recycling actually worked
properly then I'm not convinced there would still be a need to change the
concept. Or do you think that even with the recycling mechanism working, the
database concept is wrong?

(The idea was to add more information to those databases in the future, such
as a better way of ensuring that files in the cache are intact)

> That said, just because someone spends their copious amounts of free time
> writing software, doesn't make them infallible. It just means they have a
> hobby.

And means ranty's implementation can't be fixed as quickly as we would like.

Chris
From: Jonathan K. <jk...@cs...> - 2005-01-31 19:59:01

On Mon, 31 Jan 2005, Chris Halls wrote:
> I think you have some good ideas here but it is unnecessary to be rude,
> especially while Otavio & I are trying to fix the problems in v2 in our
> spare time.

I don't think I was rude. If you took offense, I'm sorry. I'm not saying
"You suck!". I am saying "It appears the design you have come up with is
unnecessarily complicated in this respect. Furthermore, here is an alternate
design that fulfils what little understanding of the requirements I have,
and eliminates the complications I am critical of, which also results in
what appears to be a more robust design." I didn't direct criticism at
anyone. There's a big difference between calling an idea dumb, and calling
someone dumb. They are two very separate things.

That said, just because someone spends their copious amounts of free time
writing software, doesn't make them infallible. It just means they have a
hobby.

--
Jonathan Koren            World domination?  I'll leave that to the
jk...@cs...               religious nuts and Republicans, thank you.
http://www.cs.siu.edu/~jkoren/      -- The Monarch, "Venture Brothers"
From: Chris H. <ha...@de...> - 2005-01-31 18:11:02

On Monday 31 Jan 2005 15:58, Jonathan Koren wrote:
> Personally, I think an external utility that pretty much just fills in a
> database is kind of dumb.

Please be respectful and courteous on this list. I was only trying to help
you by suggesting the best way to do it that I knew of. I did not design
this part, and the author died in a car crash last year.

> I also don't understand why recycling is
> needed. [...]

I think you have some good ideas here but it is unnecessary to be rude,
especially while Otavio & I are trying to fix the problems in v2 in our
spare time.

Thank you
Chris
From: Jonathan K. <jk...@cs...> - 2005-01-31 16:01:54

On Mon, 31 Jan 2005, Micha wrote:
> But assume I'd like to copy a 7 GB cache to another freshly installed
> machine with the same OS (say debian sarge), and it would be much easier
> to cpio or tar the cache dir over ethernet. Would it work ? What do you
> think I'd have to consider ?

It will cause recycling as the database is created/updated.

Personally, I think an external utility that pretty much just fills in a
database is kind of dumb. I also don't understand why recycling is needed.
Yeah, I know what it does, I just don't understand why it needs to be a
separate process. It seems like all you have to do is have the code be
something like:

    handle_request(file):
        if file doesn't exist:
            fetch_file_from_backend(file)
        set_atime(file, `date`)
        send_file(file)

    cleanup_cache():
        foreach file in cache:
            atime = get_atime(file)
            if atime == None:
                atime = get_mtime(file)   # mtime exists everywhere
            if (`date` - atime) > cleanuptime:
                unlink(file)
                delete_atime(file)

This way the database is updated on demand. If the file has never been sent,
then it gets deleted when its mtime (which is identical to its ctime)
expires. Best of all, it eliminates the need for arcane commands like
apt-proxy-import, because the atime database just works.

Of course, I'm still bitter that upon upgrading I went from a system that
worked just fine, to a system that doesn't work nearly as well. Also, I
don't like apt-proxy-import right now, because it doesn't work for me:
instead of just copying files and updating a database, it's looking for
backends it can't find for some reason. Right now I think it's trying to be
too smart and do too many things. Don't bother about whether the file is
current or not; just stick it in the cache, and let apt-proxy handle it.
That's apt-proxy's job.

--
Jonathan Koren            World domination?  I'll leave that to the
jk...@cs...               religious nuts and Republicans, thank you.
http://www.cs.siu.edu/~jkoren/      -- The Monarch, "Venture Brothers"
From: Micha <ear...@gm...> - 2005-01-31 09:11:46

> You should be using the utility written for this purpose,
> apt-proxy-import.

But assume I'd like to copy a 7 GB cache to another freshly installed
machine with the same OS (say debian sarge), and it would be much easier to
cpio or tar the cache dir over ethernet. Would it work? What do you think
I'd have to consider?
From: Micha <ear...@gm...> - 2005-01-28 23:58:10

> In the week since my original email, I managed to do something to the
> server that fixed this problem. I don't remember exactly what I did. It
> was something minor like eliminating a stale lock file or something.

This detail could be very important, though...

> 2005/01/26 23:57 CST [-] [debug] Verifying database: /var/cache/apt-proxy/.apt-proxy/db/access.db
> 2005/01/26 23:57 CST [-] [db] /var/cache/apt-proxy/.apt-proxy/db/access.db could not be opened, moved to /var/cache/apt-proxy/.apt-proxy/db/access.db.error

Perhaps there's something with mount and filesystem features?
From: Jonathan K. <jk...@cs...> - 2005-01-27 08:19:55

On Tue, 25 Jan 2005, Chris Halls wrote:
> You should be using the utility written for this purpose,
> apt-proxy-import. If you can arrange to mount the server cache via some
> means e.g. nfs, you can also save network traffic by getting
> apt-proxy-import to read the server tree itself. That way, it will only
> copy files that it does not already have in its own cache.

The destination was completely out of date and would have needed to transfer
the entire cache anyway, so network utilization couldn't have been lessened
in this case.

I needed to resync the caches again; however, this time I tried to use
apt-proxy-import. It didn't work. apt-proxy-import refused to do anything,
saying the same thing for every file in the nfs-mounted cache:

    apt-proxy-import -r -v -i /mnt/
    2005/01/27 00:27 CST [-] [import] considering: /mnt/debian/pool/main/p/perl-tk/perl-tk_800.025-2_i386.deb
    2005/01/27 00:27 CST [-] [import] Not found, trying to guess
    2005/01/27 00:27 CST [-] [import] perl-tk_800.025-2_i386.deb skipped - no suitable backend found

I do know that the cache needs to be updated, and I did run `apt-get update`
on both machines immediately prior to running apt-proxy-import.

>>> The problem I'm having is that when apt-proxy starts on the server it
>>> spends hours and hours recycling, and whenever apt-proxy restarts, the
>>> entire process starts all over again.
>
> This will be caused by the problems with your database files.
>
>>> Furthermore, I believe that this
>>> recycling is the cause of the "503 Service Unavailable" I receive when I
>>> try and access the cache through apt.
>
> That's odd, I don't know what that problem is

In the week since my original email, I managed to do something to the server
that fixed this problem. I don't remember exactly what I did. It was
something minor like eliminating a stale lock file or something.

> I'm not sure. There were some problems with this bit of code and it has
> been recently changed. Which version of apt-proxy are you running?

From my original email:

>>> I'm running 1.9.24 on debian.

> That indicates the database library could not open the database, very
> strange. Is it possible you copied that database from the server, and the
> server has different apt-proxy/db library versions?

The two caches do have different versions of libdb, but I'm not convinced
that version incompatibilities account for the access error every time ap
starts. According to the log, it appears ap fails to open the database, and
as a result moves the broken database to .error and then creates a new
database from scratch. The problem I have appears to be that every newly
created database is also broken in some way.

I just cleared out .apt-proxy/db and restarted it. After 13 hours of
recycling, the recycling stopped. I restarted apt-proxy, and recycling
started immediately:

    2005/01/26 23:57 CST [-] [debug] Verifying database: /var/cache/apt-proxy/.apt-proxy/db/access.db
    2005/01/26 23:57 CST [-] [db] /var/cache/apt-proxy/.apt-proxy/db/access.db could not be opened, moved to /var/cache/apt-proxy/.apt-proxy/db/access.db.error
    2005/01/26 23:57 CST [-] [db] Recreating /var/cache/apt-proxy/.apt-proxy/db/access.db
    2005/01/26 23:57 CST [-] [debug] Opening database /var/cache/apt-proxy/.apt-proxy/db/access.db
    2005/01/26 23:57 CST [-] [recycle] Adopting new file: /debian/.apt-proxy
    2005/01/26 23:58 CST [-] [recycle] Adopting new file: /debian/pool/non-US/non-free/r/rsaref2/rsaref2_19940415-3_i386.deb

> Can you run 'file *' in that directory please?

    bubbles:/var/cache/apt-proxy/.apt-proxy/db# file *
    access.db:       Berkeley DB (Hash, version 7, native byte-order)
    access.db.error: Berkeley DB (Hash, version 7, native byte-order)
    packages.db:     Berkeley DB (Hash, version 7, native byte-order)
    update.db:       Berkeley DB (Hash, version 7, native byte-order)

--
Jonathan Koren            World domination?  I'll leave that to the
jk...@cs...               religious nuts and Republicans, thank you.
http://www.cs.siu.edu/~jkoren/      -- The Monarch, "Venture Brothers"
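
[Editorial note: the 'Verifying database' step in the log amounts to trying
to open the Berkeley DB hash file and moving it aside on failure. A rough
equivalent with the stock bsddb module - a sketch, not apt-proxy's actual
wrapper code:

    import os, bsddb

    def verify_db(path):
        try:
            db = bsddb.hashopen(path, 'w')   # same Hash format 'file *' reports
            db.close()
        except bsddb.error:
            os.rename(path, path + '.error')   # move aside, as in the log
            bsddb.hashopen(path, 'n').close()  # recreate empty
]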