From: Roger B. <ro...@ro...> - 2002-12-19 09:35:11
Thanks to some insight from Terrel, I have been looking at speeding up the RPM indexing. Two things had a big effect. The first is telling the rpm command/library not to verify the various signatures/digests on each rpm. The second is using the rpm-python library.

I ran some tests on a 1.4GHz Athlon with 1GB RAM, indexing all three CDs of Red Hat 8.0:

With the current UML Builder code, it takes 6 minutes. [This code basically calls the rpm command line binary repeatedly and parses its output.]

Supplying options to skip the digest and signature checks cut the time in half, to 3 minutes.

Using the rpm-python library instead cut the time down to 6 seconds!

This version does, however, use considerably more memory. For example, at one point it holds the entire file listing for all RPMs in memory. This shouldn't be that big a deal, since the rpm command line binary appears to do the same as far as I can tell, and your machine by definition needs enough extra RAM to run UML in the first place.

I also have an rpm-python based version that does not consume that extra memory. It takes one minute.

I'll be doing another 1.40 release of UML Builder soon with this stuff in it. It is currently in CVS if anyone wants to take a look.

Roger
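For anyone curious what the first change (skipping the signature/digest checks) looks like when driving the rpm binary from a script, here is a rough sketch. It is not the UML Builder code: the function names are made up for illustration, and it assumes an rpm binary on the PATH that accepts the --nosignature and --nodigest query options.

```python
import subprocess


def rpm_query_command(rpm_path, verify=False):
    """Build an rpm command line that lists the files in a package file.

    Hypothetical helper for illustration only. When verify is False,
    the signature and digest checks are skipped via --nosignature and
    --nodigest.
    """
    cmd = ["rpm", "-qpl"]
    if not verify:
        cmd += ["--nosignature", "--nodigest"]
    cmd.append(rpm_path)
    return cmd


def list_files(rpm_path, verify=False):
    """Run rpm once for a single package file and return its file list."""
    result = subprocess.run(
        rpm_query_command(rpm_path, verify),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if rpm exits non-zero
    )
    return result.stdout.splitlines()
```

Calling list_files() in a loop over every package still starts one rpm process per file, so it only illustrates the 6 minute to 3 minute step, not the further gain from rpm-python.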