From: Gordan B. <go...@bo...> - 2009-02-13 16:12:00
I'm pondering what could be done to speed up mkinitrd, so I thought I'd share some thoughts. Apart from removing unnecessary files from the initrd (omitting .pyo/.pyc files (is this filter included in the current preview?) and omitting unused kernel modules (diet patch)), there are two things that I can see making a big difference to the initrd build speed.

1) Extracting file lists from the RPM DB

This involves invoking rpm -q for each package, which is slow: there is a lot of process startup latency and churn and not much CPU used, so most of the time is spent starting up and tearing down processes. If these queries could somehow be combined, it might just yield a significant speed-up. I'll test this theory over the weekend and report back. The one problem with this approach is that there would be no sensible way to apply per-package filtering.

2) Compression speed

2.1) Using a parallel gzip compressor (pigz, http://www.zlib.net/pigz/), which should scale pretty much linearly with the number of CPU cores. A (source) RPM seems to be available, but only for SuSE (http://rpm.pbone.net/index.php3/stat/4/idpl/11044884/com/pigz-2.1.4-5.1.x86_64.rpm.html), so until it is more common, it may have to be made available via the comoonics yum repository.

2.2) Using a decent compiler (Intel's ICC) to squeeze more performance out of the compressor. Intel do provide an optimized gzip library sample (http://www.intel.com/cd/software/products/asmo-na/eng/219967.htm), which according to the docs also seems to be multi-threaded (I will have to double-check that). My previous tests on a Pentium III indicated that an ICC-built gzip is about 20% faster than the GCC-built one. Since IPP includes a highly optimized gzip module, it should do even better, and still stack with the multi-processor scaling.
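
To illustrate why the compression step in 2.1) should scale with the number of cores, here is a rough Python sketch. It is not how pigz is implemented. It simply cuts the input into chunks, compresses each chunk as an independent gzip member in a worker thread, and concatenates the results, which the gzip format permits. The names parallel_gzip and CHUNK_SIZE and the default values are made up for illustration. As far as I know CPython's zlib releases the GIL while compressing, so threads should be enough here. I have not checked whether the kernel's initrd decompressor accepts multi-member streams, so treat this purely as a demonstration of the idea.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an arbitrary illustrative value


def parallel_gzip(data: bytes, workers: int = 4, level: int = 6) -> bytes:
    """Compress data as a sequence of independent gzip members.

    Hypothetical helper, not part of mkinitrd or pigz.
    """
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    if not chunks:
        chunks = [b""]  # still emit one valid (empty) gzip member
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the members line up with the chunks.
        members = pool.map(
            lambda chunk: gzip.compress(chunk, compresslevel=level), chunks
        )
        return b"".join(members)
```

Compressing chunks independently costs a little compression ratio, because each chunk starts with an empty dictionary, but every chunk can then be handled by a different core.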
The only problem I can see with 2.2) is that ICC is only free for non-commercial use (OSS is mentioned as an example, and IIRC MySQL used to distribute an ICC-built version of their community DB), so that part is something for you guys at Atix to figure out. :)

I'll look into this and post some performance results over the weekend, but in the meantime, has anyone got any thoughts on this? Any reason why this might be deemed a bad idea?

Gordan
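
For point 1), the idea of combining the per-package queries could be sketched roughly as below, in Python rather than the shell that mkinitrd uses. It assumes that rpm accepts several package names in one invocation and that the array query format with the "=" prefix repeats the package name on every file line. If that holds, the package name stays attached to each path, which might leave room for per-package filtering after all. The function names are hypothetical and none of this is taken from the existing mkinitrd code.

```python
import subprocess
from collections import defaultdict

# Assumption: rpm's array-iteration query format, where the "=" prefix
# repeats the package name on every file line. rpm itself expands \t and \n.
QUERY_FORMAT = r"[%{=NAME}\t%{FILENAMES}\n]"


def build_query(packages):
    """Build one rpm command line covering every package."""
    return ["rpm", "-q", "--queryformat", QUERY_FORMAT, *packages]


def parse_file_lists(output):
    """Group 'name<TAB>path' lines into {package name: [paths]}."""
    files = defaultdict(list)
    for line in output.splitlines():
        name, sep, path = line.partition("\t")
        if sep and path:  # skip messages such as "package X is not installed"
            files[name].append(path)
    return dict(files)


def query_file_lists(packages):
    """Run a single rpm process instead of one process per package."""
    result = subprocess.run(
        build_query(packages), capture_output=True, text=True, check=False
    )
    return parse_file_lists(result.stdout)
```

With the result keyed by package name, a filter such as dropping .pyc/.pyo paths could be applied to the list of one package only, while still paying the process startup cost just once.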