Re: [lc-devel] [RFC] LZO de/compression support - take 6
Status: Beta
Brought to you by:
nitin_sf
From: Nitin G. <nit...@gm...> - 2007-05-28 15:48:42
|
On 5/28/07, Adrian Bunk <bu...@st...> wrote: > On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote: > > Hi, > > > > Attached is tester code used for testing. > > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > > version for 'TinyLZO') > > > > Cheers, > > Nitin > > > > On 5/28/07, Nitin Gupta <nit...@gm...> wrote: > >> (Using tester program from Daniel) > >> > >> Following compares this kernel port ('take 6') vs original miniLZO code: > >> > >> 'TinyLZO' refers to this kernel port. > >> > >> 10000 run averages: > >> 'Tiny LZO': > >> Combined: 61.2223 usec > >> Compression: 41.8412 usec > >> Decompression: 19.3811 usec > >> 'miniLZO': > >> Combined: 66.0444 usec > >> Compression: 46.6323 usec > >> Decompression: 19.4121 usec > >> > >> Result: > >> Overall: TinyLZO is 7.3% faster > >> Compressor: TinyLZO is 10.2% faster > >> Decompressor: TinyLZO is 0.15% faster > > So your the compressor of your version runs 10.2% faster than the > original version. > > That's a huge difference. > > Why exactly is it that much faster? > > cu > Adrian I am not sure on how to account for this _big_ perf. gain but from benchmarks I see that whenever I remove unncessary casting from tight loops I get perf. gain of 1-2%. For e.g. open coding LZO_CHECK_MPOS_NON_DET macro with all unnecessary casting removed, gave perf. gain of ~2%. Similarly, I found many other places where such casting was unnecessary. These changes have been tested on x86, amd64, ppc. Testing of 'take 6' version is yet to be done - this will confirm that I didn't reintroduce some error. - Nitin |