Re: [lc-devel] WKdm Problem
From: Nitin G. <nit...@gm...> - 2006-04-17 17:12:36
Hello Scott Kaplan,

Scott Kaplan wrote:
> Nitin,
>
> I'm the `K' in WK, so I might be some kind of help. (Then again, I
> might not!)
>
> WKdm was written only to be research-grade: It worked well enough for
> experimental uses. So, I'm not surprised that you're having a bit more
> difficulty with it. Can you give us any more information about the
> errors? Do the mangled pages have `minor' errors (just a few bytes that
> are incorrect), or are there large chunks of the page that are
> incorrect? Is it possible for the compression/decompression to be
> interrupted? It was not written to support that.

I just missed the fact that WKdm is currently hard-coded to work only
for an input of 1024 words -- it's working perfectly fine for 4K input.
Currently I'm just testing it in user space and have not yet started
porting it to kernel space. One particular thing I'm trying to improve
is to minimize the extra memory it requires during compression.
Presently, it stores tags and other bits one per byte (in the case of a
partial or exact match) and only later packs them together. I think this
packed output can be generated directly, without the intermediate step
-- that should greatly reduce its memory footprint (a rough sketch of
what I mean is in the P.S. below).

> Nitin Gupta wrote:
>> WKdm is fastest with nice compression ratios so it'll be
>> nice to have it working instead of WK4x4
>
> I will note, in case you care, that WK4x4 could be much faster if it
> were re-written from the WKdm code-base. WKdm had some hand-optimizing
> (at the C level) that WK4x4 did not enjoy. It's possible that WK4x4
> could give its better compression ratios with a (de)compression time
> that is close to WKdm's.

Does WK4x4 also give better compression (than WKdm) for textual data
(since I'm currently working on compressing only page-cache data)?
Also, I'll be trying some optimizations for both WKdm and WK4x4 in
terms of speed, compression ratio and memory overhead when adapting
them to kernel space. I think better compression from a slightly slower
algorithm is preferable to lower compression from a somewhat faster
one.

> Rodrigo de Castro wrote:
>> Remember that WKdm is very nice with swap cache data, but not
>> with general page cache data.
>
> Right. The WK algorithms were written with VM-based (primarily heap)
> pages in mind. Something like LZO is likely to do better with file
> system pages.
>
> Scott

Can you please suggest any other alternatives to LZO for compressing
file-system pages? Basically, the problem with LZO is that it's not
well documented, and its implementation is cluttered with workarounds
to stay portable across specific archs, OSes and compiler versions,
and is highly optimized, which makes it difficult to understand. So,
although it can be made to work in kernel mode (Rodrigo's cc-patch
already does this very well!), it's difficult to adapt it to
compressed-cache requirements and to understand its memory
requirements.

In particular, I'm looking at SquashFS (a compressed file system),
which uses zlib for compression and is very fast and stable (it's
widely used for LiveCDs). It would also give the benefit of having
de/compression code already in kernel space that is well suited to
file-system data, and zlib is well documented. But since SquashFS is
not worried about compression times -- its compression is done
offline -- that part may still have to be worked out (see the second
sketch in the P.S.).

Best Regards,
Nitin
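
P.S. A rough sketch of the direct tag-packing idea mentioned above.
The names (tag_packer, emit_tag) and the exact tag layout are just my
illustration, not the actual WKdm code:

    #include <stdint.h>

    /* Accumulate 2-bit tags straight into packed 32-bit words,
     * instead of staging one tag per byte and packing afterwards. */
    struct tag_packer {
            uint32_t *out;    /* next word of packed tag output */
            uint32_t accum;   /* tags accumulated so far        */
            int count;        /* number of 2-bit tags in accum  */
    };

    static void emit_tag(struct tag_packer *p, unsigned tag)
    {
            p->accum |= (uint32_t)(tag & 0x3) << (2 * p->count);
            if (++p->count == 16) {   /* 16 tags fill one word */
                    *p->out++ = p->accum;
                    p->accum = 0;
                    p->count = 0;
            }
    }

With 1024 input words this emits exactly 64 packed tag words, with no
need for the 1024-byte temporary tag array at all.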
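
P.P.S. For reference, this is roughly how a single 4K page could be
compressed with zlib in user space. Z_BEST_SPEED is only my guess at a
level suitable for online compression (SquashFS can afford the slower
high levels because it compresses offline), and the in-kernel zlib
interface (zlib_deflate and friends) looks different:

    #include <zlib.h>

    #define PAGE_SIZE 4096

    /* Returns Z_OK on success. The caller must size 'out' for
     * compressBound(PAGE_SIZE) bytes; *out_len is set to that
     * bound before the call and to the compressed size after. */
    int compress_page(const Bytef *page, Bytef *out, uLongf *out_len)
    {
            *out_len = compressBound(PAGE_SIZE);
            return compress2(out, out_len, page, PAGE_SIZE,
                             Z_BEST_SPEED);
    }

A compressed cache would presumably just keep the page uncompressed
whenever the result comes back no smaller than PAGE_SIZE.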