Hi Rodrigo,
I had a careful look at the VMM code (2.6.14.3) over the last few days and
it seems to be somewhat complex. Properly understanding it will take more
time than I expected. I think a static compressed cache handling only the
swap cache and using the same 2-page cell structure will be a nice "Ver
0.1". After this initial implementation is ready, we can try out all the
new ideas we have. Basically, I need some experience with the VMM code to
really come up with something we can call "Compressed Cache for 2.6
kernels". I have many new ideas for ccache but cannot determine whether
they are crap or practical to implement, and whether they can really help
overall system performance.
After this initial implementation, I think we can gradually revamp it to
include adaptivity, handling of page-cache pages, and other things of
concern.
I have also been working on another ccache structure for some time now.
These are the main problems I see with the 2-page cell structure:
-- Need to search for a cell with enough total free space to hold the
compressed page.
-- Need to do compaction when a page is taken out.
-- Under heavy memory load it can be difficult to allocate two contiguous
pages.
-- Lock contention problems (I haven't studied this well).
The idea is to let a compressed page be written in chunks into the free
areas as they occur in ccache.
The ideas and the algorithm are taken partly from:
1) The way Linux handles in-flight block I/O - a single buffer is scattered
across many pages in variable-sized chunks. This can help in a (similar)
implementation of the ccache structure.
2) Log-structured filesystems - these aim to minimize disk seeks during
writes and to minimize fragmentation. We are not concerned about disk
seeks, but the ideas can help avoid traversing free lists, and the
fragmentation-avoidance methods are particularly useful.
3) The buddy allocator - merge adjacent free blocks together and maintain
separate lists according to the 'order' of pages.
This will eliminate the need to search for free space when a compressed
page is to be written to ccache, require no compaction, and allocate a
single page at a time.
The following describes the idea:
--> During page-in (taking a page out of ccache):
-- Locate all the chunks associated with the required page and decompress
them.
-- Merge free chunks that are found adjacent. Maintain a list of free
chunks as <struct page, offset, len> (possibly maintain different lists
according to chunk size, e.g. 'good' >= 1/2*PAGE_SIZE, 'bad' >=
1/4*PAGE_SIZE, and 'ugly' for chunks smaller than 'bad').
-- If a "good" chunk was just freed, set it as the 'start' position for the
next write (writing a compressed page into ccache). This avoids searching
the free list from the beginning every time, which would otherwise make the
fragmentation problem very severe.
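To make the free-chunk bookkeeping concrete, here is a minimal user-space
sketch of the descriptor, the good/bad/ugly size classes, and the merging
of adjacent free areas. All names are illustrative, not kernel API; 'page'
is a plain pointer standing in for a real struct page pointer, and freeing
the absorbed descriptor is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Hypothetical chunk descriptor: <struct page, offset, len>. */
struct chunk {
	void *page;		/* stand-in for struct page * */
	size_t offset;		/* byte offset of free area within page */
	size_t len;		/* length of free area in bytes */
	struct chunk *next;	/* free list, sorted by (page, offset) */
};

/* Size classes from the description above:
 * good >= 1/2 page, bad >= 1/4 page, ugly for anything smaller. */
enum chunk_class { CHUNK_GOOD, CHUNK_BAD, CHUNK_UGLY };

static enum chunk_class classify(size_t len)
{
	if (len >= PAGE_SIZE / 2)
		return CHUNK_GOOD;
	if (len >= PAGE_SIZE / 4)
		return CHUNK_BAD;
	return CHUNK_UGLY;
}

/* Merge a free chunk with its successor when the two areas are
 * adjacent within the same page (descriptor freeing omitted here). */
static void merge_adjacent(struct chunk *c)
{
	struct chunk *n = c->next;

	if (n && n->page == c->page && c->offset + c->len == n->offset) {
		c->len += n->len;	/* absorb the neighbour... */
		c->next = n->next;	/* ...and unlink it */
	}
}
```

A real version would also re-classify the merged chunk (a merge can turn
two 'ugly' chunks into a 'bad' or 'good' one) and move it between lists.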
--> During page-out (storing a page into ccache):
-- Compress the page and start writing it out beginning at the 'start'
position (this is set during page-in and is at the start of ccache
initially).
-- Beginning with the chunk at 'start', take more free chunks (without
considering their size) from the free list to write out the compressed
page.
-- To prevent "metadata madness", limit the number of chunks per compressed
page (say, MAX_CHUNKS=4).
-- If the number of chunks required > MAX_CHUNKS, simply write out the last
chunk at the 'tail' of ccache.
-- Fix up the free list to reflect the chunks used and the space left over
after the last chunk.
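The page-out steps above could be sketched like this (again user-space,
with hypothetical names). For simplicity this sketch always sends the
remainder to the tail once MAX_CHUNKS - 1 list chunks are used, tracks only
chunk lengths, and omits descriptor freeing; 'plan_writeout' is my name,
not anything from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHUNKS 4

/* Simplified free-chunk descriptor (lengths only). */
struct free_chunk {
	size_t len;
	struct free_chunk *next;
};

/* Plan where a compressed page of 'clen' bytes goes: take free chunks
 * from the list without considering their size, capping the count at
 * MAX_CHUNKS; any remainder is written as one final chunk at the
 * 'tail' of ccache. Fills used_len[] with the bytes placed in each
 * chunk and returns the chunk count. Fully consumed chunks are
 * unlinked; a partially used chunk has its leftover space fixed up. */
static int plan_writeout(struct free_chunk **list, size_t clen,
			 size_t used_len[MAX_CHUNKS])
{
	int n = 0;

	while (clen > 0 && *list && n < MAX_CHUNKS - 1) {
		struct free_chunk *c = *list;
		size_t take = clen < c->len ? clen : c->len;

		used_len[n++] = take;
		clen -= take;
		if (take == c->len)
			*list = c->next;	/* fully consumed: unlink */
		else
			c->len -= take;		/* fix up leftover space */
	}
	if (clen > 0)
		used_len[n++] = clen;		/* last chunk at the tail */
	return n;
}
```

For example, with free chunks of 100, 200 and 5000 bytes, a 1000-byte
compressed page lands in three chunks (100 + 200 + 700) and the 5000-byte
chunk is fixed up to 4300 bytes of leftover free space.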
This can (I'm not sure) also reduce lock contention, as locks can now be
more fine-grained.
Though it causes space overhead due to the need to maintain info about all
the chunks (up to MAX_CHUNKS) associated with a compressed page, it should
significantly improve speed (for the reasons mentioned above).
Also, I have some more interesting papers that can help with ccache, but as
I mentioned in the beginning, my first objective is to get a static ccache
working, handling only swap-cache pages and using the 2-page cell
structure, so that I first gain some experience with the VMM code.
Happy New Year!!
Best Regards,
Nitin