[lc-devel] Initial Compressed Cache structure
From: Nitin G. <nit...@gm...> - 2006-03-01 16:09:32
Hi, I've been working on the Compressed Cache structure for the last few days. Following is a brain dump of all the details required for the ccache structure implementation. This doc is a *headache* to follow (poorly written, no diagrams included) -- but I'll sketch diagrams later if required for any part. I intend to start the implementation by next week, so please post any doubts, suggestions, ideas you have :) This is basically the same structure described in that 'Initial Design' doc, but now I have thought it through in more detail, suitable for implementation.

First, the goals of a 'good' ccache structure:

-- Locating a page in ccache should be fast.
-- Adding/taking a page to/from ccache should be fast (apart from compression/decompression time).
-- Minimize fragmentation.
-- Avoid double lookups, i.e. a single lookup of the page cache should tell whether a page is in the compressed cache or not, and also give the location of the page within ccache if it is compressed.
-- Minimize metadata overhead.
-- Page allocation requests should be 0-order (0-order allocations are faster than higher-order ones and almost never fail).
-- Shrinking/expanding ccache should be fast.

(Currently, the focus is on handling page cache (file-backed) pages only. Since I have doubts over swap cache handling, I'll take that up later. Compressed caches for the page cache and the swap cache will be maintained separately, but both will have the same ccache structure.)

The following describes a structure which, I think, is 'nearly good'. At the end there is a list of problems that I can currently see with it.

1. Overview

(Please see page cache management using radix trees, and 'tags' in particular -- UTLK 3rd ed. is great.) Also, I'll assume that you've read that 'Initial Design' doc so I don't have to retype it here :) The following gives the details I left out there.

The ccache will have a 'chunked' structure (see the 'Initial Design' doc for an overview) -- i.e. each compressed page can be stored in several different blocks of memory that may not be physically contiguous.

For every page that is to be compressed, its node in the page cache radix tree will be replaced by a 'struct chunk_head *' instead of a page descriptor pointer (struct page *). 'chunk_head' points to the list of chunks storing parts of this compressed page:

struct chunk_head {
        unsigned int nr_chunks;
        unsigned int private;           /* stores flags, usage count */
        struct chunk *chunk_list;
};

'struct chunk' needs to store the starting location of the chunk and its size:

struct chunk {
        void *start_addr;
        unsigned int size;              /* MSB is the 'used' flag */
        struct list_head next;
        struct list_head next_chunk;
};

All the chunks in ccache form a doubly linked list using 'next_chunk'. 'next' separately connects 'related' chunks, like the free chunk list or the chunks belonging to the same compressed page.

2. Cache Operations

* Extracting a page from ccache:
The system looks in the page cache (the radix tree of a particular inode) and finds that the PAGECACHE_TAG_COMPRESSED tag is set, so the node holds not a (struct page *) but a chunk_head pointer (struct chunk_head *). Now, chunk_head->chunk_list gives the address of the first 'struct chunk', which gives the start_addr and size of the compressed page's data within this chunk. The remaining chunks can be reached using chunk->next. (A rough code sketch of this path follows the insertion steps below.)

Next, the chunks associated with the page just extracted are freed. Check whether the neighbouring chunks are free (they can be reached using chunk->next_chunk->{prev, next}). If they are free too (the MSB in their chunk->size is cleared), they are merged into a single chunk. Finally, the extraneous 'struct chunk's are freed and the single chunk obtained after merging adjacent chunks is added to a 'free list' (all free chunks are linked together -- that's the free list).

* Inserting a page in ccache:
I. Compress the page and store it in chunks taken from the free list.
II. Update the radix tree entry corresponding to this page to now point to the 'chunk_head' used for this page.
III. Also update chunk->start_addr and chunk->size of the last free chunk used, according to the space left in it.
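To make the extraction path concrete, here is a minimal userspace sketch. This is not the actual kernel code: plain pointers stand in for struct list_head, malloc()/free() stand in for the kernel allocators, and all names below (gather_page(), free_chunk(), etc.) are made up for illustration.

#include <stdlib.h>
#include <string.h>

#define CHUNK_USED (1U << 31)          /* MSB of 'size' is the 'used' flag */

struct chunk {
        void *start_addr;
        unsigned int size;             /* low 31 bits: size in bytes */
        struct chunk *phys_prev;       /* address-ordered list of all */
        struct chunk *phys_next;       /*   chunks ('next_chunk' above) */
        struct chunk *next;            /* free list / same-page chain */
};

struct chunk_head {
        unsigned int nr_chunks;
        unsigned int private;
        struct chunk *chunk_list;
};

static struct chunk *free_list;        /* all free chunks ('next' chain) */

static unsigned int chunk_size(const struct chunk *c)
{
        return c->size & ~CHUNK_USED;
}

/* Copy the scattered pieces of one compressed page into 'buf', ready
 * to hand to the decompressor; returns the total compressed size. */
static size_t gather_page(const struct chunk_head *ch, char *buf)
{
        const struct chunk *c;
        size_t off = 0;

        for (c = ch->chunk_list; c; c = c->next) {
                memcpy(buf + off, c->start_addr, chunk_size(c));
                off += chunk_size(c);
        }
        return off;
}

/* Free one chunk, merging it with a free, physically-adjacent
 * predecessor (a full version would also absorb 'phys_next'). */
static void free_chunk(struct chunk *c)
{
        struct chunk *p = c->phys_prev;

        c->size &= ~CHUNK_USED;

        if (p && !(p->size & CHUNK_USED) &&
            (char *)p->start_addr + chunk_size(p) == c->start_addr) {
                p->size += chunk_size(c);      /* grow prev over c... */
                p->phys_next = c->phys_next;
                if (c->phys_next)
                        c->phys_next->phys_prev = p;
                free(c);                       /* ...and drop its metadata */
                return;                        /* p is already on the free list */
        }

        c->next = free_list;                   /* push onto the free list */
        free_list = c;
}

/* Release every chunk of a page once it has been decompressed. */
static void free_page_chunks(struct chunk_head *ch)
{
        struct chunk *c = ch->chunk_list, *next;

        while (c) {
                next = c->next;                /* save before c may vanish */
                free_chunk(c);
                c = next;
        }
        free(ch);
}

Note how the single 'next' field does double duty, chaining a page's chunks while they are used and linking the free list once they are freed -- just as the one 'next' list_head does in the design above.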
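And a matching sketch of the insertion path, reusing the structures and helpers from the sketch above. Again, names and details are illustrative only: error unwinding is omitted, and the caller is assumed to handle step II (the radix tree update).

/* Store 'comp_size' bytes of compressed data in chunks carved from
 * the head of the free list; returns the new chunk_head, or NULL
 * (on NULL the caller would expand ccache and retry). */
static struct chunk_head *ccache_insert(const char *data, size_t comp_size)
{
        struct chunk_head *ch = malloc(sizeof(*ch));
        struct chunk **tail;
        size_t done = 0;

        if (!ch)
                return NULL;
        ch->nr_chunks = 0;
        ch->private = 0;
        ch->chunk_list = NULL;
        tail = &ch->chunk_list;

        while (done < comp_size) {
                struct chunk *f = free_list;
                size_t take = comp_size - done;

                if (!f)
                        return NULL;           /* free list exhausted */

                if (take >= chunk_size(f)) {
                        /* consume the whole free chunk */
                        take = chunk_size(f);
                        free_list = f->next;
                } else {
                        /* split: carve 'take' bytes off the front and
                         * shrink the free chunk in place (step III) */
                        struct chunk *u = malloc(sizeof(*u));

                        if (!u)
                                return NULL;
                        u->start_addr = f->start_addr;
                        u->phys_prev = f->phys_prev;
                        u->phys_next = f;
                        if (f->phys_prev)
                                f->phys_prev->phys_next = u;
                        f->phys_prev = u;
                        f->start_addr = (char *)f->start_addr + take;
                        f->size -= (unsigned int)take;
                        f = u;
                }
                memcpy(f->start_addr, data + done, take);
                f->size = (unsigned int)take | CHUNK_USED;
                f->next = NULL;
                *tail = f;                     /* chain into this page's list */
                tail = &f->next;
                ch->nr_chunks++;
                done += take;
        }
        return ch;
}

One nice property visible here: insertion only touches the free list and the new page's own chunks -- no other live compressed page is moved.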
* Shrinking ccache:
A heuristic will be developed (not yet thought out) that tells whether the compressed cache is really worth the RAM it is occupying. Accordingly, ccache will be shrunk or expanded.
When shrinking, see which inode's mapping the freed pages mostly belong to (using simple hooks in shrink_list() -- vmscan.c). This shows that a particular inode's (file's) pages are not being used. So, search that radix tree for compressed pages (based on the PAGECACHE_TAG_COMPRESSED tag) and free all the associated chunks. Since a page can only be freed when all the chunks it contains are freed, relocate the remaining chunks in the page to free chunks in other pages, and finally free the page.

* Expanding ccache:
Simply allocate a new page and add it to the free list.

3. Notes

* This chunked structure hopefully reduces fragmentation to a large extent (I have no data to support this yet), since it allows a single compressed page to be stored scattered over many chunks.
* Inserting a page is fast, since storage blocks (chunks) are simply taken from the free list (allocating more pages if required). This does not involve touching any other live data within ccache.
* Expanding ccache is a piece of cake!
* Shrinking ccache is a bit of a mess.

4. Problems

* Metadata size: e.g., a compressed page taking 3 chunks requires sizeof(chunk_head) + 3*sizeof(chunk) of metadata -- with the structs above, that is 12 + 3*24 = 84 bytes on a 32-bit system!!
* When extracting a compressed page, data is read from several chunks which can lie on separate physical pages. This destruction of locality of reference can seriously affect performance.

It's a bad idea that cannot be changed.
    -- Unknown

I'll be glad to have any suggestions, ideas you have :)

Cheers,
Nitin