linuxcompressed-devel Mailing List for Linux Compressed Cache (Page 4)
Messages per month:

Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
---|---|---|---|---|---|---|---|---|---|---|---|---
2001 | | | | | | | | | | (6) | (1) | (11)
2002 | (22) | (11) | (31) | (19) | (17) | (9) | (13) | (1) | (10) | (4) | (10) | (4)
2003 | | (8) | | (5) | (39) | (10) | (2) | (1) | (1) | (27) | (1) | (2)
2004 | | (3) | (1) | | | (3) | (1) | | | | (3) |
2005 | (1) | | | | | | | | | | (9) | (2)
2006 | (7) | (4) | (12) | (16) | (11) | (48) | (19) | (16) | (13) | | (8) | (1)
2007 | (4) | | | (3) | (26) | | (1) | | | | |
2008 | | (7) | (5) | | | | | | | | |
From: Nitin G. <nit...@gm...> - 2006-08-09 06:36:46

Christopher Blizzard wrote:
> Rik van Riel wrote:
>> Marcelo isn't currently online due to an accident, but I talked to
>> him on the phone today. He said he definitely wants to use compressed
>> swap on OLPC, so I think it would be good if you could work on
>> stabilizing the current code for use on the OLPC board.
>>
>> Open source being what it is, I'm sure somebody will add the page
>> cache stuff and the adaptive ccache bits. Especially if they have
>> a stable base to work from :)
>
> We've talked on and off about allowing some limited swapping to the
> flash under emergency circumstances. But no one has come along with a
> concrete way to do that yet.

I think it's already done. When the compressed cache reaches its size limit
(set by the user), pages are sent to the swap disks. But if you meant handling
wear-leveling issues (and maybe others) for swapping over flash, then that is
not yet done :) The usual swapping code is still used, which aims to minimize
disk seeks.

-- Nitin

From: Christopher B. <bli...@re...> - 2006-08-09 03:28:50

Rik van Riel wrote:
> Christopher Blizzard wrote:
>> Rik van Riel wrote:
>>> Marcelo isn't currently online due to an accident, but I talked to
>>> him on the phone today. He said he definitely wants to use compressed
>>> swap on OLPC, so I think it would be good if you could work on
>>> stabilizing the current code for use on the OLPC board.
>>
>> We've talked on and off about allowing some limited swapping to the
>> flash under emergency circumstances. But no one has come along with a
>> concrete way to do that yet.
>
> Makes sense, but hopefully with Nitin's code we can just compress
> less-used memory and avoid swapping :)

Oh, sure, sure. I was just saying "we're keeping our options open." :)

--Chris

From: Rik v. R. <ri...@re...> - 2006-08-09 03:07:29

Christopher Blizzard wrote:
> Rik van Riel wrote:
>> Marcelo isn't currently online due to an accident, but I talked to
>> him on the phone today. He said he definitely wants to use compressed
>> swap on OLPC, so I think it would be good if you could work on
>> stabilizing the current code for use on the OLPC board.
>
> We've talked on and off about allowing some limited swapping to the
> flash under emergency circumstances. But no one has come along with a
> concrete way to do that yet.

Makes sense, but hopefully with Nitin's code we can just compress
less-used memory and avoid swapping :)

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

From: Christopher B. <bli...@re...> - 2006-08-09 02:30:37

Rik van Riel wrote:
> Marcelo isn't currently online due to an accident, but I talked to
> him on the phone today. He said he definitely wants to use compressed
> swap on OLPC, so I think it would be good if you could work on
> stabilizing the current code for use on the OLPC board.
>
> Open source being what it is, I'm sure somebody will add the page
> cache stuff and the adaptive ccache bits. Especially if they have
> a stable base to work from :)

We've talked on and off about allowing some limited swapping to the
flash under emergency circumstances. But no one has come along with a
concrete way to do that yet.

--Chris

From: Rik v. R. <ri...@re...> - 2006-08-08 15:41:40

Nitin Gupta wrote:
> So, here's what I aim to do by the end of SoC:
> (Riel, can you please suggest here?)
> * Either
>   Start work on adding support for page cache pages and a heuristic for
>   adaptive ccache resize.
> * Or
>   Work to stabilize this much for anonymous pages, code cleanups, testing
>   on the OLPC board, (maybe) document all this stuff.

Marcelo isn't currently online due to an accident, but I talked to
him on the phone today. He said he definitely wants to use compressed
swap on OLPC, so I think it would be good if you could work on
stabilizing the current code for use on the OLPC board.

Open source being what it is, I'm sure somebody will add the page
cache stuff and the adaptive ccache bits. Especially if they have
a stable base to work from :)

--
All Rights Reversed

From: Nitin G. <nit...@gm...> - 2006-08-06 21:01:24

The patch is against 2.6.18-rc3-git7, which is (I think) the same as
2.6.18-rc4. So, please apply it against -rc4, not -rc3.

(A little) testing was done on x86 _only_ (RAM=128M, max_anon_cc_size=1000
pages -- started KDE, launched apps like KMail, Firefox, KPaint, etc.)

On 8/7/06, Nitin Gupta <nit...@gm...> wrote:
> ------------
> Project: Compressed Caching for Linux
> Git (web): http://dev.laptop.org/git.do?p=projects/linux-mm-cc
> Git (git): git://dev.laptop.org/projects/linux-mm-cc
> ------------
>
> Hi!
>
> This is the weekly work summary [10/12], or, I thought, better to call it
> 'Announce' ;)
>
> Attached is the patch for compressed caching for anon pages (2.6.18-rc3) -
> 'alpha-002' :)
>
> Finally, the 'better' locking scheme in find_get_page() is working together
> with the compression structure. This equates to a 'nearly' complete
> compressed caching feature for anonymous pages. I haven't made any
> benchmarks yet, though!
>
> Many of the suggestions from the previous post for cache-alpha-1 have been
> taken into this patch. I will be glad if you can take time to look at this
> patch too for _any_ suggestions you have. There are also some known glitches
> which I couldn't solve yet.
>
> Known problems:
> -- Pages once decompressed are not put back on the LRU (active?). Doing so
>    is causing a system freeze and I couldn't get to the cause yet.
> -- LZO is still giving problems. The system freezes just after some pages
>    are compressed. The biggest problem is that it requires a mammoth 64K
>    buffer to compress a single page.
> -- How to 'migrate' incompressible pages to the swap disk? This is not an
>    issue for OLPC as they do not have a swap.
>
> So, here's what I aim to do by the end of SoC:
> (Riel, can you please suggest here?)
> * Either
>   Start work on adding support for page cache pages and a heuristic for
>   adaptive ccache resize.
> * Or
>   Work to stabilize this much for anonymous pages, code cleanups, testing
>   on the OLPC board, (maybe) document all this stuff.
>
> Cheers,
> Nitin

From: Nitin G. <nit...@gm...> - 2006-08-06 20:41:25

------------
Project: Compressed Caching for Linux
Git (web): http://dev.laptop.org/git.do?p=projects/linux-mm-cc
Git (git): git://dev.laptop.org/projects/linux-mm-cc
------------

Hi!

This is the weekly work summary [10/12], or, I thought, better to call it
'Announce' ;)

Attached is the patch for compressed caching for anon pages (2.6.18-rc3) -
'alpha-002' :)

Finally, the 'better' locking scheme in find_get_page() is working together
with the compression structure. This equates to a 'nearly' complete compressed
caching feature for anonymous pages. I haven't made any benchmarks yet,
though!

Many of the suggestions from the previous post for cache-alpha-1 have been
taken into this patch. I will be glad if you can take time to look at this
patch too for _any_ suggestions you have. There are also some known glitches
which I couldn't solve yet.

Known problems:
-- Pages once decompressed are not put back on the LRU (active?). Doing so is
   causing a system freeze and I couldn't get to the cause yet.
-- LZO is still giving problems. The system freezes just after some pages are
   compressed. The biggest problem is that it requires a mammoth 64K buffer to
   compress a single page.
-- How to 'migrate' incompressible pages to the swap disk? This is not an
   issue for OLPC as they do not have a swap.

So, here's what I aim to do by the end of SoC:
(Riel, can you please suggest here?)
* Either
  Start work on adding support for page cache pages and a heuristic for
  adaptive ccache resize.
* Or
  Work to stabilize this much for anonymous pages, code cleanups, testing
  on the OLPC board, (maybe) document all this stuff.

Cheers,
Nitin

From: Zach B. <za...@za...> - 2006-07-28 21:36:36

> Again __set_bit()/__clear_bit() take unsigned long while chunk->size is
> unsigned short. Can it cause problem if I cast this chunk->size to
> unsigned long and pass to them?

_Yes_. That's what I said in the first mail and why I brought it up to
begin with! Maybe architectures (like i386!) implement it with assembly
that will write to the entire long when setting a given bit.

- z

From: Nitin G. <nit...@gm...> - 2006-07-28 20:34:28

Hi All,

This is the patch for the compressed caching feature against 2.6.18-rc2 :)
(It's taken from the git branch 'simple_lock_structure')

* What it does
-- Compress anon pages (only anon, not page-cache pages, for now) cyclically
   using WKdm and WK4x4.
-- Store these pages in the compression structure as described on the Wiki.

Do not expect any performance improvements now... I found it to be just
stable enough after integrating the compression structure. In fact, it will
give a perf. heart attack to your system ;) because of the simplified locking
scheme it uses in lookup functions (I'm working on this). Please let me know
if it crashes on your system.

I will keep stabilizing it in the 'simple_lock_structure' branch while I'm
trying to get a better locking scheme working in the 'better_locking' branch.
Finally, I'll merge these two to get a fully functional ccache for anon pages.
(I hope to have support for page cache pages shortly after that.)

Cheers,
Nitin Gupta

From: Zach B. <za...@za...> - 2006-07-28 19:46:50

> I need to keep 'struct chunk' as small as possible. This 'test_bit'
> usage is just super blind copying of page-flags.h :)
>
> I think this should do (atomic guarantee not required):

Oh, if you don't need atomicity why not just use a bitfield?

	unsigned short free:1,
	               size:15;

or whatever. Though I think you'll find that most archs are padding it
out to be at least an int anyway to avoid unaligned accesses of
neighbouring members.

> I couldn't find equivalent error codes with meaning of CC_EEXPAND and
> CC_EUSED so defined them here. s/CC_ENOMEM/ENOMEM

So have it return > 0 for the expand case?

> chunk_head is given to page_cache_get() before we know it's 'struct
> page' or 'struct chunk_head' so need to keep name and its position same
> as for struct page.

Ugh, I missed that detail. That sounds very dangerous and I don't think
it will be allowed. You should find a better way.

> Why the 'struct page' has this '_' for count and mapcount?

Because they're not supposed to be used directly, only by helper functions.

> Yes its temporary and I'm not sure how to handle this failed alloc!
> If it fails, we lose the page since its copy is not stored in swap disk.

The swap path can't return an error at this point? I'd have to look into
the paths here to offer real help :/.

> Having trouble understanding what it is....Will read more on it.

Think of it as a compiler that only parses the code to emit warnings.

- z

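For reference, here is a minimal, self-contained userspace sketch of the trade-off discussed in this exchange: the hand-rolled mask macros versus the bitfield Zach suggests. The struct and constant names below are illustrative only and are not taken from the ccache patch.

#include <assert.h>
#include <stdio.h>

#define CHUNK_FREE 15                      /* the flag lives in the top bit */

struct chunk_mask {                        /* manual masking, as in the thread */
	unsigned short size;               /* bits 0-14: size, bit 15: free flag */
};

struct chunk_bitfield {                    /* compiler-managed bitfield */
	unsigned short free : 1,
	               size : 15;
};

int main(void)
{
	struct chunk_mask a = { .size = 1020 };
	struct chunk_bitfield b = { .free = 0, .size = 1020 };

	a.size |= 1 << CHUNK_FREE;         /* "SetChunkFree()" by hand */
	b.free = 1;

	/* both encodings agree on the stored size and the flag */
	assert((a.size & ~(1u << CHUNK_FREE)) == b.size);
	printf("masked: free=%d size=%u  bitfield: free=%d size=%d\n",
	       !!(a.size & (1 << CHUNK_FREE)),
	       a.size & ~(1u << CHUNK_FREE), (int)b.free, (int)b.size);
	printf("sizeof: %zu vs %zu (padding is arch/ABI dependent)\n",
	       sizeof(struct chunk_mask), sizeof(struct chunk_bitfield));
	return 0;
}

Neither form gives atomic updates; that is exactly the "atomic guarantee not required" assumption being made in the thread.
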
From: Arnd B. <arn...@de...> - 2006-07-28 19:41:37

On Thursday 27 July 2006 07:40, Nitin Gupta wrote:
> > > +#define ChunkFree(chunk) \
> > > +	test_bit(CHUNK_free, (unsigned long *)(&(chunk)->size))

Your code has lots of CamelCase or MIXED_case in it, don't do that,
even if you find other people doing the same.

The two styles that are generally accepted are ALL_CAPS and all_lower_case.

> > This is not safe. By handing the *_bit() interface an unsigned long
> > pointer to 'unsigned short size' you're risking corrupting the chunk
> > members that are near the size member. Make 'size' an unsigned long and
> > get rid of these casts.
>
> I need to keep 'struct chunk' as small as possible. This 'test_bit' usage is
> just super blind copying of page-flags.h :)
>
> I think this should do (atomic guarantee not required):
>
> #define ChunkFree(chunk) \
> 	(!!(chunk->size & (1 << CHUNK_FREE)))
> #define SetChunkFree(chunk) \
> 	(chunk->size |= (1 << CHUNK_FREE))
> #define ClearChunkFree(chunk) \
> 	(chunk->size &= ~(1 << CHUNK_FREE))

Or use __set_bit()/__clear_bit() to get the non-atomic versions of these.
test_bit() is always atomic and fast.

> > > +/* error codes */
> > > +#define CC_ENOMEM 1
> > > +/* compression algo actually increased size! */
> > > +#define CC_EEXPAND 2
> > > +/* someone accessed page when we
> > > + * were about to compress it! */
> > > +#define CC_EUSED 3
> >
> > Please don't introduce your own error code namespace. Typical practice
> > is to use -ERRNO to indicate errors, 0 for success, and +ve for
> > indicating work done. Using a convention like that will make your code
> > more digestible to other kernel maintainers.
>
> I couldn't find equivalent error codes with meaning of CC_EEXPAND and CC_EUSED
> so defined them here. s/CC_ENOMEM/ENOMEM

s/CC_EUSED/EBUSY/ ?

> > > +int anon_cc_started=0;
> > > +unsigned long max_anon_cc_size=0, max_fs_backed_cc_size=0;
> > > +unsigned long anon_cc_size=0, fs_backed_cc_size=0; /* current size */
> >
> > You don't need = 0 here, they'll be put in the BSS and zeroed if you
> > don't define an initial value.
>
> Ok...But how do you avoid compiler warnings? Though thats not an issue but
> they look ugly when compiling :)

What warning do you get from this?

> > > +int cc_writepage(struct page *page)
> > > +{
> > > +	int ret = -CC_ENOMEM;
> > > +	//swp_entry_t entry;
> > > +	int comp_size, algo_idx;
> > > +	struct chunk_head *ch=0;
> > > +	struct page *tmp_page=0, *new_page=0, **slot;
> >
> > Set pointers to NULL. Have you run sparse against this code? See
> > Documentation/sparse.txt in the kernel tree.
>
> Having trouble understanding what it is....Will read more on it.

Even better, don't initialize any local variables in the declaration.
Normally, you have an assignment before the first use anyway. Putting an
initialization to NULL (or 0) in the declaration will prevent gcc from
warning about broken usage.

One more thing about comments. The common style is to use either

/* single line comment */

or

/*
 * multi
 * line
 * comment
 */

but not

/* multi
 * line
 * comment */

For disabling code fragments, use

#if 0
	dont_call();
#endif

instead of

/* dont_call(); */

For the initial code, it also helps to run it through 'indent -kr -i8',
in order to get indentation right.

	Arnd <><

From: Andreas M. <an...@us...> - 2006-07-28 05:13:12

Hi,

On Thu, Jul 27, 2006 at 11:10:03AM +0530, Nitin Gupta wrote:
> > It's unfortunate that we came up with a name that conflicts with the
> > popular compiler cache tool. Can we come up with a unique name?
>
> No problem. __CCACHE_H, _CCACHE_H_, _COMP_CACHE_H_ ... :)

Better make that _COMPR_CACHE_H_, since it's not a "computer cache". ;)

I reviewed the remaining parts, but no comments from me.

Andreas Mohr

From: Nitin G. <nit...@gm...> - 2006-07-27 22:42:31

Arnd Bergmann wrote:
> On Thursday 27 July 2006 07:40, Nitin Gupta wrote:
>
>>>> +#define ChunkFree(chunk) \
>>>> +	test_bit(CHUNK_free, (unsigned long *)(&(chunk)->size))
>
> Your code has lots of CamelCase or MIXED_case in it, don't do that,
> even if you find other people doing the same.
>
> The two styles that are generally accepted are ALL_CAPS and all_lower_case.

Done: s/CHUNK_free/CHUNK_FREE and the like.

'Chunk' operations are used at many places where they are done on 'struct
page' too, which uses CamelCase. So, it may look uglier to have ALL_CAPS for
chunk ops. Like:

	if (TestPageLocked(page))
		SET_CHUNK_LOCKED(page);

instead of...

	if (TestPageLocked(page))
		SetChunkLocked(page);

>>> This is not safe. By handing the *_bit() interface an unsigned long
>>> pointer to 'unsigned short size' you're risking corrupting the chunk
>>> members that are near the size member. Make 'size' an unsigned long and
>>> get rid of these casts.
>>
>> I need to keep 'struct chunk' as small as possible. This 'test_bit' usage is
>> just super blind copying of page-flags.h :)
>>
>> I think this should do (atomic guarantee not required):
>>
>> #define ChunkFree(chunk) \
>> 	(!!(chunk->size & (1 << CHUNK_FREE)))
>> #define SetChunkFree(chunk) \
>> 	(chunk->size |= (1 << CHUNK_FREE))
>> #define ClearChunkFree(chunk) \
>> 	(chunk->size &= ~(1 << CHUNK_FREE))
>
> Or use __set_bit()/__clear_bit() to get the non-atomic versions of these.
> test_bit() is always atomic and fast.

Again, __set_bit()/__clear_bit() take unsigned long while chunk->size is
unsigned short. Can it cause a problem if I cast this chunk->size to unsigned
long and pass it to them? I think they just require the start address to
count bits from, and hence it shouldn't hurt.

>>>> +/* error codes */
>>>> +#define CC_ENOMEM 1
>>>> +/* compression algo actually increased size! */
>>>> +#define CC_EEXPAND 2
>>>> +/* someone accessed page when we
>>>> + * were about to compress it! */
>>>> +#define CC_EUSED 3
>>>
>>> Please don't introduce your own error code namespace. Typical practice
>>> is to use -ERRNO to indicate errors, 0 for success, and +ve for
>>> indicating work done. Using a convention like that will make your code
>>> more digestible to other kernel maintainers.
>>
>> I couldn't find equivalent error codes with meaning of CC_EEXPAND and CC_EUSED
>> so defined them here. s/CC_ENOMEM/ENOMEM
>
> s/CC_EUSED/EBUSY/ ?

Looks fine. I'll get some replacement for CC_EEXPAND too and get rid of
these :)

>>>> +int anon_cc_started=0;
>>>> +unsigned long max_anon_cc_size=0, max_fs_backed_cc_size=0;
>>>> +unsigned long anon_cc_size=0, fs_backed_cc_size=0; /* current size */
>>>
>>> You don't need = 0 here, they'll be put in the BSS and zeroed if you
>>> don't define an initial value.
>>
>> Ok...But how do you avoid compiler warnings? Though thats not an issue but
>> they look ugly when compiling :)
>
> What warning do you get from this?

None. These are global variables; I didn't see that carefully. I get a
'variable can be used uninitialized' warning for local vars (when they indeed
can be used uninitialized).

>>>> +int cc_writepage(struct page *page)
>>>> +{
>>>> +	int ret = -CC_ENOMEM;
>>>> +	//swp_entry_t entry;
>>>> +	int comp_size, algo_idx;
>>>> +	struct chunk_head *ch=0;
>>>> +	struct page *tmp_page=0, *new_page=0, **slot;
>>>
>>> Set pointers to NULL. Have you run sparse against this code? See
>>> Documentation/sparse.txt in the kernel tree.
>>
>> Having trouble understanding what it is.... Will read more on it.
>
> Even better, don't initialize any local variables in the declaration.
> Normally, you have an assignment before the first use anyway.

Almost every time, the case is this:

some_func()
{
	void *a, *b;

	a = alloc()
	if (!a) goto out:
	b = alloc();
	...
out:
	if (a) free(a)
	if (b) free(b)
	...
}

Here you (correctly) get a 'possible uninitialized usage for b' warning.
That's why I initialize those local variables.

> One more thing about comments. The common style is to
> use either
>
> /* single line comment */
>
> or
>
> /*
>  * multi
>  * line
>  * comment
>  */
>
> but not
>
> /* multi
>  * line
>  * comment */
>
> For disabling code fragments, use
>
> #if 0
> 	dont_call();
> #endif
>
> instead of
>
> /* dont_call(); */

/*
 * Ok, then
 * I'll do this :)
 */

> For the initial code, it also helps to run it through 'indent -kr -i8',
> in order to get indentation right.
>
> Arnd <><

Regards,
Nitin

From: Nitin G. <nit...@gm...> - 2006-07-27 19:49:48

Andreas Mohr wrote:
> I reviewed remaining parts, but no comments from me.

Is that because the code seemed all right to you, or was it just an
unthinkable mess? ;)

-- Nitin

From: Nitin G. <nit...@gm...> - 2006-07-27 19:46:19

Zach Brown wrote:
>> I need to keep 'struct chunk' as small as possible. This 'test_bit'
>> usage is just super blind copying of page-flags.h :)
>>
>> I think this should do (atomic guarantee not required):
>
> Oh, if you don't need atomicity why not just use a bitfield?
>
> 	unsigned short free:1,
> 	               size:15;
>
> or whatever. Though I think you'll find that most archs are padding it
> out to be at least an int anyway to avoid unaligned accesses of
> neighbouring members.

Never used a 'bitfield' before. I'll surely look into this.

>> I couldn't find equivalent error codes with meaning of CC_EEXPAND and
>> CC_EUSED so defined them here. s/CC_ENOMEM/ENOMEM
>
> So have it return > 0 for the expand case?

I think it's better to have a consistent interface: -ve is always an error;
+ve or 0 is success.

>> chunk_head is given to page_cache_get() before we know it's 'struct
>> page' or 'struct chunk_head' so need to keep name and its position same
>> as for struct page.
>
> Ugh, I missed that detail. That sounds very dangerous and I don't think
> it will be allowed. You should find a better way.

Why do you think it's 'dangerous'? In very early patches I used a 'struct
page' in place of 'struct chunk_head'. This would have been 'safe', but that
isn't good as sizeof(struct page) is just too big -- the metadata size
overhead will explode.

Regards,
Nitin

From: Nitin G. <nit...@gm...> - 2006-07-27 19:35:13

Zach Brown wrote:
>> Again __set_bit()/__clear_bit() take unsigned long while chunk->size is
>> unsigned short. Can it cause problem if I cast this chunk->size to
>> unsigned long and pass to them?
>
> _Yes_. That's what I said in the first mail and why I brought it up to
> begin with! Maybe architectures (like i386!) implement it with assembly
> that will write to the entire long when setting a given bit.
>
> - z

Ok, then I will go with this (as posted earlier):

#define ChunkFree(chunk) \
	(!!(chunk->size & (1 << CHUNK_FREE)))
#define SetChunkFree(chunk) \
	(chunk->size |= (1 << CHUNK_FREE))
#define ClearChunkFree(chunk) \
	(chunk->size &= ~(1 << CHUNK_FREE))

-- Nitin

From: Arnd B. <ar...@ar...> - 2006-07-27 11:53:54

On Thursday 27 July 2006 09:29, Nitin Gupta wrote:
> some_func()
> {
>     void *a, *b;
>     a = alloc()
>     if (!a) goto out:
>     b = alloc();
>     ...
> out:
>     if (a) free(a)
>     if (b) free(b)
>     ...
> }
>
> Here you (correctly) get 'possible uninitialized usage for b' warning.
> That's why I initialize those local variables.

Well, I'd prefer to write this as

some_func()
{
	void *a, *b;
	int ret;

	ret = -ENOMEM;
	a = alloc();
	if (!a)
		goto out_a;

	b = alloc();
	if (!b)
		goto out_b;

	...
	ret = 0;

	free(b);
out_b:
	free(a);
out_a:
	return ret;
}

Note:
- statement after if() on following line
- no variables initialized on declaration (ret could be)
- only clean up what you created.
- kfree(NULL); is ok

	Arnd <><

From: Nitin G. <nit...@gm...> - 2006-07-27 06:33:26

On 7/26/06, Zach Brown <za...@za...> wrote:
> Nitin Gupta wrote:

First, thanks for taking the time to review this patch! :)

> > This is the patch for compressed caching feature against 2.6.18-rc2 :)
>
> Has anyone reviewed this with you yet?

No. Not yet :/

> > -EXTRAVERSION = -rc2
> > +EXTRAVERSION = -rc2-cc
>
> You can avoid polluting the diff with this naming change by putting
> '-cc' in ./localversion

Ok.

> > +++ b/include/linux/ccache.h
> > @@ -0,0 +1,97 @@
> > +#ifndef _CCACHE_H
> > +#define _CCACHE_H
>
> It's unfortunate that we came up with a name that conflicts with the
> popular compiler cache tool. Can we come up with a unique name?

No problem. __CCACHE_H, _CCACHE_H_, _COMP_CACHE_H_ ... :)

> > +#define ChunkFree(chunk) \
> > +	test_bit(CHUNK_free, (unsigned long *)(&(chunk)->size))
>
> This is not safe. By handing the *_bit() interface an unsigned long
> pointer to 'unsigned short size' you're risking corrupting the chunk
> members that are near the size member. Make 'size' an unsigned long and
> get rid of these casts.

I need to keep 'struct chunk' as small as possible. This 'test_bit' usage is
just super blind copying of page-flags.h :)

I think this should do (atomic guarantee not required):

#define ChunkFree(chunk) \
	(!!(chunk->size & (1 << CHUNK_FREE)))
#define SetChunkFree(chunk) \
	(chunk->size |= (1 << CHUNK_FREE))
#define ClearChunkFree(chunk) \
	(chunk->size &= ~(1 << CHUNK_FREE))

> > +/* error codes */
> > +#define CC_ENOMEM 1
> > +/* compression algo actually increased size! */
> > +#define CC_EEXPAND 2
> > +/* someone accessed page when we
> > + * were about to compress it! */
> > +#define CC_EUSED 3
>
> Please don't introduce your own error code namespace. Typical practice
> is to use -ERRNO to indicate errors, 0 for success, and +ve for
> indicating work done. Using a convention like that will make your code
> more digestible to other kernel maintainers.

I couldn't find equivalent error codes with the meaning of CC_EEXPAND and
CC_EUSED, so I defined them here. s/CC_ENOMEM/ENOMEM

> > +	atomic_t _count;	/* usage count; free this struct
> > +				 * when count is 0 */
>
> Lose the '_'.

chunk_head is given to page_cache_get() before we know whether it's a 'struct
page' or a 'struct chunk_head', so I need to keep the name and its position
the same as for struct page. Why does 'struct page' have this '_' for count
and mapcount?

> Heck, maybe just remove it until it is actually used :).

I'm _now_ using it when improving locking in find_get_page() and friends.

> > -	},
> > -	{
> > +	},
> > +	{
>
> Don't mix changes like this in with the rest of the patch. They just
> add background noise.

Ok :)

> > +int anon_cc_started=0;
> > +unsigned long max_anon_cc_size=0, max_fs_backed_cc_size=0;
> > +unsigned long anon_cc_size=0, fs_backed_cc_size=0; /* current size */
>
> You don't need = 0 here, they'll be put in the BSS and zeroed if you
> don't define an initial value.

Ok... But how do you avoid compiler warnings? Though that's not an issue,
they look ugly when compiling :)

> > +const unsigned long ccache_size_limit = MAX_SWAP_OFFSET - 1;
> > +
> > +static struct list_head pages_head, mcl_head, lru_head;
>
> If you defined these with LIST_HEAD() then they would already be
> initialized and you wouldn't need INIT_LIST_HEAD() in init_ccache().

I'll do this.

> > +static int WKdm_compress_wrapper (unsigned char *src, unsigned long src_len,
> > +			unsigned char *dest, unsigned long *dest_len)
> > +{
> > +	//pr_info("%s called.\n", __FUNCTION__);
> > +	*dest_len=WKdm_compress((WK_word*)src, (WK_word*)dest, 1024);
> > +	return 0;
>
> I guess propagating these return codes will be fixed up eventually.

You mean return *dest_len? I'll also remove the src_len arg since it's always
PAGE_SIZE.

> White space around '=', please:
>
> *dest_len = WK..

> > +	if (ret==LZO_E_OK) pr_info("LZO compression done!\n");
> > +	else pr_info("LZO compression failed!\n");
>
> Use newlines after if/else throughout the code, please:
>
> if (foo)
> 	doit;
> else
> 	something;

Ok :)

> > +repeat:
> > +	spin_lock(&ccache_lock);
> > +	while (free_head) {
> > +		//pr_info("in get_enough_chunks loop: %d\n", tst++);
> > +		chunk = free_head;
> > +		free_head = chunk->next;
> > +		ClearChunkFree(chunk);
> > +		spin_unlock(&ccache_lock);
> > +		if (ChunkMerged(chunk)) {
> > +			pr_info("merged chunk found: size=%d\n",
> > +				ChunkSize(chunk));
> > +			kfree(chunk);
> > +			continue;
>
> The ccache_lock needs to be reacquired before continuing so that it
> doesn't double-unlock. You should perform your tests with lockdep
> enabled in the kernel config :).

Great catch! I'll learn about lockdep :)

> > +int init_ccache(void)
> > +{
> > +	spin_lock(&ccache_lock);
> > +	INIT_LIST_HEAD(&pages_head);
> > +	INIT_LIST_HEAD(&lru_head);
> > +	INIT_LIST_HEAD(&mcl_head);
> > +	free_head = NULL;
> > +	spin_unlock(&ccache_lock);
>
> Hmmm. What are these locks protecting? If they're racing with users of
> the lists then they're racing with users who will see uninitialized
> lists :/.

I'll define these with LIST_HEAD() as you said. These were protected since
init_ccache() was called _after_ vswap was activated. But now it's called
before vswap init, so these locks were not required anyway :)

> > +	/* collect compressed page from chunks */
> > +	comp_page = alloc_page(GFP_ATOMIC);
> > +	if (!comp_page) {
> > +		pr_info("cc_readpage: comp_page alloc failed!!!\n");
> > +		BUG();
> > +	}
>
> GFP_ATOMIC allocations are expected to fail under load. I assume this
> is just a temporary BUG() :).

Yes, it's temporary and I'm not sure how to handle this failed alloc! If it
fails, we lose the page since its copy is not stored on the swap disk. Also,
lookup functions (I think) cannot sleep, so GFP_KERNEL (WAIT | IO | FS) can't
be used either. Which flags should be used here for the alloc?

> > +int cc_writepage(struct page *page)
> > +{
> > +	int ret = -CC_ENOMEM;
> > +	//swp_entry_t entry;
> > +	int comp_size, algo_idx;
> > +	struct chunk_head *ch=0;
> > +	struct page *tmp_page=0, *new_page=0, **slot;
>
> Set pointers to NULL. Have you run sparse against this code? See
> Documentation/sparse.txt in the kernel tree.

Having trouble understanding what it is.... Will read more on it.

> > +	struct address_space *mapping=page_mapping(page);
> > +
> > +	//ch = kmalloc(sizeof(struct chunk_head), GFP_KERNEL);
> > +	//if (!ch) goto out;
> > +	tmp_page = alloc_pages(GFP_KERNEL, 1); /* page may expand */
>
> Do you mean to allocate two pages here?

Yes. Since the compressed data might be bigger than the original data, we
need to pass it extra space (2 pages) when compressing a single page. (If the
compressed size >= PAGE_SIZE, the page is just kept in its natural form or
sent to swap.)

> You don't seem to use the
> second one and only free the first.

Ah.. that's a mistake:

-	__free_page(tmp_page)
+	__free_pages(tmp_page, 1)

> OK, that's probably enough for now. Sorry I don't have time to dive
> into the real mechanics of the patch. I hope this helped.

Thanks again for reviewing the code. This was really helpful :)

Regards,
Nitin

From: Zach B. <za...@za...> - 2006-07-26 17:32:16

Nitin Gupta wrote:
> Hi All,
>
> This is the patch for compressed caching feature against 2.6.18-rc2 :)

Has anyone reviewed this with you yet?

I'll share the comments that come to mind while lightly reading the patch.
I won't go into too much depth.

> -EXTRAVERSION = -rc2
> +EXTRAVERSION = -rc2-cc

You can avoid polluting the diff with this naming change by putting
'-cc' in ./localversion

> +++ b/include/linux/ccache.h
> @@ -0,0 +1,97 @@
> +#ifndef _CCACHE_H
> +#define _CCACHE_H

It's unfortunate that we came up with a name that conflicts with the
popular compiler cache tool. Can we come up with a unique name?

> +/* chunk flags: only 3 MSBs available [13-15] */
> +#define CHUNK_free 15
> +#define CHUNK_merged 14

The general style is to use all caps for constants. The page-flags.h
constructs are a bit of an anomaly and shouldn't be duplicated :)

> +#define ChunkFree(chunk) \
> +	test_bit(CHUNK_free, (unsigned long *)(&(chunk)->size))

This is not safe. By handing the *_bit() interface an unsigned long
pointer to 'unsigned short size' you're risking corrupting the chunk
members that are near the size member. Make 'size' an unsigned long and
get rid of these casts.

Really, I'd just get rid of these wrappers entirely and use the native
*_bit() interface.

> +/* error codes */
> +#define CC_ENOMEM 1
> +/* compression algo actually increased size! */
> +#define CC_EEXPAND 2
> +/* someone accessed page when we
> + * were about to compress it! */
> +#define CC_EUSED 3

Please don't introduce your own error code namespace. Typical practice
is to use -ERRNO to indicate errors, 0 for success, and +ve for
indicating work done. Using a convention like that will make your code
more digestible to other kernel maintainers.

> +	atomic_t _count;	/* usage count; free this struct
> +				 * when count is 0 */

Lose the '_'. Heck, maybe just remove it until it is actually used :).

> -	},
> -	{
> +	},
> +	{

Don't mix changes like this in with the rest of the patch. They just
add background noise.

> +int anon_cc_started=0;
> +unsigned long max_anon_cc_size=0, max_fs_backed_cc_size=0;
> +unsigned long anon_cc_size=0, fs_backed_cc_size=0; /* current size */

You don't need = 0 here, they'll be put in the BSS and zeroed if you
don't define an initial value.

> +const unsigned long ccache_size_limit = MAX_SWAP_OFFSET - 1;
> +
> +static struct list_head pages_head, mcl_head, lru_head;

If you defined these with LIST_HEAD() then they would already be
initialized and you wouldn't need INIT_LIST_HEAD() in init_ccache().

> +	//pr_info("==== flags1=0x%08lx\n", ch->flags);

If you used pr_debug() instead you could at least enable and disable the
output on a per-file basis by defining DEBUG. That might save you from
having to comment them in and out all the time.

> +static int WKdm_compress_wrapper (unsigned char *src, unsigned long src_len,
> +			unsigned char *dest, unsigned long *dest_len)
> +{
> +	//pr_info("%s called.\n", __FUNCTION__);
> +	*dest_len=WKdm_compress((WK_word*)src, (WK_word*)dest, 1024);
> +	return 0;

I guess propagating these return codes will be fixed up eventually.

White space around '=', please:

	*dest_len = WK..

> +	if (ret==LZO_E_OK) pr_info("LZO compression done!\n");
> +	else pr_info("LZO compression failed!\n");

Use newlines after if/else throughout the code, please:

	if (foo)
		doit;
	else
		something;

> +repeat:
> +	spin_lock(&ccache_lock);
> +	while (free_head) {
> +		//pr_info("in get_enough_chunks loop: %d\n", tst++);
> +		chunk = free_head;
> +		free_head = chunk->next;
> +		ClearChunkFree(chunk);
> +		spin_unlock(&ccache_lock);
> +		if (ChunkMerged(chunk)) {
> +			pr_info("merged chunk found: size=%d\n",
> +				ChunkSize(chunk));
> +			kfree(chunk);
> +			continue;

The ccache_lock needs to be reacquired before continuing so that it
doesn't double-unlock. You should perform your tests with lockdep
enabled in the kernel config :).

> +	if (ret) goto out;
> +	goto repeat;
> +out:

This sort of thing is a warning sign that the loop is getting too weird.
Hopefully some more natural constructs can be found to simplify things.

> +int init_ccache(void)
> +{
> +	spin_lock(&ccache_lock);
> +	INIT_LIST_HEAD(&pages_head);
> +	INIT_LIST_HEAD(&lru_head);
> +	INIT_LIST_HEAD(&mcl_head);
> +	free_head = NULL;
> +	spin_unlock(&ccache_lock);

Hmmm. What are these locks protecting? If they're racing with users of
the lists then they're racing with users who will see uninitialized
lists :/.

> +struct page* cc_readpage(struct chunk_head *ch)
> +{
> +	int ret = -CC_ENOMEM, algo_idx;
> +	unsigned int comp_size=0;
> +	struct page *decomp_page, *comp_page, *tmp_page;
> +	void *comp_data;
> +	struct chunk *chunk, *tmp;
> +	unsigned long flags;

'unsigned long flags' on the stack in the kernel usually means that
'flags' is going to be an argument to locking constructs that save irq
state flags. This particular 'flags' only seems to be a temporary to
store a trivial mask so you might as well remove it and avoid surprising
people.

> +	/* collect compressed page from chunks */
> +	comp_page = alloc_page(GFP_ATOMIC);
> +	if (!comp_page) {
> +		pr_info("cc_readpage: comp_page alloc failed!!!\n");
> +		BUG();
> +	}

GFP_ATOMIC allocations are expected to fail under load. I assume this
is just a temporary BUG() :).

> +	spin_lock(&ccache_lock);
> +	list_del(&ch->lru);
> +	spin_unlock(&ccache_lock);

You almost always want to use list_del_init() so that you don't leave
stale pointers in the list_head that is being removed.

> +int cc_writepage(struct page *page)
> +{
> +	int ret = -CC_ENOMEM;
> +	//swp_entry_t entry;
> +	int comp_size, algo_idx;
> +	struct chunk_head *ch=0;
> +	struct page *tmp_page=0, *new_page=0, **slot;

Set pointers to NULL. Have you run sparse against this code? See
Documentation/sparse.txt in the kernel tree.

> +	struct address_space *mapping=page_mapping(page);
> +
> +	//ch = kmalloc(sizeof(struct chunk_head), GFP_KERNEL);
> +	//if (!ch) goto out;
> +	tmp_page = alloc_pages(GFP_KERNEL, 1); /* page may expand */

Do you mean to allocate two pages here? You don't seem to use the
second one and only free the first.

OK, that's probably enough for now. Sorry I don't have time to dive
into the real mechanics of the patch. I hope this helped.

- z

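Two of the review points above, the LIST_HEAD() static initializer and list_del_init(), are easy to illustrate. The snippet below is a hedged sketch against the kernel list API of that era; the structure and list names are invented for the example (they are not the ccache patch's own), and locking is omitted for brevity.

#include <linux/list.h>

/* Statically initialized: no INIT_LIST_HEAD() call needed in init code. */
static LIST_HEAD(demo_lru_head);

struct demo_chunk_head {
	struct list_head lru;
	/* ... other fields ... */
};

static void demo_add_to_lru(struct demo_chunk_head *ch)
{
	list_add_tail(&ch->lru, &demo_lru_head);
}

static void demo_drop_from_lru(struct demo_chunk_head *ch)
{
	/*
	 * list_del_init() re-points ch->lru at itself instead of leaving it
	 * aimed at stale neighbours, so a later list_empty(&ch->lru) check
	 * (or an accidental second removal) does not touch freed memory.
	 */
	list_del_init(&ch->lru);
}
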
From: Nitin G. <nit...@gm...> - 2006-07-07 13:10:31

Hi All,

Just finished the implementation of the compression structure (as described
on the Wiki). Following is a short description, demo, notes, and TODO.
(The code is at CompressedCaching/Code.)

-- The storage begins as a single page. As you add pages to it, it expands
   till it reaches its max limit. As you take out pages, it shrinks, freeing
   up pages when it has no chunks left.
-- Adjacent free chunks are merged together.
-- Each page can be compressed using a different algo.

Again, the interface is via proc:
/proc/storage-test/{readpage, writepage, show_structure}
- writepage: write a page to this to compress and store it in ccache.
- readpage: write the 'id' (described later) of the page you want.
- show_structure: read to show the current snapshot of ccache storage.

Here's a short demo:

[storage-test]# insmod storage-test.ko
[storage-test]# cat /proc/storage-test/show_structure
Page: 1
 Chunk: 1, Size: 4096, Free

* begin as a single page (one chunk of size PAGE_SIZE)

* Now add a page to it:

[storage-test]# dd if=/lib/libpthread.so.0 of=/proc/storage-test/writepage bs=4096 count=1
[storage-test]# cat /proc/storage-test/show_structure
Page: 1
 Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
 Chunk: 2, Size: 3076, Free

* 'break' the initial chunk and store the compressed page. The page is given
  'id'=0 and compressed using WKdm.

* Now, fill it with more data:

[storage-test]# a=1
[storage-test]# while [ "$a" -le 15 ]; do dd if=/lib/libpthread.so.0 of=/proc/storage-test/writepage bs=4096 count=1; let "a+=1"; done
[storage-test]# cat /proc/storage-test/show_structure
Page: 1
 Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
 Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
 Chunk: 3, Size: 1566, Busy, [id: 2: LZO]
 Chunk: 4, Size: 538, Busy, [id: 3: WKdm]
Page: 2
 Chunk: 5, Size: 482, Busy, [id: 3: WKdm]
 Chunk: 6, Size: 972, Busy, [id: 4: WK4x4]
 Chunk: 7, Size: 1566, Busy, [id: 5: LZO]
 Chunk: 8, Size: 1020, Busy, [id: 6: WKdm]
 Chunk: 9, Size: 56, Busy, [id: 7: WK4x4]
Page: 3
 Chunk: 10, Size: 916, Busy, [id: 7: WK4x4]
 Chunk: 11, Size: 1566, Busy, [id: 8: LZO]
 Chunk: 12, Size: 1020, Busy, [id: 9: WKdm]
 Chunk: 13, Size: 594, Busy, [id: 10: WK4x4]
Page: 4
 Chunk: 14, Size: 378, Busy, [id: 10: WK4x4]
 Chunk: 15, Size: 3718, Free

* ccache expands to add more compressed pages.
* algos are cyclically selected for now -- code to guess which algo should be
  used to compress a page goes in guess_algo() [ccache.c]
* a compressed page can take more than one chunk. Chunks with the same 'id'
  belong to the same compressed page.
* It currently contains 11 compressed pages. It didn't store all 15 extra
  pages as requested, since the max ccache size is set to 4 (uncompressed)
  pages. You can change the max size by changing MAX_CC_SIZE, #defined in
  ccache.h -- in the real implementation there will be a sysctl for this.

* Now, take out some pages ('id'=3):

[storage-test]# echo "3" > /proc/storage-test/readpage
[storage-test]# cat /proc/storage-test/show_structure
Page: 1
 Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
 Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
 Chunk: 3, Size: 1566, Busy, [id: 2: LZO]
 Chunk: 4, Size: 538, Free <---------
Page: 2
 Chunk: 5, Size: 482, Free <---------
 Chunk: 6, Size: 972, Busy, [id: 4: WK4x4]
 Chunk: 7, Size: 1566, Busy, [id: 5: LZO]
 Chunk: 8, Size: 1020, Busy, [id: 6: WKdm]
 Chunk: 9, Size: 56, Busy, [id: 7: WK4x4]
Page: 3
 Chunk: 10, Size: 916, Busy, [id: 7: WK4x4]
 Chunk: 11, Size: 1566, Busy, [id: 8: LZO]
 Chunk: 12, Size: 1020, Busy, [id: 9: WKdm]
 Chunk: 13, Size: 594, Busy, [id: 10: WK4x4]
Page: 4
 Chunk: 14, Size: 378, Busy, [id: 10: WK4x4]
 Chunk: 15, Size: 3718, Free

* these free chunks are not merged since they are in different pages. Chunks
  never cross a page boundary.

* Now, take out another page:

[storage-test]# echo "2" > /proc/storage-test/readpage
[storage-test]# cat /proc/storage-test/show_structure
Page: 1
 Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
 Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
 Chunk: 3, Size: 2104, Free <---------
Page: 2
 Chunk: 4, Size: 482, Free
 Chunk: 6, Size: 972, Busy, [id: 4: WK4x4]
 Chunk: 7, Size: 1566, Busy, [id: 5: LZO]
 Chunk: 8, Size: 1020, Busy, [id: 6: WKdm]
 Chunk: 9, Size: 56, Busy, [id: 7: WK4x4]
Page: 3
 Chunk: 10, Size: 916, Busy, [id: 7: WK4x4]
 Chunk: 11, Size: 1566, Busy, [id: 8: LZO]
 Chunk: 12, Size: 1020, Busy, [id: 9: WKdm]
 Chunk: 13, Size: 594, Busy, [id: 10: WK4x4]
Page: 4
 Chunk: 14, Size: 378, Busy, [id: 10: WK4x4]
 Chunk: 15, Size: 3718, Free

* adjacent chunks in page 1 are merged: now a single free chunk of size 2104B.
* If you free page 'id'=1 now, then chunks 1 & 3 can't be merged -- chunk 2 is
  in between.

* Now free another chunk:

[storage-test]# echo "10" > /proc/storage-test/readpage
[storage-test]# cat /proc/storage-test/show_structure
Page: 1
 Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
 Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
 Chunk: 3, Size: 2104, Free
Page: 2
 Chunk: 4, Size: 482, Free
 Chunk: 5, Size: 972, Busy, [id: 4: WK4x4]
 Chunk: 6, Size: 1566, Busy, [id: 5: LZO]
 Chunk: 7, Size: 1020, Busy, [id: 6: WKdm]
 Chunk: 8, Size: 56, Busy, [id: 7: WK4x4]
Page: 3
 Chunk: 9, Size: 916, Busy, [id: 7: WK4x4]
 Chunk: 10, Size: 1566, Busy, [id: 8: LZO]
 Chunk: 11, Size: 1020, Busy, [id: 9: WKdm]
 Chunk: 12, Size: 594, Free

* Page 'id'=10 was read, and chunks 13, 14 were freed. Chunks 14, 15 are now
  adjacent and in the same page, so they were merged. This gave a chunk that
  spans the entire page, so the page was freed (along with related data
  structs).

* NOTES
-- Incompressible pages (compressed size >= PAGE_SIZE) are never stored.
-- No locking has been done; will do it when integrating with the main
   ccaching code.
-- Shrinking not done, although pages are freed as chunks are taken out.
   'Voluntary' shrinking has problems. It requires (a mix of) two things:
   1. To free a page, move all chunks it has to other pages and free it.
   2. Move pages to the swap disk (as compressed, or decompress then swap?)
   I've done them on paper but not in code. For (2) different things are done
   for anon, clean page-cache and dirty page-cache pages. It's a bit of a
   mess but seems feasible :)
-- struct chunk's 'master chunk list' should be singly linked. Again, I have
   this on paper but the code uses a doubly linked list. Fixing it will save
   4 bytes (on 32-bit systems) per chunk! (and we can have many chunks per
   compressed page).

* TODO
-- Where are the stats?? Show stats, in particular:
   1. total space taken by metadata
   2. avg. no. of chunks per chunk_head
   3. time to compress, decompress, read/write a compressed page from/to
      ccache.
-- Implement shrinking (part 1).
-- Test it...crash it...fix it...crash..fix... :)

Cheers,
Nitin Gupta

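To make the merge rule from the demo concrete (adjacent free chunks coalesce, but only within a page), here is a small userspace model. The array representation and field names are assumptions made purely for illustration; the real module keeps chunks on linked lists.

#include <stdio.h>

#define TOY_PAGE_SIZE 4096

struct toy_chunk {
	unsigned short size;
	unsigned char  free;
};

/* Coalesce runs of adjacent free chunks in place; returns the new count. */
static int merge_free_chunks(struct toy_chunk *c, int n)
{
	int src, dst = 0;

	for (src = 0; src < n; src++) {
		if (dst > 0 && c[dst - 1].free && c[src].free)
			c[dst - 1].size += c[src].size;   /* absorb neighbour */
		else
			c[dst++] = c[src];
	}
	return dst;
}

int main(void)
{
	/* Page 1 from the demo, just after pages 'id'=3 and 'id'=2 were read out. */
	struct toy_chunk page1[] = {
		{ 1020, 0 }, { 972, 0 }, { 1566, 1 }, { 538, 1 },
	};
	int i, n = merge_free_chunks(page1, 4);

	for (i = 0; i < n; i++)
		printf("Chunk: %d, Size: %d, %s\n", i + 1,
		       page1[i].size, page1[i].free ? "Free" : "Busy");

	/* One free chunk of TOY_PAGE_SIZE would mean the page itself can go. */
	if (n == 1 && page1[0].free && page1[0].size == TOY_PAGE_SIZE)
		printf("whole page is free\n");
	return 0;
}

Running this prints the 1020 and 972 byte busy chunks unchanged and a single merged 2104-byte free chunk, matching the demo output above.
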
From: <ac...@sa...> - 2006-07-02 08:38:56

Nitin Gupta wrote:
> Hi Sannes,
>
> Asbjørn Sannes wrote:
>> Nitin Gupta wrote:
>>>
>>>>> On 6/28/06, Anderson Briglia <bri...@gm...> wrote:
>>>>>>>> I have a doubt related to chunk operations:
>>>>>>>>
>>>>>>>> - I guess you will use the virtual swap as the compressed cache for
>>>>>>>>   anon pages, right? If yes, how will you swapin/swapout chunks? If a
>>>>>>>>   single chunk points to a portion of a physical page, and the
>>>>>>>>   swapin/swapout operations handle entire pages of data.
>>>>>>>
>>>>>>> Yes, vswap is for anon pages. I think you are mistaken somewhere...
>>>>>>> I'm not using bio structs or anything to swapin/out pages from vswap.
>>>>>>
>>>>>> Ok. Will be possible to transfer pages from vswap to a 'real' swap area?
>>>>>
>>>>> Yes, but the 'procedure' will be the same as for transferring a page
>>>>> from a swap to any other swap: change its swp_entry_t to have the 'type'
>>>>> for the other swap and it will get written to this second swap.
>>>>
>>>> What will be the factor that determines when/which page will be
>>>> migrated from vswap to real swap?
>>>
>>> vswap is full. Maybe others but I haven't thought of them yet :P
>>
>> What should be considered here is how do we actually "migrate" a page
>> like this? Because this isn't done very often normally (how often do you
>> add/deactivate swap areas) it is my understanding that this is an
>> expensive process, which is why we can't really do it. A proposal is to,
>> instead of having a "vswap" area, allocate other physical swap areas in
>> advance and do the interception of find_get_page & friends as is done in
>> the git tree now. This way, when we need to kick out pages it is a very
>> straightforward process: just put it in the pre-allocated swap slot. Of
>> course this wastes swap space (probably disk) and doesn't work with no
>> swap area at all, so maybe a mix where you can add vswap in some other
>> way for the no-swap-area case (where we never kick out pages anyways),
>> purely as a swap entry id exercise.
>
> The only expensive step in this "migration" from vswap to a real swap
> disk is scan_swap_map() (apart from the actual disk write, which has to
> be done anyway), which gives you a free swap slot. So, the expense is to
> run this scan_swap_map() twice -- once for vswap, then for the real swap
> when migrating pages.

I'm not sure how that is going to work, perhaps I'm missing the point.
scan_swap_map() is for allocating free slots of swap, which is fine. What
will be expensive is finding all those page table entries pointing to your
vswap page so they can be updated.

Keep in mind I'm talking about the case where you have a swap area already
and the heuristic wants to move a page from the compressed cache to the disk
(say, kicking out the least recently used page).

> This can be improved by the fact that the swap entry value does not
> determine (unlike for real swaps) where the page goes in ccache storage.
> So, all we need is a unique number every time we call scan_swap_map().
> For this, we can implement a custom scan_swap_map() for vswap (and let
> the usual one run for real swap). This can be as simple as a stack of
> free entries and will give us a unique swp_entry_t value instantly. This
> is already being done in that storage-test module.

Yes, this will take care of the case with no real swap (which was my weak
point).

> This will mitigate the cost of executing two scan_swap_map() calls -- and
> your worry ;)

My worry was try_to_unuse(), which visits each and every process, going
through their memory allocations.

Mvh,
Asbjørn Sannes

From: Nitin G. <nit...@gm...> - 2006-07-02 07:22:34

Hi Riel,

Rik van Riel wrote:
> On Sat, 1 Jul 2006, Asbjørn Sannes wrote:
>
>> What should be considered here is how do we actually "migrate" a page
>> like this? Because this isn't done very often normally (how often do you
>> add/deactivate swap areas) it is my understanding that this is an
>> expensive process, which is why we can't really do it. A proposal is to,
>> instead of having a "vswap" area,
>
> For systems that actually have a disk, it is probably best
> to implement compressed caching for swap cache pages.
>
> Vswap is more useful for systems without swap, like the
> OLPC laptops...

Vswap is there only to support ccaching for swap cache pages. It has nothing
to do with ccaching page cache pages. For page cache pages, page->index tells
us where to hook into the corresponding mapping tree, and for swap cache
pages, the swp_entry_t (obtained from scan_swap_map()) tells us where to hook
into the swapper space tree.

Vswap is a mechanism to maintain the 'symmetry' of the existing reclaim code
for anon pages. That way I can reuse the maximum of the existing work :)
I think the same thing can be used for systems with a disk (provided they
want ccaching). It's not just a workaround for cases where there is no swap
disk.

So, the ccaching work can be summarized as: pretend to write to disk (swap
for anon, fs disk for page cache) when swapping out, and pretend that the
page is in the page/swap cache when a lookup is performed. All the work is in
getting this 'asymmetry' working :)

Weekly work report to follow. :)

Cheers,
Nitin Gupta

From: Nitin G. <nit...@gm...> - 2006-07-02 06:52:43

Hi Sannes,

Asbjørn Sannes wrote:
> Nitin Gupta wrote:
>>
>>>> On 6/28/06, Anderson Briglia <bri...@gm...> wrote:
>>>>>>> I have a doubt related to chunk operations:
>>>>>>>
>>>>>>> - I guess you will use the virtual swap as the compressed cache for
>>>>>>>   anon pages, right? If yes, how will you swapin/swapout chunks? If a
>>>>>>>   single chunk points to a portion of a physical page, and the
>>>>>>>   swapin/swapout operations handle entire pages of data.
>>>>>>
>>>>>> Yes, vswap is for anon pages. I think you are mistaken somewhere...
>>>>>> I'm not using bio structs or anything to swapin/out pages from vswap.
>>>>>
>>>>> Ok. Will be possible to transfer pages from vswap to a 'real' swap area?
>>>>
>>>> Yes, but the 'procedure' will be the same as for transferring a page from
>>>> a swap to any other swap: change its swp_entry_t to have the 'type' for
>>>> the other swap and it will get written to this second swap.
>>>
>>> What will be the factor that determines when/which page will be
>>> migrated from vswap to real swap?
>>
>> vswap is full. Maybe others but I haven't thought of them yet :P
>
> What should be considered here is how do we actually "migrate" a page
> like this? Because this isn't done very often normally (how often do you
> add/deactivate swap areas) it is my understanding that this is an
> expensive process, which is why we can't really do it. A proposal is to,
> instead of having a "vswap" area, allocate other physical swap areas in
> advance and do the interception of find_get_page & friends as is done in
> the git tree now. This way, when we need to kick out pages it is a very
> straightforward process: just put it in the pre-allocated swap slot. Of
> course this wastes swap space (probably disk) and doesn't work with no
> swap area at all, so maybe a mix where you can add vswap in some other
> way for the no-swap-area case (where we never kick out pages anyways),
> purely as a swap entry id exercise.

The only expensive step in this "migration" from vswap to a real swap disk is
scan_swap_map() (apart from the actual disk write, which has to be done
anyway), which gives you a free swap slot. So, the expense is to run this
scan_swap_map() twice -- once for vswap, then for the real swap when
migrating pages.

This can be improved by the fact that the swap entry value does not determine
(unlike for real swaps) where the page goes in ccache storage. So, all we need
is a unique number every time we call scan_swap_map(). For this, we can
implement a custom scan_swap_map() for vswap (and let the usual one run for
real swap). This can be as simple as a stack of free entries and will give us
a unique swp_entry_t value instantly. This is already being done in that
storage-test module.

This will mitigate the cost of executing two scan_swap_map() calls -- and your
worry ;)

Cheers,
Nitin Gupta

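The "stack of free entries" idea mentioned above can be sketched in a few lines of userspace C. This is only an illustration of the O(1) slot-allocation scheme; the names, the table size and the reserved-offset-0 convention are assumptions made for the example, not code from the git tree.

#include <stdio.h>

#define VSWAP_SLOTS 1024

static unsigned long vswap_free_stack[VSWAP_SLOTS];
static int vswap_free_top;              /* number of free offsets on the stack */

static void vswap_init(void)
{
	unsigned long off;

	/* offset 0 is commonly kept reserved in swap maps, so hand out 1.. */
	for (off = VSWAP_SLOTS - 1; off >= 1; off--)
		vswap_free_stack[vswap_free_top++] = off;
}

/* Pop a unique offset in O(1); returns 0 when the vswap area is "full". */
static unsigned long vswap_alloc_offset(void)
{
	return vswap_free_top ? vswap_free_stack[--vswap_free_top] : 0;
}

static void vswap_free_offset(unsigned long off)
{
	vswap_free_stack[vswap_free_top++] = off;
}

int main(void)
{
	vswap_init();
	unsigned long a = vswap_alloc_offset();
	unsigned long b = vswap_alloc_offset();

	printf("got offsets %lu and %lu\n", a, b);
	vswap_free_offset(a);
	printf("offset %lu reused: %lu\n", a, vswap_alloc_offset());
	return 0;
}

Freeing pushes the offset back, so allocation and release are both constant time, unlike scanning a real swap map for a free slot.
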
From: Rik v. R. <ri...@re...> - 2006-07-02 01:45:37

On Sat, 1 Jul 2006, Asbjørn Sannes wrote:

> What should be considered here is how do we actually "migrate" a page
> like this? Because this isn't done very often normally (how often do you
> add/deactivate swap areas) it is my understanding that this is an
> expensive process, which is why we can't really do it. A proposal is to,
> instead of having a "vswap" area,

For systems that actually have a disk, it is probably best
to implement compressed caching for swap cache pages.

Vswap is more useful for systems without swap, like the
OLPC laptops...

Let's not confuse the two.

--
All Rights Reversed

From: <ac...@sa...> - 2006-07-01 20:12:08

Nitin Gupta wrote:
> Hi Anderson,
>
>>> On 6/28/06, Anderson Briglia <bri...@gm...> wrote:
>>>>>> I have a doubt related to chunk operations:
>>>>>>
>>>>>> - I guess you will use the virtual swap as the compressed cache for
>>>>>>   anon pages, right? If yes, how will you swapin/swapout chunks? If a
>>>>>>   single chunk points to a portion of a physical page, and the
>>>>>>   swapin/swapout operations handle entire pages of data.
>>>>>
>>>>> Yes, vswap is for anon pages. I think you are mistaken somewhere...
>>>>> I'm not using bio structs or anything to swapin/out pages from vswap.
>>>>
>>>> Ok. Will be possible to transfer pages from vswap to a 'real' swap area?
>>>
>>> Yes, but the 'procedure' will be the same as for transferring a page from
>>> a swap to any other swap: change its swp_entry_t to have the 'type' for
>>> the other swap and it will get written to this second swap.
>>
>> What will be the factor that determines when/which page will be
>> migrated from vswap to real swap?
>
> vswap is full. Maybe others but I haven't thought of them yet :P

What should be considered here is how do we actually "migrate" a page like
this? Because this isn't done very often normally (how often do you
add/deactivate swap areas) it is my understanding that this is an expensive
process, which is why we can't really do it. A proposal is to, instead of
having a "vswap" area, allocate other physical swap areas in advance and do
the interception of find_get_page & friends as is done in the git tree now.
This way, when we need to kick out pages it is a very straightforward
process: just put it in the pre-allocated swap slot. Of course this wastes
swap space (probably disk) and doesn't work with no swap area at all, so
maybe a mix where you can add vswap in some other way for the no-swap-area
case (where we never kick out pages anyways), purely as a swap entry id
exercise.

Greetings,
Asbjørn Sannes
