From: Konstantin S. <kon...@gm...> - 2008-01-18 08:28:28
>> Anyway, I'll try the 64-bit patch

I've tried both helgrinds (the original one with 32-bit SVals and the one
with 64-bit SVals).
The test program runs natively in about 4 seconds.

'top' output (sampled every 5 seconds; the largest sample is shown).

32-bit SVal:
  PID USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
23903 kcc  18  0 2250m 1.2g  73m S   95 31.5 4:08.74 helgrind

64-bit SVal:
  PID USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
27158 kcc  18  0 2374m 1.4g 115m S    4 35.6 3:54.36 helgrind

Native run (w/o helgrind):
  715 kcc  24  0 1044m  87m  72m S   10  2.3 0:03.38

So, the memory footprint is about 15% larger with 64-bit SVal.

Run times:
32-bit: 240-260 seconds.
64-bit: 230-260 seconds.

So (modulo noise) the performance does not seem to be affected much.
But this was a 64-bit binary running on a 64-bit machine (Core 2 Duo).

Profile (for both 64-bit and 32-bit):
184044 32.0981 cacheline_wback
124552 21.7224 cacheline_fetch
105773 18.4473 shadow_mem_set64
 55917  9.7522 avl_find_node
 10972  1.9136 vgPlain_ssort

This profile is typical of most other tests I've run so far.

Thanks!

--kcc

On Jan 16, 2008 3:14 PM, Konstantin Serebryany <kon...@gm...> wrote:

> Great!
> Do you plan to checkin these or similar changes in future?
> Maybe under #ifdef so that users do not suffer if they don't need too many
> [TL]SETs?
>
> While I think that something like 24 bits for [TL]SETs is a must for large
> apps, it might not be enough.
> We may need some garbage collection for [TL]SETs.
> Once in a while we could scan all memory, collect all used [TL]SETs and
> prune unused.
> We could also do garbage collection at thread_join and mutex_destroy (not
> every time, of course and only if number of [TL]SETs is getting close to the
> limit).
> Anyway, I'll try the 64-bit patch to see if garbage collection is really
> critical.
>
>
> --kcc
>
>
> On Jan 16, 2008 12:26 AM, Julian Seward < js...@ac...> wrote:
>
> >
> > > [...] change SVal to 64-bit at some point (I vote for it!),
> >
> > I already did that in Nov 07, but did not commit due to increase
> > in run time and space usage. The resulting patch is attached.
> > It allows up to 24 bits for lock- and thread-set IDs. I can't
> > remember now why I limited it to 24 bits -- could go up to about
> > 30 bits.
> >
> > Really 64-bit for a SVal is too much but 32-bit is not enough.
> > Something like 48 bits would be a good compromise, but there is
> > no sensible way to do that in C. Oh well.
> >
> > The patch also changes CacheLine to have a dirty bit in an attempt
> > to reduce the performance overhead from writing back cache lines.
> > That helps, although it also adds to the complexity and overhead
> > of verifying that the shadow memory system is functioning
> > correctly.
> >
> > J
> >
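
For readers following the thread: the dirty-bit idea mentioned in the quoted message can be illustrated with a minimal sketch. This is not the code from the attached patch, and every name below (CacheLineSketch, SValSketch, line_wback, line_fetch, shadow_read, shadow_write, the line and backing-store sizes) is hypothetical and chosen only for illustration. It assumes a single write-back cache line in front of a flat backing array, and shows only the general technique: a line is copied back to the backing store on eviction only if it was written since it was fetched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All sizes and names here are hypothetical, not Helgrind's. */
#define LINE_WORDS      8
#define N_BACKING_LINES 16

typedef uint64_t SValSketch;      /* stand-in for a 64-bit shadow value */

typedef struct {
    SValSketch words[LINE_WORDS]; /* cached copy of one backing line */
    size_t     tag;               /* index of the backing line held here */
    bool       valid;             /* false until the first fetch */
    bool       dirty;             /* set by writes, cleared by write-back */
} CacheLineSketch;

static SValSketch    backing[N_BACKING_LINES][LINE_WORDS];
static unsigned long n_wbacks = 0; /* counts write-backs actually performed */

/* Copy the line back to the backing store only if it was modified. */
static void line_wback(CacheLineSketch *cl)
{
    if (cl->valid && cl->dirty) {
        memcpy(backing[cl->tag], cl->words, sizeof cl->words);
        n_wbacks++;
    }
    cl->dirty = false;
}

/* Make the cache hold backing line `tag`, evicting the old one if needed. */
static void line_fetch(CacheLineSketch *cl, size_t tag)
{
    assert(tag < N_BACKING_LINES);
    if (cl->valid && cl->tag == tag)
        return;                    /* hit: nothing to do */
    line_wback(cl);                /* a clean line costs no copy here */
    memcpy(cl->words, backing[tag], sizeof cl->words);
    cl->tag   = tag;
    cl->valid = true;
    cl->dirty = false;
}

static SValSketch shadow_read(CacheLineSketch *cl, size_t tag, size_t i)
{
    assert(i < LINE_WORDS);
    line_fetch(cl, tag);
    return cl->words[i];
}

static void shadow_write(CacheLineSketch *cl, size_t tag, size_t i,
                         SValSketch v)
{
    assert(i < LINE_WORDS);
    line_fetch(cl, tag);
    cl->words[i] = v;
    cl->dirty = true;              /* line now differs from backing store */
}
```

In this sketch, a read-only pass over many lines performs no write-backs at all, while without the dirty flag every eviction would copy the line back. The cost Julian mentions also shows up here: any sanity check comparing the cache against the backing store has to take the dirty flag into account.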