From: Zoran V. <zv...@ar...> - 2006-02-04 16:20:54
On 04.02.2006 at 17:08, Vlad Seryakov wrote:

> That could be true on Solaris, but in Linux 2.6 mmap/munmap is very
> fast, and looking into the kernel source it tells you that they convert
> sbrk into mmap internally, but the difference is that mmap is
> multithread-aware while sbrk is not.

Solaris (1 CPU)

Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 3 seconds, 938700 usec
starting 16 ckalloc threads...waiting....done: 6 seconds, 62454 usec
starting 16 _malloc threads...waiting....done: 9 seconds, 755277 usec

Linux (1 CPU, 1.8 GHz)

Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 2 seconds, 298735 usec
starting 16 ckalloc threads...waiting....done: 3 seconds, 331197 usec
starting 16 _malloc threads...waiting....done: 1 seconds, 323865 usec

Mac OSX (1 CPU, 1.5 GHz)

zoran:~ zoran$ ./m2
Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 57 seconds, 300088 usec
starting 16 ckalloc threads...waiting....done: 195 seconds, 526369 usec
starting 16 _malloc threads...waiting....done: 13 seconds, 869307 usec

Mac OSX (2 CPU, 867 MHz)

panther:~ zoran$ ./m2
Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 189 seconds, 228665 usec
starting 16 ckalloc threads...waiting....done: 730 seconds, 700258 usec (!!!!!)
starting 16 _malloc threads...waiting....done: 19 seconds, 958533 usec

>
> Now, using mmap to allocate a block of memory and then re-using that,
> this is what I am doing, but I do not use munmap; still, it is
> possible.
> With random allocations from 1-128L, Tcl alloc gives the worst
> results, constantly, which means it is good on small allocations only?

Apparently, everything above 16284 bytes uses malloc directly.

>
> I am not trying to re-invent the wheel, it is just that I accidentally
> replaced sbrk with mmap and removed the mutexes around it, and it became
> much faster than what we have now, at least on Linux.

The only case where it is not faster is single-CPU Solaris. I have no idea why. I can test it on a 2-CPU Solaris machine next week.

Anyway, from all these tests, it appears that the Tcl allocator is slower than anything else, at least for the test pattern used in your test.

Cheers
Zoran