From: Grant T. <gt...@sw...> - 2002-06-04 20:58:39
> On Thu, Mar 07, 2002 at 07:20:28PM -0500, Grant Taylor wrote:
>> Anyway, it's a little frantic here now, but after things settle down
>> I'll put together a clean patch for mips.

Well, I haven't gotten to the clean-patch point yet (I still have
nasty hacks in syscall entry to make it work), but now we're at the
performance-tuning stage...  I'm trying to make vmadump run faster.
Currently it looks like the following things are true:

 - Freezing is faster than thawing by around a factor of 2.

 - Having the dumping kernel connected directly to the undumping
   kernel* pipelines the whole process, so thawing is the only wall
   time spent.  With thaw at roughly twice the cost of freeze, that
   takes three units of serial time down to two: a 33% speedup.

 - Having four CPUs migrate processes to four other CPUs all in
   parallel overlaps those transfers as well, so that's a 4X speedup.

 - Removing seemingly unneeded icache flushes for non-executable
   pages in load_map makes no difference, time-wise, on my platform.
   This is a little surprising given the 250MB of cache flushing that
   went away, but so be it.

 - The network is not the bottleneck.  My cluster can push TCP
   between nodes at nearly a gigabit, yet the dumps go at maybe
   10MB/s.

This leaves me wondering why thawing is so much more expensive than
freezing.  It's curious that writing the arriving dump to a file in
tmpfs is far faster than thawing it.  My assumption is that the extra
time is sunk in allocating pages through mmap; that is the main
difference between the various paths.  Is each read of a page's data
triggering a page fault, or are the pages allocated at mmap time?  Is
there something I can do to speed this up?  (A userspace sketch of
what I think is happening is in the P.S. below.)

I don't suppose there would be a way to map the pages directly from a
dump file and get a sort of execute-in-place semantic between the
file and the new process?  The process is statistically unlikely to
touch most of its pages right away, so some of this work seems
unnecessary.

* This was a PITA to make work right.  Does bproc do this?  We found
  that merely closing the TCP socket on the transmitting end would
  cause a premature connection reset on the receiving end and thus
  hose the thaw.  No variation on SO_LINGER, close-on-exec, or even
  sleeping before the close worked reliably.  In the end I make the
  sender wait in a loop for TCP_INFO to report a TCP state of
  TCP_CLOSE (see the second sketch in the P.S.).  No way should this
  be necessary; something is horribly amiss in my kernel, I think.

-- 
Grant Taylor                         http://www.picante.com/~gtaylor/
Starent Networks                http://www.starentnetworks.com/
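
P.S. To make the mmap question concrete, here is roughly what I
picture the thaw doing, translated into a standalone userspace toy:
an anonymous, writable mmap followed by read()s of page-sized chunks
into it.  This is my own illustration, not vmadump code, and the file
name, region size, and "pretouch" switch are made up.  My
understanding is that an anonymous mmap only reserves address space
and each physical page shows up on first touch; the optional
pre-touch loop pays those faults up front so they can be separated
from the copying.

    /*
     * mmap-thaw-sketch.c -- illustrative only, not vmadump code.
     *
     * Map an anonymous region, then read() "page data" into it from
     * a file, the way I picture load_map filling a restored VMA.
     *
     *   cc -O2 -o mmap-thaw-sketch mmap-thaw-sketch.c
     *   ./mmap-thaw-sketch <some-large-file> [pretouch]
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define REGION_BYTES (64UL * 1024 * 1024)  /* stand-in for one big VMA */

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <file> [pretouch]\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        long page = sysconf(_SC_PAGESIZE);

        /* Anonymous mapping: this only reserves address space;
         * physical pages are allocated lazily, one minor fault per
         * first touch. */
        char *map = mmap(NULL, REGION_BYTES, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        if (argc > 2 && strcmp(argv[2], "pretouch") == 0) {
            /* Pay the allocation faults up front, so the read loop
             * below measures only the copying. */
            for (size_t off = 0; off < REGION_BYTES; off += page)
                map[off] = 0;
        }

        double t0 = now();
        size_t filled = 0;
        while (filled < REGION_BYTES) {
            ssize_t n = read(fd, map + filled, page);
            if (n <= 0)
                break;                  /* EOF or error: stop filling */
            filled += (size_t)n;
        }
        printf("filled %zu bytes in %.3f s\n", filled, now() - t0);

        munmap(map, REGION_BYTES);
        close(fd);
        return 0;
    }

Comparing a run with and without "pretouch" should show how much of
the fill time is just fault-plus-allocate, which is the cost I'm
asking about.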
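
P.P.S. The footnoted wait-before-close hack is along these lines.
This is a sketch of the idea rather than the code I actually run, and
the helper name and poll interval are invented; note that on Linux
TCP_INFO is read with getsockopt() at the TCP level rather than an
ioctl, and on older libcs struct tcp_info may only be in
<linux/tcp.h>.

    /*
     * drain-close-sketch.c -- illustrative only.
     *
     * Poll the socket's TCP state via getsockopt(TCP_INFO) and only
     * close() once the kernel reports TCP_CLOSE, so the receiver
     * sees all the dump data instead of a premature reset.
     */
    #include <netinet/in.h>
    #include <netinet/tcp.h>   /* TCP_INFO, TCP_CLOSE, struct tcp_info */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Return 0 once the socket reaches TCP_CLOSE, -1 on error or if
     * we give up after max_tries polls. */
    static int wait_for_tcp_close(int sock, int max_tries)
    {
        int i;

        for (i = 0; i < max_tries; i++) {
            struct tcp_info info;
            socklen_t len = sizeof(info);

            if (getsockopt(sock, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
                return -1;
            if (info.tcpi_state == TCP_CLOSE)
                return 0;       /* connection fully torn down */

            usleep(10 * 1000);  /* 10ms between polls */
        }
        return -1;
    }

    int main(void)
    {
        /* Call pattern only: 'sock' would really be the connected
         * dump socket after all the dump data has been written. */
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock < 0) { perror("socket"); return 1; }

        /* ... connect(), write() the dump ... */

        if (wait_for_tcp_close(sock, 1000) < 0)
            fprintf(stderr, "never saw TCP_CLOSE; closing anyway\n");
        close(sock);
        return 0;
    }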