From: Martin C. <cra...@co...> - 2014-06-16 20:27:52
What percentage of the time is spent in actual GC? The output of
(time (sb-ext:gc :full t)) would also be interesting.

The stacks of all threads are scanned at each GC, and their contents
are conservatively interpreted as pointing to objects, which are then
retained (along with further objects, which I am working on fixing).
The mechanism for this includes potentially costly calls like
possibly_valid_dynamic_space_pointer() and search_dynamic_space().

If you have 2000 threads, the stack size is 2 MB, and each thread has
1 MB of that in use at the point where it is waiting, then you add
2 GB of stack to scan on every GC, whichever generation is being
collected, and right at the roots of the scan, too.

I am also suspicious of
static int is_in_stack_space(lispobj ptr) {}
as a source of slowdowns with many threads.

You could use C threads that then call a Lisp function, if you can
live without closures and the like when invoking the threads.

Martin

Warren Wilkinson wrote on Thu, Jun 12, 2014 at 11:51:19AM -0700:
> One of my work projects is using a large number of threads (> 1000) and
> we're finding the performance degrades sharply with the number of
> threads. Nearly all of the threads are waiting on waitqueues, but we're
> finding 1500 threads is slow, and by 3000 the system is 1/4 its former
> speed -- even if these threads spend their whole lifetime on the
> waitqueue, their very existence is slowing down SBCL dramatically.
>
> My working guess is that SBCL's garbage collector wakes the threads as
> part of its job, leading to a lot of thread thrashing around.
>
> The problem seems to be triggered when (AND (> threads 1000) (>
> memory-use (sb-ext:bytes-consed-between-gcs)))
>
> We're looking at possible solutions, we have several in mind and would
> like any advice anybody can offer. Is this a known issue, does anybody
> else run with this high a number of threads?
>
> *Possible Solutions*
> 1. Make our threads shorter lived (e.g. don't reuse threads, kill them
> and spawn new ones when needed)
> 2. Increase sb-ext:bytes-consed-between-gcs
> 3. Try to reduce memory usage
> 4. Implement a lightweight threading model on top of a low number of
> real threads.
> 5. Try to modify SBCL's GC to not wake threads waiting on posix queues.
>
> p.s. I have test code that can demonstrate the problem. If anybody
> wants to see it, I'll post it.
>
> Cheers,
> Warren Wilkinson
>
> _______________________________________________
> Sbcl-bugs mailing list
> Sbc...@li...
> https://lists.sourceforge.net/lists/listinfo/sbcl-bugs

-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cra...@co...> http://www.cons.org/cracauer/
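
To make the stack-scanning cost in the reply concrete, here is a minimal C sketch of conservative stack scanning. It is a toy under stated assumptions, not SBCL's runtime code: the toy_heap type and every toy_* function are hypothetical names invented for illustration, and the range check is far simpler than what possibly_valid_dynamic_space_pointer() has to do. It only shows why the work grows with the number of threads times the stack each one has in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the collector's heap bounds.
   This is an assumption for illustration, not an SBCL data structure. */
typedef struct {
    uintptr_t start; /* first address of the toy heap */
    uintptr_t end;   /* one past the last address of the toy heap */
} toy_heap;

/* Conservative test: any stack word whose value falls inside the heap
   range is treated as if it were a pointer to a live object. */
static int toy_looks_like_heap_pointer(const toy_heap *heap, uintptr_t word)
{
    return word >= heap->start && word < heap->end;
}

/* Scan the in-use part of one thread's stack, word by word, and count
   the words that would cause an object to be retained. The loop runs
   once per stack word, so the cost is linear in the stack in use. */
size_t toy_scan_stack(const toy_heap *heap, const uintptr_t *stack,
                      size_t n_words)
{
    size_t hits = 0;
    assert(heap->start <= heap->end);
    for (size_t i = 0; i < n_words; i++) {
        if (toy_looks_like_heap_pointer(heap, stack[i]))
            hits++;
    }
    return hits;
}

/* Total stack bytes that must be examined on each GC:
   number of threads times the stack bytes each thread has in use. */
uint64_t toy_stack_bytes_per_gc(uint64_t n_threads,
                                uint64_t used_bytes_per_thread)
{
    return n_threads * used_bytes_per_thread;
}
```

With the figures from the reply, toy_stack_bytes_per_gc(2000, 1024 * 1024) gives 2097152000 bytes, which is the roughly 2 GB of extra scanning mentioned above. A waiting thread costs as much as a running one here, because only the depth of its stack matters.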