From: Warren W. <war...@gm...> - 2014-06-16 23:06:19
The percentage varies with workload. It's usually around 30%, but past 2000 threads it grows to nearly 60%. I've attached some 3D plots that show the GC run time (both in real terms and as a percentage of total time).

Nice idea about the C threads, but it seems that C threads can't call back into Lisp unless they have a Lisp stack (http://comments.gmane.org/gmane.lisp.cffi.devel/1619). I just get a segfault when I try to call a Lisp function from a C thread created with pthreads (pthread_create).

On 14-06-16 01:27 PM, Martin Cracauer wrote:
> What is the percentage of time spent in actual GC?
> (time (sb-ext:gc :full t)) would also be interesting.
>
> The stacks for the threads are scanned at each GC, and their contents
> are conservatively interpreted as pointing to objects that are then
> retained (along with more objects which I am working on fixing). The
> mechanism for this includes potentially costly calls like
> possibly_valid_dynamic_space_pointer() and search_dynamic_space().
>
> If you have 2000 threads and the stack size is 2 MB and you use 1 MB
> of that at the point where you are waiting, then you add 2 GB to the
> scanning for each generation's worth of GC, right at the source of the
> scanning, too.
>
> I also look suspiciously at
> static int is_in_stack_space(lispobj ptr) {}
> as a source for slowdowns with many threads.
>
> You could use C threads that then call a Lisp function, if you can
> live without closures and the like when invoking the threads.
>
> Martin
>
> Warren Wilkinson wrote on Thu, Jun 12, 2014 at 11:51:19AM -0700:
>> One of my work projects is using a large number of threads (> 1000) and
>> we're finding the performance degrades sharply with the number of
>> threads. Nearly all of the threads are waiting on waitqueues, but we're
>> finding 1500 threads is slow, and by 3000 the system is at 1/4 of its
>> former speed. Even if these threads spend their whole lifetime on the
>> waitqueue, their very existence is slowing down SBCL dramatically.
>>
>> My working guess is that SBCL's garbage collector wakes the threads as
>> part of its job, leading to a lot of thread thrashing.
>>
>> The problem seems to be triggered when
>> (AND (> threads 1000) (> memory-use (sb-ext:bytes-consed-between-gcs)))
>>
>> We're looking at possible solutions; we have several in mind and would
>> like any advice anybody can offer. Is this a known issue? Does anybody
>> else run with this many threads?
>>
>> *Possible Solutions*
>> 1. Make our threads shorter lived (e.g. don't reuse threads; kill them
>>    and spawn new ones when needed).
>> 2. Increase sb-ext:bytes-consed-between-gcs.
>> 3. Try to reduce memory usage.
>> 4. Implement a lightweight threading model on top of a small number of
>>    real threads.
>> 5. Try to modify SBCL's GC to not wake threads waiting on POSIX queues.
>>
>> P.S. I have test code that can demonstrate the problem. If anybody
>> wants to see it, I'll post it.
>>
>> Cheers,
>> Warren Wilkinson
>>
>> _______________________________________________
>> Sbcl-bugs mailing list
>> Sbc...@li...
>> https://lists.sourceforge.net/lists/listinfo/sbcl-bugs
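
For a rough feel of why threads parked on a waitqueue still cost GC time, the conservative stack scan Martin describes can be sketched in C. This is only an illustration under stated assumptions, not SBCL's code: scan_stack, in_heap, and stack_bytes_scanned are hypothetical names, and the simple range test stands in for the much costlier per-word checks (possibly_valid_dynamic_space_pointer() and friends) that the real collector performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the collector's pointer validity check:
   here a word "looks like a pointer" if it falls inside the heap range.
   The real checks named in the thread do far more work per word. */
static int in_heap(uintptr_t word, uintptr_t heap_start, uintptr_t heap_end)
{
    return word >= heap_start && word < heap_end;
}

/* Conservatively scan the used part of one thread's stack. Every word is
   examined, whether or not the thread has run since the last collection.
   Returns how many words would be treated as possible heap pointers. */
size_t scan_stack(const uintptr_t *stack, size_t n_words,
                  uintptr_t heap_start, uintptr_t heap_end)
{
    assert(heap_start <= heap_end);
    size_t candidates = 0;
    for (size_t i = 0; i < n_words; i++) {
        if (in_heap(stack[i], heap_start, heap_end))
            candidates++;
    }
    return candidates;
}

/* Total stack bytes examined per collection: the cost grows linearly
   with the thread count, independent of what the threads are doing. */
unsigned long long stack_bytes_scanned(unsigned long long n_threads,
                                       unsigned long long used_bytes_per_thread)
{
    return n_threads * used_bytes_per_thread;
}
```

With 2000 threads each holding about 1 MB of live stack, stack_bytes_scanned(2000, 1024 * 1024) gives 2,097,152,000 bytes, which is the roughly 2 GB of extra scanning mentioned above.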