From: Yue Li <xy...@gm...> - 2010-01-15 02:18:22
|
Hi,

I'm new to Lisp programming and SBCL, and I'm having trouble using the SBCL threads library. I'm running a simple computation: summing a list of one million integers as a reduction. I would like to parallelize the sum with threads, so I usually create 2-4 threads, each of which computes a portion of the sum. The results are always correct, but the threaded version is only about as fast as my sequential version. When I looked at the CPU monitor, it showed that all the threads occupy the first core during the computation and very seldom occupy the second core. I read the SBCL manual and searched the web, but could not find an answer. My machine is a dual-core x86-64 box running Fedora 12, with 4 GB of memory.

Later I thought it might be due to the GC, so I evaluated

    (sb-ext:bytes-consed-between-gcs)

and found that the default value is 12582912. I tried raising it to about 512 MB (and also tried other values) with

    (setf (sb-ext:bytes-consed-between-gcs) (- (expt 2 29) 10))

Then I ran my multithreaded program again. All the threads still occupy one of the two cores while running, and the speedup is negligible.

Did I miss something in configuring the GC, or am I heading in the wrong direction entirely?

Thanks for your help,
Cheers,
Yue |
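The test case itself was never posted in this thread. As a rough illustration only, a chunked parallel sum of the kind described above might look like the sketch below. The names SUM-RANGE, PARALLEL-SUM and MAKE-TEST-VECTOR are hypothetical, and the use of a vector instead of a list is an assumption made so that each thread can index its own range without walking shared list structure. It needs an SBCL built with thread support.

```lisp
;; Sketch only: not the original poster's code. All function names here
;; are made up for illustration.

(defun sum-range (vec start end)
  "Sum the elements of VEC from START (inclusive) to END (exclusive)."
  (declare (type simple-vector vec) (type fixnum start end))
  (let ((acc 0))
    (loop for i from start below end
          do (incf acc (svref vec i)))
    acc))

(defun parallel-sum (vec nthreads)
  "Split VEC into NTHREADS contiguous chunks and sum each in its own thread."
  (let* ((n (length vec))
         (chunk (ceiling n nthreads))
         (threads
           (loop for k from 0 below nthreads
                 collect (let ((start (min n (* k chunk)))
                               (end (min n (* (1+ k) chunk))))
                           ;; START and END are bound freshly on every
                           ;; iteration, so each closure sees its own range.
                           (sb-thread:make-thread
                            (lambda () (sum-range vec start end)))))))
    ;; JOIN-THREAD returns the value returned by the thread's function.
    (reduce #'+ (mapcar #'sb-thread:join-thread threads))))

(defun make-test-vector (n)
  "Return a simple-vector holding the integers 0 .. N-1."
  (let ((vec (make-array n)))
    (dotimes (i n vec)
      (setf (svref vec i) i))))
```

Comparing (time (parallel-sum vec 2)) against (time (sum-range vec 0 (length vec))) on the same vector is one way to see whether the threads actually overlap. With only a million additions the work per thread is small, so thread startup cost may hide any speedup.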
From: Daniel H. <dhe...@te...> - 2010-01-15 07:23:17
|
On Thu, 14 Jan 2010, Yue Li wrote:
> Then run my multithreading program again, and still, all the threads
> occupy one of the two cores when running, and the speed up is very
> trivial.
>
> Here did I miss something on configuring the GC? or I am totally out
> of the correct direction?

I think you hit a common problem. The official SBCL binaries are built single-threaded. To get a multithreaded build, you must build it yourself.

This isn't very hard. Download and extract the SBCL sources somewhere. The INSTALL file tells you what to do and how to enable threads (see sections 2.1 and 2.2).

I don't think this is a GC issue.

Later,
Daniel |
From: Yue Li <xy...@gm...> - 2010-01-15 09:21:55
|
Hi, Daniel

On Fri, Jan 15, 2010 at 1:23 AM, Daniel Herring <dhe...@te...> wrote:
> On Thu, 14 Jan 2010, Yue Li wrote:
>>
>> Then run my multithreading program again, and still, all the threads
>> occupy one of the two cores when running, and the speed up is very
>> trivial.
>>
>> Here did I miss something on configuring the GC? or I am totally out
>> of the correct direction?
>
> I think you hit a common problem. The official SBCL binaries are built
> single-threaded. To get a multithreaded build, you must build it yourself.
>
> This isn't very hard. Download and extract the SBCL sources somewhere. The
> INSTALL file tells you what to do and how to enable threads (see sections
> 2.1 and 2.2).
>

Thanks for your help! However, I built the SBCL on my machine myself, and I already enabled sb-thread by following the INSTALL file. I think that if I had done that incorrectly, I could not call make-thread to spawn threads at all.

Cheers,
Yue

> I don't think this is a GC issue.
>
> Later,
> Daniel
> |
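One quick way to confirm the point made above is to look for the :SB-THREAD keyword in *FEATURES*, which SBCL includes when thread support is compiled in. The helper name THREADS-SUPPORTED-P is hypothetical; this is a minimal sketch, not part of SBCL.

```lisp
;; Sketch only: THREADS-SUPPORTED-P is an illustrative helper name.

(defun threads-supported-p ()
  "Return T when this image was built with the :SB-THREAD feature, else NIL."
  (and (member :sb-thread *features*) t))

;; Print the implementation, its version, and whether threads are available.
(format t "~&~A ~A, threads: ~:[no~;yes~]~%"
        (lisp-implementation-type)
        (lisp-implementation-version)
        (threads-supported-p))
```

If this reports no thread support, calling sb-thread:make-thread signals an error instead of silently running sequentially, which is consistent with the reasoning in the message above.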
From: Larry V. <re...@us...> - 2010-01-15 11:50:43
|
> Thanks for your help! However, the sbcl installed on my machine is
> built by myself, and I already enabled sb-thread following the INSTALL
> file, I think if I did that incorrectly, I could not invoke
> make-thread to spawn threads.
>

My guess would be that this is an operating system issue. This is based on the fact that SBCL uses pthreads on x86, and pthreads doesn't have a way to hint to the operating system how to schedule threads.

One possibility is that the operating system is reluctant to migrate threads to other cores until there has been substantial load on them over time. So if you run your program for several minutes, maybe the OS will migrate one of them.

I may be wrong here too; there could be little or no cost to having the threads on different cores, and little or no cost to migrating them.

best regards,
/larry |
From: Giovanni G. <gi...@ci...> - 2010-01-15 12:51:09
|
Larry Valkama wrote:
> My guess would be that this is an operating system issue.

Since the OP is on Linux, maybe the "taskset" command could help? |
From: <m_m...@ya...> - 2010-01-15 14:05:32
|
Hi,

Yue Li <xy...@gm...> writes:
>> On Thu, 14 Jan 2010, Yue Li wrote:
>>> Here did I miss something on configuring the GC? or I am totally out
>>> of the correct direction?
>>
>> I don't think this is a GC issue.
>
> Thanks for your help! However, the sbcl installed on my machine is
> built by myself, and I already enabled sb-thread following the INSTALL
> file, I think if I did that incorrectly, I could not invoke
> make-thread to spawn threads.

That is true. Your problem is indeed a GC issue. The garbage collector is single-threaded and has to stop the world for collection, so everything gets interrupted during collection. The less you cons, of course, the less that happens, but if you have something consing like mad, then you will mostly see the GC thread. The same thing bit me a few days ago.

The short-term solution is patience, or perhaps having multiple SBCLs communicating via sockets, as no other (free) multithreaded CL I am aware of is competitive with SBCL in terms of speed anyway (at least on numerical computations). Except perhaps CMU CL, and I doubt that things are better there in this regard.

The long-term solution would be a good parallel GC, but I suspect that these are not very easy to write.

Regards,
Mario. |
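Whether consing and GC pauses are really the bottleneck can be checked roughly before redesigning anything. The sketch below wraps a function call and reports how many bytes were allocated and how much CPU time went to garbage collection while it ran, using sb-ext:get-bytes-consed and sb-ext:*gc-run-time*. The wrapper name CALL-WITH-GC-STATS is hypothetical, and both counters are process-wide, so the numbers also include whatever other threads did in the meantime.

```lisp
;; Sketch only: CALL-WITH-GC-STATS is an illustrative helper name.

(defun call-with-gc-stats (thunk)
  "Call THUNK. Return its value, the bytes consed, and GC seconds elapsed."
  (let ((bytes-before (sb-ext:get-bytes-consed))
        (gc-before sb-ext:*gc-run-time*))
    (let ((value (funcall thunk)))
      (values value
              (- (sb-ext:get-bytes-consed) bytes-before)
              ;; *GC-RUN-TIME* is counted in internal time units.
              (float (/ (- sb-ext:*gc-run-time* gc-before)
                        internal-time-units-per-second))))))

;; Example: a deliberately allocation-heavy call.
(multiple-value-bind (value bytes gc-seconds)
    (call-with-gc-stats (lambda () (length (make-list 100000))))
  (format t "~&value ~A, ~D bytes consed, ~,3F s in GC~%"
          value bytes gc-seconds))
```

If the GC time is a large share of the wall-clock time of the threaded run, reducing allocation in the worker functions is likely to help more than tuning bytes-consed-between-gcs.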
From: Yue Li <xy...@gm...> - 2010-01-15 18:13:11
|
---------- Forwarded message ----------
From: Yue Li <xy...@gm...>
Date: Fri, Jan 15, 2010 at 11:49 AM
Subject: Re: [Sbcl-help] SBCL threads only occupy one of the cores
To: Giovanni Gigante <gi...@ci...>

On Fri, Jan 15, 2010 at 6:03 AM, Giovanni Gigante <gi...@ci...> wrote:
> Larry Valkama wrote:
>> My guess would be that this is an operating system issue.
>
> since the OP is on Linux, maybe the "taskset" command could help?
>

Maybe I am wrong, but I think taskset can be used to move a process to another core, not to schedule an individual thread onto another core?

Yue |
From: Yue Li <xy...@gm...> - 2010-01-15 18:13:37
|
On Fri, Jan 15, 2010 at 5:51 AM, Larry Valkama <re...@us...> wrote:
>
>> Thanks for your help! However, the sbcl installed on my machine is
>> built by myself, and I already enabled sb-thread following the INSTALL
>> file, I think if I did that incorrectly, I could not invoke
>> make-thread to spawn threads.
>>
> My guess would be that this is an operating system issue. This based on
> the facts that sbcl uses pthread on x86 and pthread doesn't have a way
> to hint the operating system how to schedule threads.
>
> A possible thing could be that the operating system is reluctant to
> migrate threads to other cores until there is subtantial load on them
> over time. So if you run your program for several minutes maybe the os
> will migrate one of them.
>

I'm still not sure that the OS scheduler is the cause, but the behavior you describe matches what I see. If I make each thread run a long, CPU-heavy load, say an infinite loop, then the threads do also occupy the other core. The strange thing is that, after running for a while, the threads move back from the second core to the first one.

Thanks,
Yue

> I may be wrong here to, there could be no/little cost at all to have the
> threads on different cores and also no/little cost to migrate them.
>
> best regards,
> /larry
> |
From: Paul K. <pv...@pv...> - 2010-01-16 19:18:06
|
In article <3df...@ma...>, Yue Li <xy...@gm...> wrote:
> I'm running some simple computation, such as computing the sum of a
> list of one million integers in a reduction fashion. I would like to
> parallelize the sum by using threads. Therefore, I usually create
> about 2-4 threads, each computes a portion of the sum. The results are
> always correct, however, I found the performance is always as fast as
> my sequential version. Then I looked at the CPU monitor, and it shows
> that when doing computation using threads, all the threads only occupy
> the first core, and very seldomly occupy the second core. I read the
> SBCL manual, and googled on websites but seems I could not find the
> answer.

Can you post or link to (a reduced version of) your test case?

Paul Khuong |
From: Nikodemus S. <nik...@ra...> - 2010-02-26 13:00:07
|
On 16 January 2010 21:17, Paul Khuong <pv...@pv...> wrote:
>> I'm running some simple computation, such as computing the sum of a
>> list of one million integers in a reduction fashion. I would like to
>> parallelize the sum by using threads. Therefore, I usually create
>> about 2-4 threads, each computes a portion of the sum. The results are
>> always correct, however, I found the performance is always as fast as
>> my sequential version. Then I looked at the CPU monitor, and it shows
>> that when doing computation using threads, all the threads only occupy
>> the first core, and very seldomly occupy the second core. I read the
>> SBCL manual, and googled on websites but seems I could not find the
>> answer.
>
> Can you post or link to (a reduced version of) your test case?

It may also be that your CPU affinity mask is set to restrict threads from the SBCL process to a single core. SB-CPU-AFFINITY may be of use to you in that case:

http://github.com/nikodemus/sb-cpu-affinity

(Though if you say that computation _occasionally_ occurs on the second core as well, I doubt the affinity mask is the issue. The most likely issue is that your code is not as parallel as you expect it to be, for one reason or another.)

Cheers,

-- Nikodemus |
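On Linux, the affinity mask mentioned above can be inspected without any extra library, because the kernel reports it in the Cpus_allowed_list field of /proc/self/status. The sketch below reads that field from inside the Lisp process. The function name ALLOWED-CPUS is hypothetical, the approach is Linux-specific, and /proc/self/status describes the main thread of the process, so it is only a first check and does not replace SB-CPU-AFFINITY.

```lisp
;; Sketch only: ALLOWED-CPUS is an illustrative helper name, and the
;; /proc path assumes a Linux system.

(defun allowed-cpus (&optional (path "/proc/self/status"))
  "Return the Cpus_allowed_list value found in PATH as a string, or NIL."
  (let ((prefix "Cpus_allowed_list:"))
    ;; IN is NIL when the file does not exist (for example on non-Linux).
    (with-open-file (in path :if-does-not-exist nil)
      (when in
        (loop for line = (read-line in nil)
              while line
              when (and (>= (length line) (length prefix))
                        (string= prefix line :end2 (length prefix)))
                return (string-trim '(#\Space #\Tab)
                                    (subseq line (length prefix))))))))

(format t "~&CPUs this process may run on: ~A~%"
        (or (allowed-cpus) "unknown"))
```

On a dual-core machine with no restriction in place, the reported list would be expected to cover both cores. A list naming a single core would point to the affinity mask as the cause.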
From: Gabriel D. R. <gd...@in...> - 2010-02-26 17:10:32
|
On Fri, Feb 26, 2010 at 6:03 AM, Nikodemus Siivola <nik...@ra...> wrote:
> On 16 January 2010 21:17, Paul Khuong <pv...@pv...> wrote:
>
>>> I'm running some simple computation, such as computing the sum of a
>>> list of one million integers in a reduction fashion. I would like to
>>> parallelize the sum by using threads. Therefore, I usually create
>>> about 2-4 threads, each computes a portion of the sum. The results are
>>> always correct, however, I found the performance is always as fast as
>>> my sequential version. Then I looked at the CPU monitor, and it shows
>>> that when doing computation using threads, all the threads only occupy
>>> the first core, and very seldomly occupy the second core. I read the
>>> SBCL manual, and googled on websites but seems I could not find the
>>> answer.
>>
>> Can you post or link to (a reduced version of) your test case?
>
> It may also be that your CPU affinity mask is set to restrict threads
> from the SBCL process to a single core. SB-CPU-AFFINITY may be of use
> for you in that case:
>
> http://github.com/nikodemus/sb-cpu-affinity
>
> (Though if you say that computation _occasionally_ occurs on the
> second core as well, I doubt affinity mask is the issue. Most likely
> issue is that your code is not as parallel as you expect it to be for
> one reason or another.)

Hmm, in that case, it would be surprising that the same computations with ECL-based or CLISP-based builds systematically display far better load-balancing behavior than with SBCL (or Clozure).

-- Gaby |
From: Nikodemus S. <nik...@ra...> - 2010-02-26 18:34:22
|
On 26 February 2010 19:10, Gabriel Dos Reis <gd...@in...> wrote:
>> (Though if you say that computation _occasionally_ occurs on the
>> second core as well, I doubt affinity mask is the issue. Most likely
>> issue is that your code is not as parallel as you expect it to be for
>> one reason or another.)
>
> Hmm, in that case, it would be surprising that the same computations
> with ECL-based or CLISP-based builds systematically display a
> far better load-balancing behavior than with SBCL (or Clozure.)

Not necessarily: that "one reason or another" may be something in the implementation itself. Can you post a test case, please?

Cheers,

-- Nikodemus |