|
From: Milind <km...@gm...> - 2010-01-06 07:22:53
|
Hi,

I am exploring DRAMsim, the memory simulator developed at the University of Maryland. It looks like this tool can take the memory trace generated by Valgrind (--trace-mem=yes) as input.

I am writing an application that will run on a system with multiple CPU cores, each core having multiple hardware threads. With multiple CPU cores there will be simultaneous memory access requests.

I would like to know:
- whether Valgrind supports multicore systems?
- if so, how accurately does it trace the memory accesses?
- whether anyone has already used such a multicore, multiple hw thread environment to trace memory accesses?

If feasible, it would be very useful to generate memory access patterns/graphs using DRAMsim and corresponding tools.

Thanks in advance,
- Milind
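For anyone post-processing such a trace, here is a minimal sketch. It assumes the trace comes from Valgrind's Lackey tool (for example `valgrind --tool=lackey --trace-mem=yes ./app 2> trace.txt`, where `./app` and `trace.txt` are placeholders) and follows Lackey's documented record format: `I` for instruction fetch, `L` for load, `S` for store, `M` for modify, each followed by a hex address and a size. The names `Access` and `parse_lackey` are hypothetical and the sample addresses are only illustrative. Converting the records into whatever input format DRAMsim expects is not shown.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class Access(NamedTuple):
    kind: str  # "I" instruction fetch, "L" load, "S" store, "M" modify
    addr: int  # address of the access
    size: int  # access size in bytes


# Lackey records look like "I  0023C790,2" or " S BE80199C,4".
_RECORD = re.compile(r"^\s*([ILSM])\s+([0-9a-fA-F]+),(\d+)\s*$")


def parse_lackey(lines: Iterable[str]) -> Iterator[Access]:
    """Yield one Access per trace record, skipping anything else."""
    for line in lines:
        match = _RECORD.match(line)
        if match is None:
            # Valgrind banners ("==pid== ...") and most program output
            # do not match the record pattern and are ignored.
            continue
        kind, addr, size = match.groups()
        yield Access(kind, int(addr, 16), int(size))


# Illustrative input; a real run would read the captured trace file.
sample = [
    "==1234== Lackey, an example Valgrind tool",
    "I  0023C790,2",
    " S BE80199C,4",
    " L BE801950,4",
    " M 0025747C,1",
]
accesses = list(parse_lackey(sample))
counts = Counter(a.kind for a in accesses)
```

Program output that happens to look like a trace record would be misparsed by this pattern, so keeping the traced program's own output in a separate stream is advisable.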
|
From: tom f. <tf...@al...> - 2010-01-07 00:45:44
|
Milind <km...@gm...> writes:
[snip]
> I am writing an application that will run on a system that
> has multiple CPU cores and each core having multiple hardware
> threads. With multiple CPU cores there will be simultaneous memory
> access requests.
>
> I would like to know
> - whether Valgrind supports multicore systems ?
> - if so, how accurately does it trace the memory accesses ?
> - if there is anyone who has already used such multicore, multiple hw thread
> environment to trace memory accesses ?

http://www.valgrind.org/docs/manual/manual-core.html#manual-core.pthreads

Cheers,
-tom
|
From: Milind <km...@gm...> - 2010-01-07 02:35:38
|
Thanks for the quick reply, Tom.

I went over the paragraph you pointed out again. Most of the discussion there, and in the Helgrind section, seems to revolve around a single CPU. I do not see any references to a multicore system with threads running concurrently. Multiple threads, synchronization between them, deadlocks, etc. all seem to be discussed in the context of a single CPU. I hope I am not misreading.

I am looking to collect memory traces for a multicore, multithreaded (hw threads) program.

Thanks,
- Milind
|
From: tom f. <tf...@al...> - 2010-01-07 04:12:53
|
Milind <km...@gm...> writes:
> Thanks for the quick reply Tom.
>
> I (again) went over the para that you have pointed out. Most of the
> discussion in that para and the Helgrind discussion seem to revolve
> around single CPU. I am not seeing any references to multicore
> system with concurrent running of threads . Multiple threads,
> synchronization between them, deadlocks etc all seem to be talking in
> the context of a single CPU. Hope I am not misreading.

I think the discussion in that paragraph is pretty clear: valgrind will never cause a threaded program to behave incorrectly according to the POSIX spec, but you get 0 concurrency && the time slicing will be different.

> I am looking for collecting memory traces for a multicore,
> multithreaded (hw threads) program.

I think you need to look somewhere else.

Side note to the valgrind devs: does the documentation need to be updated here? The last paragraph in "Support for Threads" (linked below [2]) says that the "use of atomic instruction sequences in shared memory between processes will not work reliably", yet the 3.5.0 release notes say that the "lock" prefix on certain instructions is now respected [1]. Due to the ambiguity between threads and processes (i.e., clone), I'm not sure if this is referring to shared address spaces in threads (in which case it probably needs updating) or atomic instructions on data in sysv/posix shmem -- i.e. shm_open'd memory. I've never heard of anyone using the latter, but then again I don't interact with many people that use even posix shared memory anyway.

-tom

[1] http://valgrind.org/docs/manual/dist.news.html
[2] http://www.valgrind.org/docs/manual/manual-core.html#manual-core.pthreads
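The "0 concurrency" point can be pictured with a toy model. This is only a sketch under stated assumptions: `serialized_trace` is a hypothetical helper, the per-thread address lists are made up, and `slice_len` is an arbitrary stand-in for a scheduling quantum, not Valgrind's actual value or scheduling policy. It shows that when threads are run one at a time, the combined trace is a sequence of single-thread runs, so no two accesses from different threads ever appear as simultaneous requests.

```python
from itertools import islice
from typing import Dict, Iterator, List, Tuple


def serialized_trace(
    streams: Dict[int, List[int]], slice_len: int
) -> Iterator[Tuple[int, int]]:
    """Interleave per-thread address streams one time slice at a time.

    Only one thread runs during a slice, so the result is a single
    ordered sequence of (thread_id, address) pairs.
    """
    iters = {tid: iter(addrs) for tid, addrs in streams.items()}
    while iters:
        for tid in list(iters):
            chunk = list(islice(iters[tid], slice_len))
            if len(chunk) < slice_len:
                # This thread has run out of accesses; retire it.
                del iters[tid]
            for addr in chunk:
                yield tid, addr


# Two hypothetical threads, each touching its own region of memory.
streams = {
    0: [0x1000, 0x1008, 0x1010, 0x1018],
    1: [0x2000, 0x2008, 0x2010, 0x2018],
}
trace = list(serialized_trace(streams, slice_len=2))
thread_order = [tid for tid, _ in trace]
```

On real multicore hardware the two streams could reach the memory controller at the same time; in the serialized model they arrive strictly one after another, which is why such a trace says little about contention between cores.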
|
From: Milind <km...@gm...> - 2010-01-07 03:01:45
|
Tom,

Appreciate fast responses. I will look for other tools, if available.

Thanks,
- Milind
|
From: Dave G. <go...@mc...> - 2010-01-07 14:23:48
|
On Jan 6, 2010, at 8:54 PM, tom fogal wrote:
> Side note to the valgrind devs: does the documentation need to be
> updated here? The last paragraph in "Support for Threads" (linked
> below) says that the "use of atomic instruction sequences in shared
> memory between processes will not work reliably", yet the 3.5.0
> release notes say that the "lock" prefix on certain instructions is
> now respected [1]. Due to the ambiguity between threads and processes
> (i.e., clone), I'm not sure if this is referring to shared address
> spaces in threads (in which case it probably needs updating) or atomic
> instructions on data in sysv/posix shmem -- i.e. shm_open'd memory.
> I've never heard of anyone using the latter, but then again I don't
> interact with many people that use even posix shared memory anyway.
>
> -tom
>
> [1] http://valgrind.org/docs/manual/dist.news.html

FWIW, MPICH2 uses atomic instructions in posix mmap'ed shared memory to manipulate some lock-free queues for communication. OpenMPI probably does too, although I'm not particularly familiar with OMPI internals.

The release notes also say that "communication with other processes [...] through shared memory and coordinated with such atomic instructions" should work:

-------8<------
* Genuinely atomic support for x86/amd64/ppc atomic instructions

  Valgrind will now preserve (memory-access) atomicity of LOCK-prefixed
  x86/amd64 instructions, and any others implying a global bus lock.
  Ditto for PowerPC l{w,d}arx/st{w,d}cx. instructions.

  This means that Valgrinded processes will "play nicely" in situations
  where communication with other processes, or the kernel, is done
  through shared memory and coordinated with such atomic instructions.
  Prior to this change, such arrangements usually resulted in hangs,
  races or other synchronisation failures, because Valgrind did not
  honour atomicity of such instructions.
-------8<------

So I suspect that last paragraph in "Support for Threads" isn't right about the atomic instruction support.

-Dave