From: Jeremy F. <je...@go...> - 2004-12-17 23:48:16
|
On Fri, 2004-12-17 at 20:15 +0000, Greg Parker wrote:

> Looks like this area will need careful consideration. My first guess
> is that the definitions of thread_suspend and thread_abort (which
> don't promise much) will help make correct programs simulate correctly,
> though incorrect programs may not always be caught.

The tricky bits are always where the VCPU state isn't in its home
location. Fortunately, serializing all the threads means we don't have
to worry about getting async exceptions/suspends while running code
(except for instructions raising exceptions), but around syscalls is
tricky for that reason.

> > Is there any notion of blocking vs non-blocking RPCs? Can we
> > distinguish them?
>
> The RPC is always synchronous, but I don't know of a way to identify
> which ones are going to be short. I currently assume that they are
> all potential blockers. This might be the right thing to do in general,
> given the threat of a user-level exception handler that needs to run
> before the syscall completes.

In Unix, the distinction we make between fast and slow syscalls is
basically a performance optimisation. If you're doing something quick,
like gettimeofday, getpid, or opens of a regular file, then there's no
need to go through the process of rescheduling the thread, but you
clearly have to do that for anything blocking to prevent a deadlock.
In the Mach case, does the same distinction exist, or could any message
take an indefinite amount of time to complete?

> > Is there anything which doesn't map to the mmap/munmap/mprotect model?
> > How common is sharing memory with other processes?
>
> Those plus mremap and msync probably handle most operations within
> a process, but they are incomplete for cross-process operations.
> For example, I can request that a particular address range in some
> process be mapped at some other address in my process (directly or
> copy-on-write).
>
> Memory sharing is pervasive at the application level. The window
> server in particular uses it heavily.

Hm, that's going to be a bit tricky. We'll just have to assume that
all shared memory is always defined.

> > > What restrictions do we have on the use of the address space?
>
> The address space is 4 GB. None of it is reserved by the kernel.
> Some of it is reserved for various purposes, but most of those
> restrictions can be dodged by giving appropriate flags to the
> dynamic linker, I think. The most inflexible constraints are:
>
> 1. Executables are generally compiled non-relocatable starting at 0x1000.
>    This can't be changed.

What form would Valgrind take?

> 2. The highest part of memory (0xfffe0000+) contains some user code
>    from the C library and the Objective-C runtime. If necessary,
>    these could be avoided in Valgrind's codegen. Also present is
>    the shared pasteboard; I don't know if it can be moved.

Does this range also include library static data, or is it purely
read-only? Can Valgrind also make use of this library code without
stomping on the client state?

> 3. The address range 0x80000000..0xc0000000 is strongly preferred
>    for mapping shared libraries and the main stack. This could
>    probably be avoided with a multi-stage launcher like that used
>    on Linux. I don't currently use a multi-stage launcher, but I'm
>    not yet trying to watch any memory operations.
>
> The presence of the executable in low memory and libc code in high
> memory means that a single user area / Valgrind area split in
> the address space won't work.

Actually, it sounds to me like you could maybe use a dual address space
model. Have Valgrind live in one process, and make it control the
target process by injecting mappings into it and setting the CPU state.
I guess you'd still need to put the shadow memory into the target
address space (so that the instrumentation code can get to it), which
means that the available address space is still constrained.

It sounds like the address space is going to get pretty crowded. Is
64-bit MacOS an interesting target yet?

> If the thread has never been run before (i.e. has just been created with
> thread_create), then we do need the thunk. This would work almost like
> the thread_create_running case.

If the thread has never run, then surely we're just talking about the
initial state of the VCPU emulation for that thread? Setting the thread
state would never directly affect the real CPU state.

> Precisely. My expectation is that Valgrind can do the suspend/abort/
> set_state/resume sequence itself instead of leaving it to the default
> handler. Then we can more easily avoid delivering signals to threads
> in the middle of a basic block or a syscall transition.

Well, I guess the existing thread emulation does everything, like
dealing with queued vs unqueued signals, signal masks, selecting which
thread to deliver to, etc, etc. It would be nice to not have to
duplicate that if possible (the existing design was intended to let the
kernel do as much of that as possible under Linux, though the new
threading model under discussion would probably do it better).

J
|
From: Jeremy F. <je...@go...> - 2004-12-17 23:48:15
|
On Sat, 2004-12-18 at 09:30 +1100, Paul Mackerras wrote:
> Interesting; Nick and I were discussing using almost an identical
> scheme under Linux just this week.
It's the zeitgeist.
J
|
|
From: Jeremy F. <je...@go...> - 2004-12-17 23:48:15
|
On Sat, 2004-12-18 at 09:46 +1100, Paul Mackerras wrote:
> I don't think that would be a problem, at least not under Linux,
> because none of the threads will end up looking CPU-bound to the Linux
> scheduler, since they will all be sleeping every so often.
Yeah, I can't say I'm deeply worried about the problem. My main
thinking was not to change the execution model, since it doesn't need
changing, and it's pretty easy.
> I like the idea of having a single lock and letting the kernel
> scheduler schedule the individual threads. It's simple and it avoids
> us having to think about the possible races that could occur in
> choosing the next thread to run. I'm thinking particularly that there
> could be a race when one thread is choosing another to run, and
> another thread is finishing a blocking syscall and marking itself as
> ready to run.
I think that's pretty easy to deal with: we just make sure that we
update the thread's state in the right order, and use a counting
semaphore (so that wake-before-sleeping doesn't deadlock). I'm actually
thinking we should have a run queue, so that a reschedule is an O(1)
operation rather than having to scan the threads table for a runnable
thread.
J
|
|
From: Paul M. <pa...@sa...> - 2004-12-17 22:46:48
|
Jeremy Fitzhardinge writes:

> I'm concerned that using the kernel scheduler, a CPU-bound thread might
> get starved out because its amount of work done/scheduling quantum is
> much smaller under Valgrind.

I don't think that would be a problem, at least not under Linux,
because none of the threads will end up looking CPU-bound to the Linux
scheduler, since they will all be sleeping every so often.

I like the idea of having a single lock and letting the kernel
scheduler schedule the individual threads. It's simple and it avoids
us having to think about the possible races that could occur in
choosing the next thread to run. I'm thinking particularly that there
could be a race when one thread is choosing another to run, and
another thread is finishing a blocking syscall and marking itself as
ready to run.

Paul.
|
From: Paul M. <pa...@sa...> - 2004-12-17 22:35:49
|
Greg Parker writes:

> I'm working on a port of Valgrind to Mac OS X, based on Paul
> Mackerras's Linux/PPC port. After about a week of bringup I
> have TextEdit.app running in simple recompilation mode - no
> optimization, no instrumentation, but mostly faithful simulation.

Cool!

> I started with Paul's "valgrind-2.3.0.CVS-ppc-tar.bz2", because
> it looked newest. Is there something else I should be working
> with?

I just put up a new tarball, which is up to date with CVS. It's at:

http://ozlabs.org/~paulus/valgrind-2.3.0.CVS-ppc-041217.tar.bz2

> I have several PPC codegen fixes that I'll clean up in
> the next week or so, including a correction to stwux and similar;
> a correction to mfvrsave/mtvrsave; and a still-incomplete
> implementation of lswx/stswx.

Interesting - could you send them to me?

> My solution is as follows: Each thread in the inferior is a real
> Mach thread. Valgrind contains no scheduler and no reimplementation
> of the threading primitives. Instead, a single coarse-grained mutex
> is used to ensure that only one thread is executing in the Valgrind
> core at a time. If some thread is about to start a blocking syscall,
> Valgrind's syscall wrapper relinquishes the mutex, executes the syscall,
> and then blocks on the mutex before continuing execution. New threads
> block on the mutex before starting at their simulated entry point.
> Threads that have exhausted their basic block counter release the
> mutex and yield. The thread_suspend() trap throws in a few more
> curves, but nothing insurmountable. Almost everything else is handled
> automatically by the kernel's scheduler and threading primitives.

Interesting; Nick and I were discussing using almost an identical
scheme under Linux just this week.

Paul.
|
From: Greg P. <gp...@us...> - 2004-12-17 20:15:26
|
Jeremy Fitzhardinge writes:

> I'm a bit worried about the precise details of what happens when the
> target thread is transitioning into and out of a syscall. In Unix,
> there are some very tricky races between signals and blocking syscalls,
> which required some particularly nasty/hairy/tricky code to deal with.
> It sounds to me that thread_suspend & friends could face the same
> problems.

Looks like this area will need careful consideration. My first guess
is that the definitions of thread_suspend and thread_abort (which
don't promise much) will help make correct programs simulate correctly,
though incorrect programs may not always be caught.

> > [thread_suspend]
> I presume that if the target thread isn't one of ours, we should just
> pass it through as-is?

Exactly.

> Is there any notion of blocking vs non-blocking RPCs? Can we
> distinguish them?

The RPC is always synchronous, but I don't know of a way to identify
which ones are going to be short. I currently assume that they are
all potential blockers. This might be the right thing to do in general,
given the threat of a user-level exception handler that needs to run
before the syscall completes.

> Is there anything which doesn't map to the mmap/munmap/mprotect model?
> How common is sharing memory with other processes?

Those plus mremap and msync probably handle most operations within
a process, but they are incomplete for cross-process operations.
For example, I can request that a particular address range in some
process be mapped at some other address in my process (directly or
copy-on-write).

Memory sharing is pervasive at the application level. The window
server in particular uses it heavily.

> What restrictions do we have on the use of the address space?

The address space is 4 GB. None of it is reserved by the kernel.
Some of it is reserved for various purposes, but most of those
restrictions can be dodged by giving appropriate flags to the
dynamic linker, I think. The most inflexible constraints are:

1. Executables are generally compiled non-relocatable starting at
   0x1000. This can't be changed.

2. The highest part of memory (0xfffe0000+) contains some user code
   from the C library and the Objective-C runtime. If necessary,
   these could be avoided in Valgrind's codegen. Also present is the
   shared pasteboard; I don't know if it can be moved.

3. The address range 0x80000000..0xc0000000 is strongly preferred for
   mapping shared libraries and the main stack. This could probably
   be avoided with a multi-stage launcher like that used on Linux. I
   don't currently use a multi-stage launcher, but I'm not yet trying
   to watch any memory operations.

The presence of the executable in low memory and libc code in high
memory means that a single user area / Valgrind area split in the
address space won't work.

> > * task_terminate: If the target task is mach_task_self(), this
> > pretty much looks like exit().
>
> So a task is the whole collection of threads?

Yes. A Mach task is an address space plus a set of threads. For
everything we'd care about, "task" is equivalent to "process".

> > * thread_create: The new thread needs Valgrind thread data attached
> > to it. No register state is attached to the thread here.
> >
> > * thread_create_running: The new thread needs Valgrind thread data
> > attached to it. The thread's starting PC needs to be replaced by
> > a thunk that switches to the simulator first.
>
> Well, for these, couldn't we create a thread and start it, setting the
> state to Suspended or something; it would only become runnable once
> thread_create_running got called on it?

thread_create_running both creates and starts the thread; if we don't
mangle the parameters going into thread_create_running, then we'll
never get another chance. However, this mangling is easy and already
works. For thread_create, we'd create the Valgrind thread shadow and
wait for someone to call thread_set_state and thread_resume to finish
launching the thread.

> > * thread_set_state: Writes a suspended thread's registers. Valgrind
> > would manipulate the virtual register state. If the PC were changed,
> > Valgrind would need to replace it with a thunk.
>
> No, I think it would just need to update the virtual PC. When it starts
> running that thread again, it will be mapped to some generated code.

Yes, you're right. If thread_set_state is called on a thread that is
already running, then that thread must currently be inside a
Valgrind-controlled syscall or yield. The real PC should be unchanged,
and when the thread comes out of the syscall or yield we will see its
new virtual PC and push it back into generated code.

If the thread has never been run before (i.e. has just been created
with thread_create), then we do need the thunk. This would work almost
like the thread_create_running case.

> > * thread_terminate: Destroys a thread. Valgrind would need to
> > clean up its own thread data, and (like thread_suspend) make
> > sure the dying thread didn't take any Valgrind locks with it.
>
> Is there any notion of passing thread termination info to another thread
> (like pthread_join)?

No, not via thread_terminate. All of that is handled in the user-space
pthread library. (I think the pthread implementation records the
return value in the pthread struct and immediately terminates the Mach
thread, even if pthread_join() is not yet complete. In any case, it's
not our problem.)

> So is the exception handler a thread which listens on a port for
> exceptions to appear, and then does the appropriate thing? In the case
> of a Unix-like signal, it would use the thread_suspend/set_state/resume
> mechanism to deliver the signal?

Precisely. My expectation is that Valgrind can do the
suspend/abort/set_state/resume sequence itself instead of leaving it
to the default handler. Then we can more easily avoid delivering
signals to threads in the middle of a basic block or a syscall
transition.

--
Greg Parker     gp...@us...     gp...@se...
|
From: Nicholas N. <nj...@ca...> - 2004-12-17 11:35:14
|
On Thu, 16 Dec 2004, Jeremy Fitzhardinge wrote:

> Nick has already made a good start on that; there's now a good framework
> for determining where various pieces of code should live, depending on
> whether they're CPU, OS or CPU+OS specific. There's still a lot of
> stuff to be moved, but that process will necessarily be driven by ports.

The CPU-specific bits have been factored out pretty well, mostly
thanks to Paul's PPC work -- I'd guess about 90% done, at least for
32-bit archs. Getting 64-bit cleanness will take more work.

The factoring out of the OS-specific and CPU+OS-specific parts is much
less well done, because I just abstracted out a few really obvious
things. A real port will identify lots more pieces of Linux-specific
code in the current Valgrind. The comment-out-code-as-necessary
approach is a perfectly good one, at least in the initial stages of a
port.

N
|
From: Jeremy F. <je...@go...> - 2004-12-17 08:52:55
|
On Thu, 2004-12-16 at 21:19 -0500, Greg Parker wrote:
> > Is this used much? Is it done cross-process? (Could we see threads
> > being asynchronously suspended by someone outside of Valgrind's
> > control?)
>
> thread_suspend() is common internal to a process. In particular, the
> Objective-C runtime uses a "suspend all other threads" mechanism to
> avoid locking in an otherwise time-critical piece of code.
I presume this is for stuff which rarely changes and is overwhelmingly
read-only; otherwise thread_suspend seems like a very heavyweight way of
doing locking.
> It's even possible to install a thread into another process by
> calling thread_create() with an appropriate task_t. Lots of "system
> hacks" do just this to change program behavior. [...]
> Valgrind could periodically check the task's
> thread list, and complain or panic if there's a thread it doesn't
> recognize.
Nah, if someone does that, presumably they know what they're doing (ie,
some debugging tool).
>
> > Do we need to tell the kernel about thread_suspend calls at all? A
> > simpler solution would be to implement it entirely within the scheduler
> > by adding a "suspended" thread state. Doing it to yourself would
> > immediately give up the CPU; doing it to someone else means they won't
> > get it until they're resumed.
>
> Hiding thread_suspend() from the kernel is possible, but the complexities
> get bigger the more I think about it. You'd need a suspend count; easy.
> You'd need to catch threads on the way out of system calls; not hard.
> You'd need to do something with thread_abort(), which does interrupt the
> syscall that a suspended thread is in; not so easy. You'd need to generate
> the expected page fault if a thread suspended in a syscall is aborted but
> then thread_set_state() is not called to change the registers before
> the thread resumes; yuck. I expect it will be easier to let the kernel
> do the work and add just a bit of handling for Valgrind's locks.
I'm a bit worried about the precise details of what happens when the
target thread is transitioning into and out of a syscall. In Unix,
there are some very tricky races between signals and blocking syscalls,
which required some particularly nasty/hairy/tricky code to deal with.
It sounds to me that thread_suspend & friends could face the same
problems.
For example, when a thread is about to enter a syscall and schedule a
new thread, it would look like:
1. next = select_next()
2. next->state = RUNNABLE
3. me->state = SYSWAIT
4. up(next)
* from this point on, we could be suspended at any moment
* A: at this instant; the syscall hasn't run yet
5. do_syscall(...) - B: suspend in syscall
* C: syscall complete, results not recorded
6. update VCPU state with syscall results
* D: syscall complete, results recorded
7. me->state = RUNNABLE
8. down(me) - E: thread quiescent
In states A-E, we need to be careful that we can sanely emulate whatever
happens in an uninstrumented program. The particularly problematic
state is C (and maybe B), because the syscall has completed, and
presumably caused some side effects, but the thread's state hasn't been
updated to reflect that.
Maybe Mach has a neat solution for this. The ideal would be a way of
deferring suspends until the syscall has been entered, and then
atomically deferring them again as soon as it finishes.
How does a page fault manifest itself?
> > > Several other Mach calls need to be caught and handled specially,
> > > but as far as I can see the rest are all straightforward.
> >
> > I'm interested in the details.
>
> First of all, all Mach calls use a message-based RPC interface, and
> they all pass through a single system call. Valgrind needs to catch
> that system call, parse the message, and decide what Mach call is
> intended based on that. There's at least one case where the RPC
> message ID is the same for two different calls, depending on who
> the recipient of the message is. Hopefully there won't be any
> interesting cases with hard-to-resolve conflicts.
I presume that if the target thread isn't one of ours, we should just
pass it through as-is?
Is there any notion of blocking vs non-blocking RPCs? Can we
distinguish them?
> Interesting Mach calls for Valgrind include:
>
> * Many virtual memory operations. There's a bunch of them, and they
> all need to be detected so Valgrind's memory map can be updated.
> On the plus side, Valgrind can use the vm_region() call to get
> info about the kernel's map (not unlike /proc/self/maps, I think.
> Mac OS X has no /proc filesystem.)
Is there anything which doesn't map to the mmap/munmap/mprotect model?
How common is sharing memory with other processes?
What restrictions do we have on the use of the address space?
> * task_set_emulation: Installs user-space system call handlers.
> I don't know whether anyone uses this, or even whether it
> is implemented on Mac OS X. If it is used, Valgrind would
> probably catch it and substitute a thunk of its own that
> would switch back into the simulator.
Yep. In the meantime, we just fail it and see what breaks.
> * task_terminate: If the target task is mach_task_self(), this
> pretty much looks like exit().
So a task is the whole collection of threads?
> * thread_create: The new thread needs Valgrind thread data attached
> to it. No register state is attached to the thread here.
>
> * thread_create_running: The new thread needs Valgrind thread data
> attached to it. The thread's starting PC needs to be replaced by
> a thunk that switches to the simulator first.
Well, for these, couldn't we create a thread and start it, setting the
state to Suspended or something; it would only become runnable once
thread_create_running got called on it?
>
> * thread_get_state: Reads a suspended thread's registers. Valgrind
> would substitute the virtual register state.
>
> * thread_set_state: Writes a suspended thread's registers. Valgrind
> would manipulate the virtual register state. If the PC were changed,
> Valgrind would need to replace it with a thunk.
No, I think it would just need to update the virtual PC. When it starts
running that thread again, it will be mapped to some generated code.
> * thread_terminate: Destroys a thread. Valgrind would need to
> clean up its own thread data, and (like thread_suspend) make
> sure the dying thread didn't take any Valgrind locks with it.
Is there any notion of passing thread termination info to another thread
(like pthread_join)?
> * Exceptions: Mach exceptions are sort of like Unix signals,
> but the exception is delivered to a designated thread on
> behalf of the thread actually taking the exception.
> Dealing with exceptions might be easier for Valgrind than
> dealing with signals. Signals on Mac OS X originate as
> Mach exceptions, so it might be possible for Valgrind to
> catch them at the exception level and avoid some of the
> signal ugliness. (At least we could avoid "arbitrary thread
> unexpectedly jumps to arbitrary handler".)
So is the exception handler a thread which listens on a port for
exceptions to appear, and then does the appropriate thing? In the case
of a Unix-like signal, it would use the thread_suspend/set_state/resume
mechanism to deliver the signal?
J
|
|
From: <js...@ac...> - 2004-12-17 03:57:18
|
Nightly build on phoenix ( SuSE 9.1 ) started at 2004-12-17 03:50:00 GMT

Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done

Last 20 lines of log.verbose follow:

insn_sse: valgrind ./insn_sse
insn_sse2: (skipping, prereq failed: ../../../tests/cputest x86-sse2)
int: valgrind ./int
rm: cannot remove `vgcore.pid*': No such file or directory
(cleanup operation failed: rm vgcore.pid*)
pushpopseg: valgrind ./pushpopseg
rcl_assert: valgrind ./rcl_assert
seg_override: valgrind ./seg_override
-- Finished tests in none/tests/x86 ------------------------------------
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 187 tests, 5 stderr failures, 0 stdout failures =================
corecheck/tests/as_mmap (stderr)
corecheck/tests/fdleak_fcntl (stderr)
memcheck/tests/scalar (stderr)
memcheck/tests/writev (stderr)
memcheck/tests/zeropage (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <to...@co...> - 2004-12-17 03:25:45
|
Nightly build on dunsmere ( Fedora Core 3 ) started at 2004-12-17 03:20:03 GMT

Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done

Last 20 lines of log.verbose follow:

-- Finished tests in none/tests/x86 ------------------------------------
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 192 tests, 12 stderr failures, 1 stdout failure =================
corecheck/tests/fdleak_cmsg (stderr)
corecheck/tests/fdleak_fcntl (stderr)
corecheck/tests/fdleak_ipv4 (stderr)
corecheck/tests/fdleak_socketpair (stderr)
memcheck/tests/badpoll (stderr)
memcheck/tests/buflen_check (stderr)
memcheck/tests/execve (stderr)
memcheck/tests/execve2 (stderr)
memcheck/tests/scalar (stderr)
memcheck/tests/scalar_exit_group (stderr)
memcheck/tests/scalar_supp (stderr)
memcheck/tests/writev (stderr)
none/tests/exec-sigmask (stdout)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-12-17 03:21:51
|
Nightly build on audi ( Red Hat 9 ) started at 2004-12-17 03:15:04 GMT

Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done

Last 20 lines of log.verbose follow:

seg_override: valgrind ./seg_override
-- Finished tests in none/tests/x86 ------------------------------------
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 192 tests, 12 stderr failures, 0 stdout failures =================
corecheck/tests/fdleak_cmsg (stderr)
corecheck/tests/fdleak_fcntl (stderr)
corecheck/tests/fdleak_ipv4 (stderr)
corecheck/tests/fdleak_socketpair (stderr)
memcheck/tests/badpoll (stderr)
memcheck/tests/buflen_check (stderr)
memcheck/tests/execve (stderr)
memcheck/tests/execve2 (stderr)
memcheck/tests/scalar (stderr)
memcheck/tests/scalar_exit_group (stderr)
memcheck/tests/scalar_supp (stderr)
memcheck/tests/writev (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-12-17 03:13:51
|
Nightly build on ginetta ( Red Hat 8.0 ) started at 2004-12-17 03:10:02 GMT

Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done

Last 20 lines of log.verbose follow:

insn_cmov: valgrind ./insn_cmov
insn_fpu: valgrind ./insn_fpu
insn_mmx: valgrind ./insn_mmx
insn_mmxext: valgrind ./insn_mmxext
insn_sse: valgrind ./insn_sse
insn_sse2: (skipping, prereq failed: ../../../tests/cputest x86-sse2)
int: valgrind ./int
rm: cannot remove `vgcore.pid*': No such file or directory
(cleanup operation failed: rm vgcore.pid*)
pushpopseg: valgrind ./pushpopseg
rcl_assert: valgrind ./rcl_assert
seg_override: valgrind ./seg_override
-- Finished tests in none/tests/x86 ------------------------------------
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 192 tests, 1 stderr failure, 0 stdout failures =================
memcheck/tests/scalar (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-12-17 03:08:34
|
Nightly build on alvis ( Red Hat 7.3 ) started at 2004-12-17 03:05:02 GMT

Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done

Last 20 lines of log.verbose follow:

insn_mmxext: valgrind ./insn_mmxext
insn_sse: valgrind ./insn_sse
insn_sse2: (skipping, prereq failed: ../../../tests/cputest x86-sse2)
int: valgrind ./int
rm: cannot remove `vgcore.pid*': No such file or directory
(cleanup operation failed: rm vgcore.pid*)
pushpopseg: valgrind ./pushpopseg
rcl_assert: valgrind ./rcl_assert
seg_override: valgrind ./seg_override
-- Finished tests in none/tests/x86 ------------------------------------
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 192 tests, 3 stderr failures, 1 stdout failure =================
memcheck/tests/scalar (stderr)
memcheck/tests/vgtest_ume (stderr)
none/tests/susphello (stdout)
none/tests/susphello (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-12-17 03:04:08
|
Nightly build on standard (Red Hat 7.2) started at 2004-12-17 03:00:02 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow

insn_mmxext: valgrind ./insn_mmxext
insn_sse: valgrind ./insn_sse
insn_sse2: (skipping, prereq failed: ../../../tests/cputest x86-sse2)
int: valgrind ./int
rm: cannot remove `vgcore.pid*': No such file or directory
(cleanup operation failed: rm vgcore.pid*)
pushpopseg: valgrind ./pushpopseg
rcl_assert: valgrind ./rcl_assert
seg_override: valgrind ./seg_override
-- Finished tests in none/tests/x86 ------------------------------------
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 192 tests, 3 stderr failures, 1 stdout failure =================
memcheck/tests/scalar (stderr)
memcheck/tests/vgtest_ume (stderr)
none/tests/susphello (stdout)
none/tests/susphello (stderr)
make: *** [regtest] Error 1
|
From: Greg P. <gp...@us...> - 2004-12-17 02:19:42
|
Jeremy Fitzhardinge writes:

> Perhaps; the determinism is really only meaningful for completely
> CPU-bound programs anyway. The real point I was trying to make is
> that it's round-robin, and therefore cannot cause starvation.
>
> I'm concerned that using the kernel scheduler, a CPU-bound thread
> might get starved out because its amount of work done per scheduling
> quantum is much smaller under Valgrind.

Ah, I see. In that case we can cheat on Mach: there are calls to set
the scheduling policy, and round-robin is one of the options. Some
programs use soft-real-time threads for video or audio quality, but
they would probably overflow their quanta anyway under Valgrind.

> > Mach's thread_suspend() call allows a thread to arbitrarily halt
> > another thread's execution until thread_resume() is called.
>
> What's the behaviour of thread_suspend on a thread which is already
> sleeping/blocked in a syscall? Does it interrupt the syscall, or does
> it go into a suspended state when the syscall finishes? How does a
> thread get un-suspended?

thread_suspend() never interrupts syscalls. The kernel "may" pause the
thread in the middle of a syscall, or it "may" continue the thread
until it reaches the end of a syscall and then pause it there.

Syscalls can be interrupted by calling thread_suspend() followed by
thread_abort(). Then one of two things will happen. Either the thread
requesting the interruption will call thread_set_state() to change the
PC and register state before the interrupted thread resumes; or else
the aborted thread will take a page fault once it is resumed. The
suspend/abort/set_state/resume sequence looks a lot like a signal.

Threads are unsuspended when thread_resume() is called enough times to
balance the thread_suspend() calls.

> Is this used much? Is it done cross-process? (Could we see threads
> being asynchronously suspended by someone outside of Valgrind's
> control?)

thread_suspend() is common within a process. In particular, the
Objective-C runtime uses a "suspend all other threads" mechanism to
avoid locking in an otherwise time-critical piece of code.

In a Mach environment, anything can happen across process boundaries.
However, I don't think cross-process suspension is common outside
debugging tools, so it should be safe to ignore.

It's even possible to install a thread into another process by calling
thread_create() with an appropriate task_t. Lots of "system hacks" do
just this to change program behavior. I don't think there's any way
Valgrind could hope to catch this, so the answer will probably be
"don't do that". Valgrind could periodically check the task's thread
list, and complain or panic if there's a thread it doesn't recognize.

> Do we need to tell the kernel about thread_suspend calls at all? A
> simpler solution would be to implement it entirely within the
> scheduler by adding a "suspended" thread state. Doing it to yourself
> would immediately give up the CPU; doing it to someone else means
> they won't get it until they're resumed.

Hiding thread_suspend() from the kernel is possible, but the
complexities get bigger the more I think about it. You'd need a
suspend count; easy. You'd need to catch threads on the way out of
system calls; not hard. You'd need to do something with
thread_abort(), which does interrupt the syscall that a suspended
thread is in; not so easy. You'd need to generate the expected page
fault if a thread suspended in a syscall is aborted but then
thread_set_state() is not called to change the registers before the
thread resumes; yuck. I expect it will be easier to let the kernel do
the work and add just a bit of handling for Valgrind's locks.

> Unix/Linux has a similar problem with SIGSTOP, which has the worse
> property of being multicast as well as unicast (I'm assuming
> thread_suspend can only apply to one thread at a time).

There is no mass suspension operation that works at anything less than
the task level, so we're safe here. Unix signals still exist, but
signals (and Mach exceptions, which are somewhat signal-like) are
uncommon enough that Valgrind would still be useful without them.

> > Several other Mach calls need to be caught and handled specially,
> > but as far as I can see the rest are all straightforward.
>
> I'm interested in the details.

First of all, all Mach calls use a message-based RPC interface, and
they all pass through a single system call. Valgrind needs to catch
that system call, parse the message, and decide what Mach call is
intended based on that. There's at least one case where the RPC
message ID is the same for two different calls, depending on who the
recipient of the message is. Hopefully there won't be any interesting
cases with hard-to-resolve conflicts.

Interesting Mach calls for Valgrind include:

* Many virtual memory operations. There's a bunch of them, and they
  all need to be detected so Valgrind's memory map can be updated. On
  the plus side, Valgrind can use the vm_region() call to get info
  about the kernel's map (not unlike /proc/self/maps, I think. Mac OS
  X has no /proc filesystem.)

* task_set_emulation: Installs user-space system call handlers. I
  don't know whether anyone uses this, or even whether it is
  implemented on Mac OS X. If it is used, Valgrind would probably
  catch it and substitute a thunk of its own that would switch back
  into the simulator.

* task_terminate: If the target task is mach_task_self(), this pretty
  much looks like exit().

* thread_create: The new thread needs Valgrind thread data attached
  to it. No register state is attached to the thread here.

* thread_create_running: The new thread needs Valgrind thread data
  attached to it. The thread's starting PC needs to be replaced by a
  thunk that switches to the simulator first.

* thread_get_state: Reads a suspended thread's registers. Valgrind
  would substitute the virtual register state.

* thread_set_state: Writes a suspended thread's registers. Valgrind
  would manipulate the virtual register state. If the PC were changed,
  Valgrind would need to replace it with a thunk.

* thread_terminate: Destroys a thread. Valgrind would need to clean up
  its own thread data, and (like thread_suspend) make sure the dying
  thread didn't take any Valgrind locks with it.

* Scheduling policy: Valgrind probably doesn't care. Valgrind is
  probably intrusive enough to break soft-real-time threads no matter
  what, and other programs probably aren't sensitive enough to
  scheduling differences to care.

* Exceptions: Mach exceptions are sort of like Unix signals, but the
  exception is delivered to a designated thread on behalf of the
  thread actually taking the exception. Dealing with exceptions might
  be easier for Valgrind than dealing with signals. Signals on Mac OS
  X originate as Mach exceptions, so it might be possible for Valgrind
  to catch them at the exception level and avoid some of the signal
  ugliness. (At least we could avoid "arbitrary thread unexpectedly
  jumps to arbitrary handler".)

--
Greg Parker gp...@us... gp...@se...
|
From: Jeremy F. <je...@go...> - 2004-12-17 00:48:07
|
On Thu, 2004-12-16 at 19:15 -0500, Greg Parker wrote:

> Keeping the deterministic scheduler might in fact be worthwhile,
> especially if it makes fault reproduction easier. On the other hand,
> in an application environment with networking, user events, and a
> window server, you might not see reproducibility even with a friendly
> scheduler.

Perhaps; the determinism is really only meaningful for completely
CPU-bound programs anyway. The real point I was trying to make is that
it's round-robin, and therefore cannot cause starvation.

I'm concerned that using the kernel scheduler, a CPU-bound thread might
get starved out because its amount of work done per scheduling quantum
is much smaller under Valgrind.

> > So, I was thinking, one mutex per thread, and each thread chooses
> > and wakes its successor as it is about to give up the CPU.
>
> I think such a mechanism would work. It wouldn't be hard as long as
> the code that yields and runs syscalls is careful about memory
> synchronization and execution ordering during the handoff.

Yep. I'm assuming that doing a syscall is a memory barrier (it
certainly is on x86). The code would basically look like:

    next = select_successor()
    next->state = Runnable
    me->state = SysWait  /* or perhaps Suspend; see below */
    wake(next)
    /* no global state touched here */
    sleep()  /* might return immediately if next has already woken us */

> Mach's thread_suspend() call allows a thread to arbitrarily halt
> another thread's execution until thread_resume() is called. There are
> two cases where careless handling of a Valgrind mutex would cause
> deadlock:
>
> 1. Current thread releases the Valgrind lock and then calls
>    thread_suspend() on another thread; but the target thread acquires
>    the Valgrind lock immediately after the current thread releases it
>    and before the target gets suspended. Now the Valgrind lock is
>    held by a suspended thread, and all other threads are waiting for
>    that lock.
>    Solution: keep the Valgrind lock when you call thread_suspend().
>
> 2. Current thread keeps the Valgrind lock while calling
>    thread_suspend(mach_thread_self()), putting itself to sleep while
>    it still holds the Valgrind lock.
>    Solution: keep the Valgrind lock when you call thread_suspend(),
>    unless you are calling thread_suspend(mach_thread_self()).

What's the behaviour of thread_suspend on a thread which is already
sleeping/blocked in a syscall? Does it interrupt the syscall, or does
it go into a suspended state when the syscall finishes? How does a
thread get un-suspended?

Is this used much? Is it done cross-process? (Could we see threads
being asynchronously suspended by someone outside of Valgrind's
control?)

Do we need to tell the kernel about thread_suspend calls at all? A
simpler solution would be to implement it entirely within the scheduler
by adding a "suspended" thread state. Doing it to yourself would
immediately give up the CPU; doing it to someone else means they won't
get it until they're resumed.

Unix/Linux has a similar problem with SIGSTOP, which has the worse
property of being multicast as well as unicast (I'm assuming
thread_suspend can only apply to one thread at a time).

> Several other Mach calls need to be caught and handled specially, but
> as far as I can see the rest are all straightforward.

I'm interested in the details.

J
|
From: Greg P. <gp...@us...> - 2004-12-17 00:15:57
|
Jeremy Fitzhardinge writes:

> Nick has already made a good start on that; there's now a good
> framework for determining where various pieces of code should live,
> depending on whether they're CPU, OS or CPU+OS specific. There's
> still a lot of stuff to be moved, but that process will necessarily
> be driven by ports.

This sounds good. I'll take a look at it sometime from my Mac OS
X-centric view.

> I was planning on keeping the existing deterministic scheduling,
> rather than letting the kernel scheduler make the decision; I'm
> concerned that we'd get strange competition between IO and CPU bound
> threads, since CPU-bound threads will be about 20-50x more CPU
> consuming, but IO-bound threads will look much the same to the
> kernel.

Keeping the deterministic scheduler might in fact be worthwhile,
especially if it makes fault reproduction easier. On the other hand,
in an application environment with networking, user events, and a
window server, you might not see reproducibility even with a friendly
scheduler.

> So, I was thinking, one mutex per thread, and each thread chooses and
> wakes its successor as it is about to give up the CPU.

I think such a mechanism would work. It wouldn't be hard as long as
the code that yields and runs syscalls is careful about memory
synchronization and execution ordering during the handoff.

> BTW, what's the thread_suspend trap, and what difficulties does it
> introduce?

Mach's thread_suspend() call allows a thread to arbitrarily halt
another thread's execution until thread_resume() is called. There are
two cases where careless handling of a Valgrind mutex would cause
deadlock:

1. Current thread releases the Valgrind lock and then calls
   thread_suspend() on another thread; but the target thread acquires
   the Valgrind lock immediately after the current thread releases it
   and before the target gets suspended. Now the Valgrind lock is held
   by a suspended thread, and all other threads are waiting for that
   lock.
   Solution: keep the Valgrind lock when you call thread_suspend().

2. Current thread keeps the Valgrind lock while calling
   thread_suspend(mach_thread_self()), putting itself to sleep while
   it still holds the Valgrind lock.
   Solution: keep the Valgrind lock when you call thread_suspend(),
   unless you are calling thread_suspend(mach_thread_self()).

Several other Mach calls need to be caught and handled specially, but
as far as I can see the rest are all straightforward.

--
Greg Parker gp...@us... gp...@se...