From: <don...@is...> - 2010-11-01 18:12:06
Vladimir Tzankov writes:
> On 11/1/10, Don Cohen <don...@is...> wrote:
> > > If thread is interrupted while in exemption-wait the interrupt
> > > function will be executed with mutex held.
> > So if you interrupt the thread to go into the debugger, then all other
> > threads that need to get the mutex (like to return write access) are
> > stuck while you debug. And if you then interrupt another to go into
> > the debugger you can't even get into the debugger cause you're stuck
> > waiting for the mutex before entering the debugger.

I gather you agree with all above. I think you'll admit that this is
not a good thing for cases where you want most of your threads to keep
working (like in a web server) while you look around in the debugger
to see what's going wrong with one thread.

> > It seems to me that if you interrupt exemption-wait you should not
> > wait to get the lock. That's supposed to happen only before you exit
> > exemption-wait and you should be able to interrupt and go into the
> > debugger before that.
> If you are brave enough - unlock the mutex from interrupt function -
> but be careful to lock it again on exit.

This actually seems reasonable - I interrupt a thread, go into the
debugger, find that it's in exemption-wait and unlock the mutex.
Now other threads can continue to run while I examine the state of
this one. I'd like to automate the first step so that other threads
do not have to wait while I figure out that I'm in exemption-wait and
unlock the mutex. Where is doc that tells me how to do that?
When you interrupt with function NIL to debug, is that the same as
some function that I can call? I'm guessing something like
(lambda nil (cerror ...)) ?
Perhaps a function to show some backtrace of a given thread would be
useful.

> btw: debugging in presence of many threads is quite adventurous. I
> find it mostly useful for fixing deadlocks - for everything else I
> prefer logging.

I've noticed that.
It seems much more sane now that I can control which thread reads the
input I type.

> Let's explain why it works this way:
> Suppose you are stuck in pthread_cond_wait() and you want to interrupt
> it. According to POSIX standard this function never returns EINTR (if
> it gets signal). Rather it calls signal handler and continues to wait
> as if nothing happened. Within signal handler no lisp code can be
> executed (and actually very limited set of C functions are allowed).
> The only way to implement thread-interrupt while we are waiting in
> pthread_cond_wait() is to signal the condition/exemption, run the
> interrupt function and continue to wait. After we signal the exemption
> - mutex is reacquired automatically by the OS. As final result - lock
> is held when the interrupt functions is executed.

Ok, so you can never look up the stack and find that you're in
exemption-wait. If you interrupt while in that function you find
yourself ready to execute the thing after it. And at that point
exemption-wait has returned T. So it's not so easy to even see that
the thread you're considering debugging is in exemption-wait !
There really ought to be a way to find out what threads are waiting
and for what.

> > > On non-local exit from the function - all unwind-protect forms will
> > > be executed and mutex in the above example will be unlocked. In
> > > case of normal exit from interrupt function - exemption-wait will
> > > continue to wait.
> > "The function" means the one running the code above?
> "interrupt function" means the function passed as :function to
> thread-interrupt to be executed in the context of the thread.

So I interrupt the thread with function NIL, which does something like
a call to cerror after the return from exemption-wait. I then try to
return from something outside the exemption-wait loop and get stuck
back in it. The typical fight between debugger and unwind protect.
Whereas, if I did not use the :test argument to exemption-wait, but
wrote my own loop, I could return from that loop and escape the wait.
Sounds like a reason to write my own loop.
I suppose all of this also applies to the case of interrupt with
function T to terminate the thread?

> > > With native OS preemptive threads - we do not have control on the
> > > scheduler. We may influence it with mutex and exemptions.
> > This seems strange. Wouldn't one normally wish to control which
> > threads run when more are ready to run than there are processors
> > available to run them? I'm not blaming you, of course!
> > (Unless you have a lot more to do with the design than I imagine.)
> Not sure I understand. You control threads being executed via mutexes
> and exemptions - that's all.

I argued that I could implement what I need to control whether a
thread is runnable by doing something in the interrupt function that
blocks, and that this was all I needed to write my own scheduler.
Did you believe that argument?

> > > > BTW, it seems to me that there should also be some way to
> > > > control which of several threads waiting to run should be
> > > > scheduled next. Why is that either not possible or a bad idea?
> > > Use distinct exemptions for this (and share the mutex if needed).
> > Just off hand I don't see how this solves the problem.
> > Suppose each thread had a priority (integer) - how would you arrange
> > to always resume a thread of highest priority among those ready to
> > run?
> What do you mean by "thread ready to run"? All threads are running all
> the time until they exit - just some of them are blocked in calls like
> mutex-lock, exemption-wait, read(), etc and do not consume cpu cycles.
> Only OS kernel knows when a thread is in such blocked state and why.

By ready to run I mean not blocked as described above.
I'd like to add another kind of blocking controlled directly by my
program, e.g. function (block thread) and (unblock thread).
So you're telling me that all this is just not supported by the OS
thread implementations.
I take it that if you interrupt in a blocked read you at least see
some reading function on the stack, right? How about mutex-lock?
How about any others?

> Looks like you want to process some prioritized tasks. If so - I would
> do something like:
> 1. define a job
> 2. maintain sorted by priority collection of jobs. guard with mutex
> and signal an exemption when new highest priority item arrives.
> 3. create (pool of) thread(s) - for jobs execution. wait with
> exemption on the collection of jobs and execute when available (when
> collection is not empty).
> 4. In case all threads are busy executing jobs and new highest
> priority job arrives and you are absolutely sure you want to execute
> it immediately - spawn a new thread or implement kind of job
> cancellation in order to release thread from the pool.

This is not as good as my proposed scheduler. In particular, when a
running job is no longer as worthy as some other job waiting to run
I'd like to suspend its thread -- so it can be later resumed when it
is again worth running. I also want to be able to notify the
scheduler (a thread) when some operation blocks or unblocks so that
it can block or unblock others in compensation. It would also be
good for the scheduler to be able to read the run time of each thread
so that it can tell, e.g., that lisp is now using 2 of the four cpus
so we should try to block all but the two most urgent threads.

I view this as an inadequacy of the OS thread implementations.
I'm guessing that some thread implementations support the sort of
thing I'm describing and some don't, so it's probably only an
inadequacy of some of them.
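
For reference, steps 1 to 3 of the quoted job-pool suggestion can be
sketched roughly as below. This is only an illustration and not CLISP
code: it is written in Python, where threading.Condition plays the
role of the mutex plus exemption pair. The names PriorityJobPool,
submit, close and demo are made up for this sketch. Step 4 (spawning
an extra thread or cancelling a job) is not covered, and neither is
the suspend/resume behaviour asked for in the reply.

```python
import heapq
import itertools
import threading


class PriorityJobPool:
    """Hypothetical pool: workers always take the highest-priority pending job."""

    def __init__(self, workers=2):
        self._jobs = []                     # heap of (-priority, seq, job)
        self._seq = itertools.count()       # tie-breaker: FIFO within one priority
        self._cond = threading.Condition()  # stands in for mutex + exemption
        self._closed = False
        self._threads = [threading.Thread(target=self._work) for _ in range(workers)]
        for t in self._threads:
            t.start()

    def submit(self, priority, job):
        """Queue a callable; a larger priority number runs earlier."""
        with self._cond:
            heapq.heappush(self._jobs, (-priority, next(self._seq), job))
            self._cond.notify()             # wake one waiting worker

    def close(self):
        """Let workers drain the queue, then wait for them to exit."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        for t in self._threads:
            t.join()

    def _work(self):
        while True:
            with self._cond:
                # Hand-written wait loop; wait() reacquires the lock on return.
                while not self._jobs and not self._closed:
                    self._cond.wait()
                if not self._jobs:
                    return                  # closed and nothing left to run
                _, _, job = heapq.heappop(self._jobs)
            job()                           # run the job with the lock released


def demo():
    """Queue jobs behind a gate and return the order in which they ran."""
    order = []
    gate = threading.Event()
    pool = PriorityJobPool(workers=1)
    pool.submit(100, gate.wait)             # keeps the single worker busy
    for p in (1, 5, 3):
        pool.submit(p, lambda p=p: order.append(p))
    gate.set()                              # release the worker
    pool.close()
    return order


if __name__ == "__main__":
    print(demo())
```

With a single worker held at the gate until all three jobs are queued,
demo() returns the priorities in descending order. Note that the job
itself runs outside the lock, so a long job does not stop other
threads from submitting work.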