On Sat, 2004-05-22 at 14:04, Nicolai Haehnle wrote:
> It seems to me as if DRM(unlock) in drm_drv.h unlocks without checking
> whether the caller actually holds the global lock. There is no
> LOCK_TEST_WITH_RETURN or similar, and the helper function lock_transfer has
> no check in it either.
> Did I miss something, or is this intended behaviour? It certainly seems
> strange to me.

True. Note that the lock ioctls are only used on contention, but still.

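For illustration, the missing ownership check could look roughly like the sketch below. It is a single-threaded userspace model, not the DRM code: the lock-word layout and the names sketch_lock, sketch_lock_take and sketch_unlock_checked are all hypothetical, and a real implementation would need an atomic compare-and-swap on the lock word.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical lock-word layout for this sketch: the top bit marks the
 * lock as held, the low bits carry the holder's context id. */
#define SKETCH_LOCK_HELD 0x80000000u
#define SKETCH_CTX_MASK  0x7fffffffu

struct sketch_lock {
    uint32_t word;
};

/* Take the lock for context ctx; returns -EBUSY if it is already held.
 * Not atomic: a real version would use compare-and-swap. */
static int sketch_lock_take(struct sketch_lock *l, uint32_t ctx)
{
    if (l->word & SKETCH_LOCK_HELD)
        return -EBUSY;
    l->word = SKETCH_LOCK_HELD | (ctx & SKETCH_CTX_MASK);
    return 0;
}

/* Release the lock only if ctx is the current holder. */
static int sketch_unlock_checked(struct sketch_lock *l, uint32_t ctx)
{
    if (!(l->word & SKETCH_LOCK_HELD))
        return -EINVAL;               /* nobody holds the lock */
    if ((l->word & SKETCH_CTX_MASK) != (ctx & SKETCH_CTX_MASK))
        return -EINVAL;               /* held by a different context */
    l->word = 0;
    return 0;
}
```

With this check, an unlock from a context that does not hold the lock is rejected instead of silently releasing someone else's lock.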
> Also, it is possible for a DRI client to effectively lock up the entire
> machine simply by entering an endless loop after taking the lock. I suppose
> one could still log in remotely and kill the offending process, but that's
> not a realistic option for most people. Switching to a different VT or
> killing the X server does not work, because the X server has to take the
> DRI lock in the process.
> This is a problem that I want to fix (it makes playing around with the R300
> hack Vladimir Dergachev posted an infinite-rebooting nightmare), but I am
> unsure what the best solution would be.
> As far as I can see, the problem is two-fold: One, the X server must be able
> to "break" the lock, and two, it (or the DRM) must somehow disable the
> offending DRI client to prevent the problem from reoccurring.
> I think the simplest solution would look something like this:
> Whenever DRM(lock) is called by a privileged client (i.e. the X server), and
> it needs to sleep because the lock is held by an unprivileged client, a
> watchdog timer is started before we schedule. DRM(unlock) unconditionally
> stops this watchdog timer.
> When the watchdog timer fires, it releases the lock and/or kills the
> offending DRI client.
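
The watchdog scheme described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: time is a plain tick counter rather than a kernel timer, "killing" the client is reduced to recording its id, and every name (sketch_dev, sketch_lock, sketch_unlock, sketch_watchdog_tick, SKETCH_WATCHDOG_TICKS) is hypothetical and not part of the DRM.

```c
#include <stdbool.h>

/* Hypothetical timeout, in ticks, before the watchdog breaks the lock. */
#define SKETCH_WATCHDOG_TICKS 5

struct sketch_dev {
    bool lock_held;
    int holder;              /* client id of the lock holder, or -1 */
    bool holder_privileged;
    bool watchdog_armed;
    unsigned long deadline;  /* tick at which the watchdog fires */
    int killed;              /* id of the last client disabled, or -1 */
};

static void sketch_init(struct sketch_dev *d)
{
    d->lock_held = false;
    d->holder = -1;
    d->holder_privileged = false;
    d->watchdog_armed = false;
    d->deadline = 0;
    d->killed = -1;
}

/* Try to take the lock; returns true on success. If a privileged caller
 * finds the lock held by an unprivileged client, the watchdog is armed
 * before the caller would go to sleep. */
static bool sketch_lock(struct sketch_dev *d, int client, bool privileged,
                        unsigned long now)
{
    if (!d->lock_held) {
        d->lock_held = true;
        d->holder = client;
        d->holder_privileged = privileged;
        return true;
    }
    if (privileged && !d->holder_privileged && !d->watchdog_armed) {
        d->watchdog_armed = true;
        d->deadline = now + SKETCH_WATCHDOG_TICKS;
    }
    return false;            /* caller would sleep and retry */
}

/* Unlock unconditionally stops the watchdog, as proposed. */
static void sketch_unlock(struct sketch_dev *d)
{
    d->watchdog_armed = false;
    d->lock_held = false;
    d->holder = -1;
}

/* Called once per tick. When the deadline is reached, record the holder
 * as disabled (stand-in for killing it) and break the lock.
 * Returns true if the watchdog fired. */
static bool sketch_watchdog_tick(struct sketch_dev *d, unsigned long now)
{
    if (!d->watchdog_armed || now < d->deadline)
        return false;
    d->killed = d->holder;
    sketch_unlock(d);
    return true;
}
```

The model leaves out the hard parts raised in this thread: it says nothing about a client that is inside an ioctl when the watchdog fires, or about a wedged chip.
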
> Side question: Is killing the offending DRI client enough? When the process
> is killed, the /dev/drm fd is closed, which should automatically release
> the lock. On the other hand, I'm pretty sure that we can't just kill a
> process immediately (unfortunately, I'm not familiar with process handling
> in the kernel). What if, for some reason, the process is in a state where
> it can't be killed yet?

We're screwed? :)

This sounds like an idea for you to play with, but I'm afraid it won't
be useful very often in my experience:

* getting rid of the offending client doesn't help with a wedged
  chip (some way to recover from that would be nice...)
* it doesn't help if the X server itself spins with the lock held

I agree with Keith that it's always good to have a second machine to do
serious driver hacking.

> Side question #2: Is it safe to release the DRM lock in the watchdog? There
> might be races where the offending DRI client is currently executing a DRM
> ioctl when the watchdog fires.

Not sure, but this might not be a problem when just killing the

Earthling Michel Dänzer | Debian (powerpc), X and DRI developer
Libre software enthusiast | http://svcs.affero.net/rm.php?r=daenzer