On Sunday 30 Jun 2002 8:49 pm, Keith Whitwell scribed numinously:
> Tim Smith wrote:
> > I found a few other ways of provoking the problem while I was at it,
> > and dragging an xclock window over the 3D view did it too (with a
> > window manager and "solidmove" turned on). In fact, I also managed to
> > provoke the lockup by persistently dragging xclock around over a
> > maximised glxgears in 1600x1200 but it took a lot of mouse waggling.
> > The fix turns out to be very simple of course:
> > --- radeon_state.c 27 Jun 2002 17:56:39 -0000 1.17
> > +++ radeon_state.c 29 Jun 2002 14:52:20 -0000
> > @@ -48,7 +48,8 @@ static inline void radeon_emit_clip_rect
> > DRM_DEBUG( " box: x1=%d y1=%d x2=%d y2=%d\n",
> > box->x1, box->y1, box->x2, box->y2 );
> > - BEGIN_RING( 4 );
> > + BEGIN_RING( 6 );
> > + RADEON_WAIT_UNTIL_2D_IDLE();
> > OUT_RING( CP_PACKET0( RADEON_RE_TOP_LEFT, 0 ) );
> > OUT_RING( (box->y1 << 16) | box->x1 );
> > OUT_RING( CP_PACKET0( RADEON_RE_WIDTH_HEIGHT, 0 ) );
> > In the course of poking around, I enabled the code that causes the
> > scratch registers to be written out to memory by the card when they are
> > updated, and extended the getparam ioctl so that user space could
> > obtain them with a quick ioctl rather than doing MMIO. This has made
> > things run quite a bit smoother since it no longer has to hammer the
> > bus to get the value (though it didn't fix the problem; I was wondering
> > whether or not reading the registers by MMIO would muck with the
> > command FIFO behind the CCE microengine's back, but apparently not). Is
> > there some other reason why this is a bad idea or should I prepare a
> > patch? BTW the radeonClear() throttling doesn't call delay(), so that
> > loop will get optimised out.
> Gareth tried to get this working in the initial driver, but didn't get it
> to be reliable. I've got an older card here so I can test a patch
> against that.
OK. I'll get a clean tree and make a patch against it. My tree is so full of
debug stuff it's getting silly. Linus suggested using a generic copy in RAM
to get the information to userspace, but I'll stick with the ioctl
extension for now because (a) I understand it and (b) I only touch the
radeon code that way.
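
For anyone following along, here is a rough sketch (not the driver code) of the kind of throttle loop under discussion, polling a copy of a scratch register that the card writes back to RAM. The name wait_for_frame, the max_spins bound and the wrap-safe comparison are all assumptions made for illustration. The pointer is volatile so the compiler has to re-read the value on every pass; a plain pointer with an empty loop body is the sort of thing an optimiser may hoist or drop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical throttle helper. 'scratch' points at the RAM copy of a
 * scratch register that the card updates behind the CPU's back, hence
 * the volatile qualifier. Returns the number of polls that were needed,
 * or -1 if the target frame was not reached within max_spins polls.
 */
static int wait_for_frame(const volatile uint32_t *scratch,
                          uint32_t frame, int max_spins)
{
    int spins;

    assert(scratch != NULL);

    for (spins = 0; spins < max_spins; spins++) {
        /* Signed difference keeps the test valid across counter wrap. */
        if ((int32_t)(*scratch - frame) >= 0)
            return spins;
        /* A real driver would sleep or delay here rather than spin. */
    }
    return -1;
}
```
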
> On the contents of your patch: Is it necessary to put this wait in
> emit_clip_rect? Where is the 2d activity you're waiting for coming from?
> Is it the X server, or the Clear or Swapbuffers ioctls? It would be
> good to narrow it down & only do the wait where necessary...
Actually, on the way in to work this morning I began wondering whether or
not it *is* 2D activity, or whether it's the WAIT_HOST_IDLECLEAN that
WAIT_UNTIL_2D_IDLE also does that is fixing it. I'd tended to assume it was
2D since the submenu whose going-away caused the lockup never gets
However, when I move xclock over the view and lock it up that way, it locks
up instead in waitForFrameCompletion() which never terminates. I tried
counting back in the ring buffer from where the RPTR had got to, enough to
cover the microengine's buffer and the command FIFO on the principle that
even if the microengine was emitting more into the FIFO than it was reading
from the ring that ought to give me at least a guess as to where it was
getting to, and it always pointed to somewhere in the output of
I've tried putting a wait-for-idle in RADEONLeaveServer. That (apart from
murdering performance as expected) didn't fix the problem. So I don't think
I'm running without PageFlipping, and don't see any SwapBuffers ioctls in my
debug output, so that rules that out.
I got suspicious of the clear ioctl before and put a wait-for-idle (full
blown host-wait) call in after each clear. That didn't fix it either, so I
don't think it's the clear.
It's always radeon_cp_cmdbuf, and it never, ever locks up when nrect < 2;
it usually happens with nrect >= 3 (I think I might have seen a lockup with
nrect == 2 but it's harder to provoke). emit_clip_rect() therefore seemed
like the safest place to put the wait. In the common case where nrect == 1 it gets
called once only for the whole buffer, and most of the time it should be a
I've discovered that my general work NDA covers me to look at the Radeon
docs we have, so I've been able to decode the RBBM_STATUS that gets
reported each time. That is 100% consistent and rechecking it now tells me:
o There is a request from the host interface on the backbone (bit 8)
o There is a request from the Command FIFO in the retry buffer (bit 13)
o The Command FIFO pipeline is busy (bit 14)
o The CP command stream is busy (bit 16)
o The 3D setup engine is busy (bit 20)
o Stuff[tm] is busy (bit 31)
That would seem to support the idea that it's not actually a 2D engine busy
problem :-/ Arrgh. I've just read the docs again, and writing the scissors
registers is listed as a 2D engine command. The setup in ISYNC_CNTL should
in theory take care of 2D/3D synchronisation anyway.
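
The decode above can be sketched as a small table-driven helper. Only the bit positions and their meanings come from the list; the names status_bits and decode_status are made up for this sketch and are not the identifiers used in the docs or the driver headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions as listed above; descriptions paraphrase that list. */
struct status_bit {
    unsigned bit;
    const char *what;
};

static const struct status_bit status_bits[] = {
    {  8, "request from the host interface on the backbone" },
    { 13, "request from the command FIFO in the retry buffer" },
    { 14, "command FIFO pipeline busy" },
    { 16, "CP command stream busy" },
    { 20, "3D setup engine busy" },
    { 31, "something is busy" },
};

/* Prints one line per listed bit that is set; returns how many were set. */
static int decode_status(uint32_t status, FILE *out)
{
    int nset = 0;
    size_t i;

    assert(out != NULL);

    for (i = 0; i < sizeof status_bits / sizeof status_bits[0]; i++) {
        if (status & (UINT32_C(1) << status_bits[i].bit)) {
            fprintf(out, "bit %2u: %s\n",
                    status_bits[i].bit, status_bits[i].what);
            nset++;
        }
    }
    return nset;
}
```

Calling decode_status(value, stderr) on each reported RBBM_STATUS would make it easy to spot a run where the set of busy bits differs from the usual six.
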
Maybe the correct place to insert the wait is just before the second and
subsequent calls to emit_clip_rect from emit_packet3_cliprect. Heh. If I'm
reading the docs correctly it doesn't actually matter *what* I wait for,
since the setup code has told the card to stall any WAIT_UNTILs as long as
either the 2D or 3D engines are busy, so it would seem there is actually no
difference between WAIT_2D_IDLE(), WAIT_3D_IDLE() and WAIT_UNTIL_IDLE().
That would tend to confuse my fault isolation procedures...
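
As a sketch of what "wait only before the second and subsequent clip rects" would look like, here is a stand-alone model of the loop. It does not touch the real ring macros: fake_ring, the TOK_* tokens and emit_for_cliprects are invented for illustration, and the assumption that the real loop emits one clip rect followed by one drawing packet per box is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in tokens for what would be ring packets in the driver. */
enum { TOK_WAIT = 1, TOK_CLIP = 2, TOK_DRAW = 3 };

/* Toy ring buffer that just records the order of emitted tokens. */
struct fake_ring {
    uint8_t tok[64];
    size_t n;
};

static void emit(struct fake_ring *r, uint8_t t)
{
    assert(r->n < sizeof r->tok);
    r->tok[r->n++] = t;
}

/*
 * Emits clip rect + draw for each box, with a wait inserted before
 * every clip rect except the first. Returns the number of waits.
 */
static size_t emit_for_cliprects(struct fake_ring *r, int nbox)
{
    size_t waits = 0;
    int i;

    for (i = 0; i < nbox; i++) {
        if (i > 0) {
            emit(r, TOK_WAIT);
            waits++;
        }
        emit(r, TOK_CLIP);
        emit(r, TOK_DRAW);
    }
    return waits;
}
```

With nbox == 1 no wait is emitted at all, so the common single-cliprect case would pay nothing; with nbox == 3 the stream becomes clip, draw, wait, clip, draw, wait, clip, draw.
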
I'll run some more tests when I get home. Hey, at least I'm learning
something, which is always fun :-)
Tim Smith (tim@...)
Rhodians are a delicacy in the Plasteen System