Thread: [r300] VB lockup found and fixed

Hi everybody,

As reported earlier, I had a perfectly repeatable lockup in VB mode that
always happened after the exact same number of frames in glxgears. I can't
explain everything about the lockup, mostly because I still don't know what
the two registers in the begin3d/end3d sequence actually mean, but here's
what I know:

It turns out that after the first 4 DMA buffers had been used to completion,
r300FlushCmdBuf() was called from r300RefillCurrentDmaRegion(). This only
produced simple state-setting commands as well as an upload of the current
vertex program into the VAP. There was no rendering going on, and neither
the begin3d nor the end3d sequence was part of the commands that were sent
to the card.
However, for some reason it was this command sequence that caused the lockup.

This leads me to believe that there's somehow more "magic" to the
begin3d/end3d sequence than just cache control as I originally assumed (or
maybe it *is* cache control, but there's something weird going on in
connection with it, I simply don't know).

In any case, what I did is *always* emit the begin3d sequence at the top of
r300_do_cp_cmdbuf and end3d at the bottom of r300_do_cp_cmdbuf (it is also
emitted in the case of an error). This works for me: I can run glxgears for
several minutes without any problems, even while doing things that sometimes
tend to produce lockups.
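
To illustrate the shape of that change, here is a minimal standalone sketch.
It is not the actual r300_do_cp_cmdbuf code: every name in it
(do_cp_cmdbuf_sketch, process_commands, emit_begin3d, emit_end3d, the event
log) is a hypothetical stand-in, and the "emit" functions only record an
event instead of writing anything to the ring. The point is just the single
exit path, so that end3d follows begin3d on success and on error alike.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event log standing in for writes to the command ring. */
enum event { EV_BEGIN3D, EV_CMD, EV_END3D };
enum { MAX_LOG = 16 };
static enum event emit_log[MAX_LOG];
static size_t emit_count;

static void log_emit(enum event ev)
{
    assert(emit_count < MAX_LOG);
    emit_log[emit_count++] = ev;
}

/* Stand-ins: the real sequences write registers whose meaning is unknown. */
static void emit_begin3d(void) { log_emit(EV_BEGIN3D); }
static void emit_end3d(void)   { log_emit(EV_END3D); }

/* Hypothetical command loop: a negative value models an invalid command. */
static int process_commands(const int *cmds, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cmds[i] < 0)
            return -1;          /* stop at the first bad command */
        log_emit(EV_CMD);
    }
    return 0;
}

/* Always bracket the buffer, whether or not it contains any rendering. */
int do_cp_cmdbuf_sketch(const int *cmds, size_t n)
{
    int ret;

    emit_begin3d();
    ret = process_commands(cmds, n);
    emit_end3d();               /* reached on success and on error */
    return ret;
}
```

Keeping one exit path means an early error return cannot skip end3d by
accident.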

Please, everybody, get the latest CVS (anonymous will take some time to
catch up...) and test vertex buffer mode with it (go to r300_run_render()
in r300_render.c and change the #if so that r300_vb_run_render() is
called). I want to be really sure that this fixes it for other people as
well (after all, there may be other causes for lockups that haven't occurred
on my machine yet), and that there are no regressions for those who already
had working VB mode.
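
For anyone unsure what kind of change is meant, the switch is a compile-time
one. The sketch below only shows the general pattern and is not the contents
of r300_render.c: the macro USE_VB_PATH_SKETCH, the *_sketch function names
and the render_path enum are all made up for illustration, and the real
functions take arguments and do actual work.

```c
/* Hypothetical compile-time switch; set to 0 to select the other path. */
#ifndef USE_VB_PATH_SKETCH
#define USE_VB_PATH_SKETCH 1
#endif

enum render_path { PATH_NONE, PATH_IMMEDIATE, PATH_VB };

/* Records which path ran, purely so the sketch has observable behaviour. */
enum render_path last_path = PATH_NONE;

/* Stand-in for the non-VB rendering path. */
void immediate_run_render_sketch(void) { last_path = PATH_IMMEDIATE; }

/* Stand-in for r300_vb_run_render(). */
void vb_run_render_sketch(void) { last_path = PATH_VB; }

/* Stand-in for r300_run_render(): the #if picks one path at build time. */
void run_render_sketch(void)
{
#if USE_VB_PATH_SKETCH
    vb_run_render_sketch();
#else
    immediate_run_render_sketch();
#endif
}
```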

Once we can be fairly certain that VB mode is stable (i.e. crash and
lockup-free), let's talk about removing any mention of the begin3d and
end3d sequence from the userspace driver. This is really far too subtle an
issue to allow userspace to mess with it. This goes for the X server as
well - if anybody feels like implementing Render acceleration, which I
doubt at this stage, please leave the begin3d/end3d handling to the kernel
module, as it's the only component that really knows what's going on.

cu,
Nicolai

dri-devel