From: Michel <mi...@da...> - 2003-04-09 00:02:39
Attachments:
vblank.diff
|
I discovered a problem with LIBGL_THROTTLE_REFRESH: it always starts with 0 for vbl_seq, so if there have been more than 2^23 vertical blanks since the IRQ was installed (takes less than 39 hours at 60 Hz), DRM(vblank_wait) times out, and the Mesa driver aborts. As the texmem to trunk merge is imminent, I'm working on the texmem code to address this. How about this patch? It determines the current sequence number and only actually waits when necessary. As for the DRM side, I guess it doesn't make sense for DRM(vblank_wait) to allow waiting for longer than the timeout? :) Or is there a legitimate use for waiting for more than 3 seconds using vertical blanks? Looking forward to comments, -- Earthling Michel Dänzer \ Debian (powerpc), XFree86 and DRI developer Software libre enthusiast \ http://svcs.affero.net/rm.php?r=daenzer |
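The failure described in this message can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the attached patch: the 2^23 window is taken from the figure quoted above, and the names `VBL_WINDOW`, `vbl_target_reached`, `vbl_next_target` and `vbl_hours_until` are hypothetical helpers invented for illustration. It shows why a target sequence of 0 stops counting as "already reached" once the counter is more than 2^23 ahead, and why deriving the target from the current sequence avoids that.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the 2^23 figure in the message; not read from DRM source. */
#define VBL_WINDOW (1u << 23)

/* Hypothetical model of the kernel-side test: a target counts as reached
 * only if it lies at most VBL_WINDOW counts behind the current sequence.
 * Unsigned subtraction wraps modulo 2^32, so counter wraparound is handled. */
static int vbl_target_reached(uint32_t current, uint32_t target)
{
    return (uint32_t)(current - target) <= VBL_WINDOW;
}

/* Hypothetical client-side helper: derive the next target from the current
 * sequence instead of from a client counter that started at 0. */
static uint32_t vbl_next_target(uint32_t current, uint32_t interval)
{
    return (uint32_t)(current + interval); /* wraps modulo 2^32 by design */
}

/* Hours needed for `count` vertical blanks at the given refresh rate. */
static double vbl_hours_until(uint32_t count, double refresh_hz)
{
    assert(refresh_hz > 0.0);
    return (double)count / refresh_hz / 3600.0;
}
```

With these definitions, `vbl_target_reached(VBL_WINDOW + 1u, 0u)` is false, which corresponds to the wait that never completes, and `vbl_hours_until(VBL_WINDOW, 60.0)` comes to roughly 38.8 hours, in line with the "less than 39 hours" estimate.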
From: Ian R. <id...@us...> - 2003-04-09 18:53:07
|
Michel Dänzer wrote: > I discovered a problem with LIBGL_THROTTLE_REFRESH: it always starts > with 0 for vbl_seq, so if there have been more than 2^23 vertical blanks > since the IRQ was installed (takes less than 39 hours at 60 Hz), > DRM(vblank_wait) times out, and the Mesa driver aborts. > > As the texmem to trunk merge is imminent, I'm working on the texmem code > to address this. How about this patch? It determines the current > sequence number and only actually waits when necessary. > > As for the DRM side, I guess it doesn't make sense for DRM(vblank_wait) > to allow waiting for longer than the timeout? :) Or is there a > legitimate use for waiting for more than 3 seconds using vertical > blanks? We can't squish VBLANK_FLAG_INTERVAL and VBLANK_FLAG_THROTTLE together. If the GLX API version is less than 20030317, then gc->driContext.swap_interval (accessed on line 250 of your patched version) does not exist. Try using that client-side driver with a libGL.so from the trunk. :) By potentially eliminating the do_wait call any time curr_MSC > target_MSC, we also lose synchronization and can cause tearing. Perhaps the test should be 'if (original_seq != 0)'. Then we can only have a tear once every 2^32 frames. I have another question for you. Is there any way that the driver can get a signal from the DRM when the target MSC has occurred? It *sucks* to have to sleep to wait to swap buffers. There's quite a bit of rendering that could be queued in parallel with the wait. The other option is to change the way the swap ioctls work. Have the ioctls internally queue the swap. Then the client-side driver would have to poll the kernel to make sure the swap is done. Both routes seem to have their own pitfalls... |
From: Keith W. <ke...@tu...> - 2003-04-09 21:02:32
|
> The other option is to change the way the swap ioctls work. Have the > ioctls internally queue the swap. Then the client-side driver would > have to poll the kernel to make sure the swap is done. Both routes seem > to have their own pitfalls... This polling is already in place in the client drivers, one way or another. It might take some finessing, but it might be possible to implement this without clientside changes. Keith |
From: Michel <mi...@da...> - 2003-04-09 21:10:26
|
On Mit, 2003-04-09 at 20:51, Ian Romanick wrote: > Michel Dänzer wrote: > > I discovered a problem with LIBGL_THROTTLE_REFRESH: it always starts > > with 0 for vbl_seq, so if there have been more than 2^23 vertical blanks > > since the IRQ was installed (takes less than 39 hours at 60 Hz), > > DRM(vblank_wait) times out, and the Mesa driver aborts. > > > > As the texmem to trunk merge is imminent, I'm working on the texmem code > > to address this. How about this patch? It determines the current > > sequence number and only actually waits when necessary. > > > > As for the DRM side, I guess it doesn't make sense for DRM(vblank_wait) > > to allow waiting for longer than the timeout? :) Or is there a > > legitimate use for waiting for more than 3 seconds using vertical > > blanks? No opinion on this BTW? > We can't squish VBLANK_FLAG_INTERVAL and VBLANK_FLAG_THROTTLE together. > If the GLX API version is less than 20030317, then > gc->driContext.swap_interval (accessed on line 250 of your patched > version) does not exist. Try using that client-side driver with a > libGL.so from the trunk. :) I did, and while it didn't work as expected, it didn't crash and burn either, but I understand now that was just luck. :) > By potentially eliminating the do_wait call any time curr_MSC > > target_MSC, we also lose synchronization and can cause tearing. There's no difference. In the cases where this code doesn't wait, do_wait would return immediately in the old code. > Perhaps the test should be 'if (original_seq != 0)'. Then we can only > have a tear once every 2^32 frames. That was my first idea as well, but why resort to a hack when we can have a clean solution. > I have another question for you. Is there any way that the driver can > get a signal from the DRM when the target MSC has occurred? Yes, at least on Linux - I think it's not implemented on BSD yet, so there'd have to be a code path for sleeping as well. > It *sucks* to have to sleep to wait to swap buffers. 
There's quite a > bit of rendering that could be queued in parallel with the wait. How would you queue it though? You can't do it on the hardware, can you? Or maybe you can have the hardware wait for something that can be triggered on the swap? > The other option is to change the way the swap ioctls work. Have the > ioctls internally queue the swap. Then the client-side driver would > have to poll the kernel to make sure the swap is done. Maybe we could even add a flag to the vertical blank ioctl? Then you could have it swap buffers and then send you a signal on vertical blank. :) The question is whether the swap can be done from the interrupt handler. I guess at least writing the frame age would have to be moved to the bottom half. -- Earthling Michel Dänzer \ Debian (powerpc), XFree86 and DRI developer Software libre enthusiast \ http://svcs.affero.net/rm.php?r=daenzer |
From: Eric A. <et...@lc...> - 2003-04-09 21:41:37
|
On Wed, 2003-04-09 at 14:10, Michel Dänzer wrote: > On Mit, 2003-04-09 at 20:51, Ian Romanick wrote: > > > I have another question for you. Is there any way that the driver can > > get a signal from the DRM when the target MSC has occurred? > > Yes, at least on Linux - I think it's not implemented on BSD yet, so > there'd have to be a code path for sleeping as well. I wrote code for it for FreeBSD once, but I didn't have anything to test it with and was too lazy to write a test program, so I removed it until someone comes up with an app to use to test it with :) -- Eric Anholt et...@lc... http://people.freebsd.org/~anholt/ anholt@FreeBSD.org |
From: Ian R. <id...@us...> - 2003-04-09 22:12:41
|
Michel Dänzer wrote: > On Mit, 2003-04-09 at 20:51, Ian Romanick wrote: > >>Michel Dänzer wrote: >> >>>I discovered a problem with LIBGL_THROTTLE_REFRESH: it always starts >>>with 0 for vbl_seq, so if there have been more than 2^23 vertical blanks >>>since the IRQ was installed (takes less than 39 hours at 60 Hz), >>>DRM(vblank_wait) times out, and the Mesa driver aborts. >>> >>>As the texmem to trunk merge is imminent, I'm working on the texmem code >>>to address this. How about this patch? It determines the current >>>sequence number and only actually waits when necessary. >>> >>>As for the DRM side, I guess it doesn't make sense for DRM(vblank_wait) >>>to allow waiting for longer than the timeout? :) Or is there a >>>legitimate use for waiting for more than 3 seconds using vertical >>>blanks? > > No opinion on this BTW? Sorry. I forgot to reply to that in the first message. It seems reasonable to limit the amount of time it can block. >>By potentially eliminating the do_wait call any time curr_MSC > >>target_MSC, we also lose synchronization and can cause tearing. > > There's no difference. In the cases where this code doesn't wait, > do_wait would return immediately in the old code. > >>Perhaps the test should be 'if (original_seq != 0)'. Then we can only >>have a tear once every 2^32 frames. > > That was my first idea as well, but why resort to a hack when we can > have a clean solution. Okay. The present solution seems reasonable, then. >>I have another question for you. Is there any way that the driver can >>get a signal from the DRM when the target MSC has occurred? > > Yes, at least on Linux - I think it's not implemented on BSD yet, so > there'd have to be a code path for sleeping as well. 
I know that the DRM can generate the signal, but how do we make sure it's handled by the client-side driver and not by the application? After watching some of the discussions on devel@xfree86, we do have to be prepared for apps using some of these ioctls both through the driver and directly. >>It *sucks* to have to sleep to wait to swap buffers. There's quite a >>bit of rendering that could be queued in parallel with the wait. > > How would you queue it though? You can't do it on the hardware, can you? > Or maybe you can have the hardware wait for something that can be > triggered on the swap? The driver would just do everything that it could until it had to write something (new state, rendering commands, etc.) to the hardware. At that point it would have to block until the swap happened. If we had triple buffering, it wouldn't be as much of a problem. :) In that case the driver would only have to wait if there were still 2 swaps pending. A lot of apps do "simulation," then rendering. By having glXSwapBuffers return immediately, those apps could do their simulation while the rendering / swapping was finishing up. That could be a big win for vsync frame rates. Of course, it would do nothing for the non-vsync case. :) >>The other option is to change the way the swap ioctls work. Have the >>ioctls internally queue the swap. Then the client-side driver would >>have to poll the kernel to make sure the swap is done. > > Maybe we could even add a flag to the vertical blank ioctl? Then you > could have it swap buffers and then send you a signal on vertical blank. > :) The question is whether the swap can be done from the interrupt > handler. I guess at least writing the frame age would have to be moved > to the bottom half. To get the right synchronization writing to the ring, we'd have to do all of it in the BH, wouldn't we? It could get tricky. |
From: Michel <mi...@da...> - 2003-04-10 00:55:12
|
On Don, 2003-04-10 at 00:12, Ian Romanick wrote: > Michel Dänzer wrote: > > On Mit, 2003-04-09 at 20:51, Ian Romanick wrote: > > > >>Michel Dänzer wrote: > >> > >>>I discovered a problem with LIBGL_THROTTLE_REFRESH: it always starts > >>>with 0 for vbl_seq, so if there have been more than 2^23 vertical blanks > >>>since the IRQ was installed (takes less than 39 hours at 60 Hz), > >>>DRM(vblank_wait) times out, and the Mesa driver aborts. > >>> > >>>As the texmem to trunk merge is imminent, I'm working on the texmem code > >>>to address this. How about this patch? It determines the current > >>>sequence number and only actually waits when necessary. > >>> > >>>As for the DRM side, I guess it doesn't make sense for DRM(vblank_wait) > >>>to allow waiting for longer than the timeout? :) Or is there a > >>>legitimate use for waiting for more than 3 seconds using vertical > >>>blanks? > > > > No opinion on this BTW? > > Sorry. I forgot to reply to that in the first message. It seems > reasonable to limit the amount of time it can block. Okay, thanks. > >>By potentially eliminating the do_wait call any time curr_MSC > > >>target_MSC, we also lose synchronization and can cause tearing. > > > > There's no difference. In the cases where this code doesn't wait, > > do_wait would return immediately in the old code. > > > >>Perhaps the test should be 'if (original_seq != 0)'. Then we can only > >>have a tear once every 2^32 frames. > > > > That was my first idea as well, but why resort to a hack when we can > > have a clean solution. > > Okay. The present solution seems reasonable, then. I've been thinking a bit more about the very first swap though: right now, it will basically occur immediately, unless it's attempted before the very first vertical blank after the IRQ is enabled, which is very unlikely. :) Would it be better to wait for a vertical blank instead? > >>I have another question for you. 
Is there any way that the driver can > >>get a signal from the DRM when the target MSC has occurred? > > > > Yes, at least on Linux - I think it's not implemented on BSD yet, so > > there'd have to be a code path for sleeping as well. > > I know that the DRM can generate the signal, but how do we make sure > it's handled by the client-side driver and not by the application? The driver can choose whichever signal it wants, and if I'm not mistaken, it can check for an existing signal handler, replace it with its own and restore it afterwards. > >>It *sucks* to have to sleep to wait to swap buffers. There's quite a > >>bit of rendering that could be queued in parallel with the wait. > > > > How would you queue it though? You can't do it on the hardware, can you? > > Or maybe you can have the hardware wait for something that can be > > triggered on the swap? > > The driver would just do everything that it could until it had to write > something (new state, rendering commands, etc.) to the hardware. At > that point it would have to block until the swap happened. If we had > triple buffering, it wouldn't be as much of a problem. :) In that case > the driver would only have to wait if there were still 2 swaps pending. > > A lot of apps do "simulation," then rendering. By having glXSwapBuffers > return immediately, those apps could do their simulation while the > rendering / swapping was finishing up. That could be a big win for > vsync frame rates. Ah yes, that makes sense. > >>The other option is to change the way the swap ioctls work. Have the > >>ioctls internally queue the swap. Then the client-side driver would > >>have to poll the kernel to make sure the swap is done. > > > > Maybe we could even add a flag to the vertical blank ioctl? Then you > > could have it swap buffers and then send you a signal on vertical blank. > > :) The question is whether the swap can be done from the interrupt > > handler. 
I guess at least writing the frame age would have to be moved > > to the bottom half. > > To get the right synchronization writing to the ring, we'd have to do > all of it in the BH, wouldn't we? It could get tricky. Right, I was only thinking of the page flipping case, which might be doable without the ring. Without page flipping, I wonder if using the ring from the bottom half would be any win at all over letting the client handle it? -- Earthling Michel Dänzer \ Debian (powerpc), XFree86 and DRI developer Software libre enthusiast \ http://svcs.affero.net/rm.php?r=daenzer |
From: Ian R. <id...@us...> - 2003-04-10 01:32:16
|
Michel Dänzer wrote: > On Don, 2003-04-10 at 00:12, Ian Romanick wrote: > >>I know that the DRM can generate the signal, but how do we make sure >>it's handled by the client-side driver and not by the application? > > The driver can choose whichever signal it wants, and if I'm not > mistaken, it can check for an existing signal handler, replace it with > its own and restore it afterwards. It's the classic problem of libraries using signals. The library installs its signal handler, then the application, not knowing that the library is using a particular signal, installs its handler. The library never gets its signal, but the application gets an extra one. [snip] > Right, I was only thinking of the page flipping case, which might be > doable without the ring. Without page flipping, I wonder if using the > ring from the bottom half would be any win at all over letting the > client handle it? The advantage of doing it in the kernel isn't performance. The advantage is not having to deal with the signal problems. Instead of having the signal madness, we'd have a mutex of some sort associated with the back-buffer. The kernel would lock the mutex on entry to the ioctl, and it would unlock the mutex when the swap completes. The client-side driver would only need to attempt to acquire the mutex when it needs to change hardware state or submit rendering commands. If the swap isn't done, the process will sleep. |
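The handler-clobbering problem described in this message can be reproduced with plain POSIX `sigaction`. The following is a minimal sketch, not driver code: the choice of `SIGUSR1` is arbitrary, and `lib_handler`, `app_handler`, `install_handler` and `demo_handler_clobbered` are hypothetical names standing in for a library and an application that both pick the same signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t lib_hits = 0;
static volatile sig_atomic_t app_hits = 0;

/* Stand-ins for the library's and the application's handlers. */
static void lib_handler(int signo) { (void)signo; lib_hits++; }
static void app_handler(int signo) { (void)signo; app_hits++; }

/* Install `fn` for `signo`, saving the previously installed action in `old`. */
static int install_handler(int signo, void (*fn)(int), struct sigaction *old)
{
    struct sigaction sa;

    assert(fn != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fn;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, old);
}

/* Returns 1 if only the application's handler ran, -1 on setup failure. */
static int demo_handler_clobbered(void)
{
    struct sigaction before_lib, before_app;

    lib_hits = 0;
    app_hits = 0;

    /* The library installs its handler first... */
    if (install_handler(SIGUSR1, lib_handler, &before_lib) != 0)
        return -1;
    /* ...then the application, unaware of that, installs its own. */
    if (install_handler(SIGUSR1, app_handler, &before_app) != 0)
        return -1;

    raise(SIGUSR1); /* delivered before raise() returns in a single thread */

    /* Put back whatever was installed before the demo started. */
    sigaction(SIGUSR1, &before_lib, NULL);

    return lib_hits == 0 && app_hits == 1;
}
```

Saving and restoring the old action, as suggested earlier in the thread, only helps while the library controls the ordering; once the application installs its handler afterwards, the library has no hook to notice.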
From: Michel <mi...@da...> - 2003-04-11 13:13:56
Attachments:
vblank.diff
|
On Don, 2003-04-10 at 03:32, Ian Romanick wrote: > Michel Dänzer wrote: > > On Don, 2003-04-10 at 00:12, Ian Romanick wrote: > > > >>I know that the DRM can generate the signal, but how do we make sure > >>it's handled by the client-side driver and not by the application? > > > > The driver can choose whichever signal it wants, and if I'm not > > mistaken, it can check for an existing signal handler, replace it with > > its own and restore it afterwards. > > It's the classic problem of libraries using signals. The library > installs its signal handler, then the application, not knowing that the > library is using a particular signal, installs its handler. The library > never gets its signal, but the application gets an extra one. I see, so the only possibility would be a signal that no application will ever use. No idea if that's feasible. Here's an updated patch for the original problem. The only remaining question I have is whether the early return 0 for the VBLANK_FLAG_SYNC is correct. -- Earthling Michel Dänzer \ Debian (powerpc), XFree86 and DRI developer Software libre enthusiast \ http://svcs.affero.net/rm.php?r=daenzer |
From: Ian R. <id...@us...> - 2003-04-11 14:43:32
|
Michel Dänzer wrote: > On Don, 2003-04-10 at 03:32, Ian Romanick wrote: > >>Michel Dänzer wrote: >> >>>On Don, 2003-04-10 at 00:12, Ian Romanick wrote: >>> >>>>I know that the DRM can generate the signal, but how do we make sure >>>>it's handled by the client-side driver and not by the application? >>> >>>The driver can choose whichever signal it wants, and if I'm not >>>mistaken, it can check for an existing signal handler, replace it with >>>its own and restore it afterwards. >> >>It's the classic problem of libraries using signals. The library >>installs its signal handler, then the application, not knowing that the >>library is using a particular signal, installs its handler. The library >>never gets its signal, but the application gets an extra one. > > I see, so the only possibility would be a signal that no application > will ever use. No idea if that's feasible. IMO, that's one area where the POSIX.4 team missed something. It would have been really nice to have functions like alloc_rt_signal and release_rt_signal. That would go a long way to preventing problems like this. Of course, going that route would have had its own problems. > Here's an updated patch for the original problem. The only remaining > question I have is whether the early return 0 for the VBLANK_FLAG_SYNC > is correct. It depends on the semantic we want. Does VBLANK_FLAG_SYNC mean "always wait exactly 1 refresh" or does it mean "always wait at least 1 refresh"? If it means the former, then the early-return is okay. The original code assumes that it means the latter. This allows an application to set a higher swap-interval and have it be respected when VBLANK_FLAG_SYNC is set. I came up with a slightly different patch that reverts to the trunk's default swap-interval and keeps the existing behavior w/VBLANK_FLAG_SYNC. The first wait becomes a relative wait for either 0 or 1 frames depending on VBLANK_FLAG_SYNC. 
The second wait defaults to the application's preference if VBLANK_FLAG_INTERVAL is set. Opinions? |
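The two readings of VBLANK_FLAG_SYNC discussed in this message differ only in whether a larger swap interval is honoured. A small sketch of that distinction follows; it is an illustration under assumptions, not code from either patch, and `enum sync_semantic` and `frames_to_wait` are hypothetical names.

```c
#include <assert.h>

/* The two candidate meanings of the sync flag discussed in the thread. */
enum sync_semantic {
    SYNC_EXACTLY_ONE,   /* "always wait exactly 1 refresh"  */
    SYNC_AT_LEAST_ONE   /* "always wait at least 1 refresh" */
};

/* Number of refreshes to wait for a given application swap interval. */
static unsigned frames_to_wait(unsigned swap_interval, enum sync_semantic sem)
{
    assert(sem == SYNC_EXACTLY_ONE || sem == SYNC_AT_LEAST_ONE);

    if (sem == SYNC_EXACTLY_ONE)
        return 1u; /* interval ignored, so an early return is harmless */

    /* Under "at least one", a higher interval set by the app is respected. */
    return swap_interval > 1u ? swap_interval : 1u;
}
```

Under the first reading an interval of 3 still yields a single refresh; under the second it yields 3, which is the behavior the message says the original code assumes.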
From: Ian R. <id...@us...> - 2003-04-11 14:45:15
Attachments:
vblank-idr.patch
|
Ian Romanick wrote: > I came up with a slightly different patch that reverts to the trunk's > default swap-interval and keeps the existing behavior > w/VBLANK_FLAG_SYNC. The first wait becomes a relative wait for either 0 > or 1 frames depending on VBLANK_FLAG_SYNC. The second wait defaults to > the application's preference if VBLANK_FLAG_INTERVAL is set. > > Opinions? I guess it would help if I actually attached the patch. Duh. |
From: Michel <mi...@da...> - 2003-04-11 15:09:22
|
On Fre, 2003-04-11 at 16:44, Ian Romanick wrote: > Ian Romanick wrote: > > > I came up with a slightly different patch that reverts to the trunk's > > default swap-interval and keeps the existing behavior > > w/VBLANK_FLAG_SYNC. The first wait becomes a relative wait for either 0 > > or 1 frames depending on VBLANK_FLAG_SYNC. The second wait defaults to > > the application's preference if VBLANK_FLAG_INTERVAL is set. > > > > Opinions? > > I guess it would help if I actually attached the patch. Duh. Indeed. :) Looks like a very elegant solution to me. PS: I still think we should conform to the specs rather than the FPS addicts by default. ;) -- Earthling Michel Dänzer \ Debian (powerpc), XFree86 and DRI developer Software libre enthusiast \ http://svcs.affero.net/rm.php?r=daenzer |