From: Leif D. <lde...@re...> - 2002-07-04 22:34:18
|
After merging the trunk into the mach64 branch, I get a segfault in the Xserver in DRIClipNotify (dereferences a null pointer when trying to call a wrapper function, I think) if direct rendering is disabled after DRIFinishScreenInit is called, e.g. if the init of the drm kernel module fails. I tested this with Rage128 by making the cce_init return non-zero and I get the same thing. Was there a recent change in libdri.a that would be causing this? -- Leif Delgass http://www.retinalburn.net |
From: Jens O. <je...@tu...> - 2002-07-05 16:47:23
|
Leif Delgass wrote:
> After merging the trunk into the mach64 branch, I get a segfault in the
> Xserver in DRIClipNotify (dereferences a null pointer when trying to call
> a wrapper function, I think) if direct rendering is disabled after
> DRIFinishScreenInit is called, e.g. if the init of the drm kernel module
> fails. I tested this with Rage128 by making the cce_init return non-zero
> and I get the same thing. Was there a recent change in libdri.a that
> would be causing this?

Leif,

I spent a little time this morning looking at this. There have been a few minor changes to dri.c in the last month. It's possible that one of these is biting you:

----------------------------
revision 1.40
date: 2002/06/12 15:50:27; author: keithw; state: Exp; lines: +16 -0
merged tcl-0-0-branch
----------------------------
revision 1.39
date: 2002/06/02 16:00:44; author: mdaenzer; state: Exp; lines: +1 -1
fixes for big endian in general and powerpc in particular
----------------------------
revision 1.38
date: 2002/05/28 13:45:11; author: jensowen; state: Exp; lines: +4 -0
bump clipstamp before destroying drawable
----------------------------

However, I think you may be tickling a latent bug in the DRI. It's possible that all the other drivers have just avoided this bug so far.

I looked at DRICloseScreen and I don't see that the DRIClipNotify wrapper is being removed. There are other unwraps missing as well.

Can you send me a back trace from a static debuggable server? Let me know if you need help building this.

-- /\ Jens Owen / \/\ _ je...@tu... / \ \ \ Steamboat Springs, Colorado |
From: Leif D. <lde...@re...> - 2002-07-05 18:28:18
|
On Fri, 5 Jul 2002, Jens Owen wrote:

[snip]

> However, I think you may be tickling a latent bug in the DRI. It's
> possible that all the other drivers have just avoided this bug so far.
>
> I looked at DRICloseScreen and I don't see that the DRIClipNotify
> wrapper is being removed. There are other unwraps missing as well.
>
> Can you send me a back trace from a static debuggable server? Let me
> know if you need help building this.

Could you tell me how to build a static server or point me to a HOWTO?

Meanwhile, here's a backtrace from the X server built from the branch. It looks like the ClipNotify wrapper is being called when pDRIPriv is null, though I'm not sure why I wouldn't have run into this before...

Program received signal SIGSEGV, Segmentation fault.
DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
1732        if(pDRIPriv->wrap.ClipNotify) {
(gdb) bt
#0  DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
#1  0x080c9009 in MapWindow (pWin=0x85d3a60, client=0x81d56a8) at window.c:2864
#2  0x080c5ee8 in InitRootWindow (pWin=0x85d3a60) at window.c:522
#3  0x080bf39c in main (argc=4, argv=0xbffff9d4, envp=0xbffff9e8) at main.c:439
#4  0x40072647 in __libc_start_main (main=0x80bee9c <main>, argc=4,
    ubp_av=0xbffff9d4, init=0x806cc08 <_init>, fini=0x8174c80 <_fini>,
    rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff9cc)
    at ../sysdeps/generic/libc-start.c:129
(gdb) info locals
pWin = 0x85d3a60
pScreen = 0x85d3748
pDRIPriv = 0x0
pDRIDrawablePriv = 0x0

--
Leif Delgass
http://www.retinalburn.net |
From: Jens O. <je...@tu...> - 2002-07-05 18:53:54
Attachments:
host.def
|
Leif Delgass wrote:
> On Fri, 5 Jul 2002, Jens Owen wrote:
>
> [snip]
>
>>However, I think you may be tickling a latent bug in the DRI. It's
>>possible that all the other drivers have just avoided this bug so far.
>>
>>I looked at DRICloseScreen and I don't see that the DRIClipNotify
>>wrapper is being removed. There are other unwraps missing as well.
>>
>>Can you send me a back trace from a static debuggable server? Let me
>>know if you need help building this.
>>
>
> Could you tell me how to build a static server or point me to a HOWTO?

The xc/config/cf/host.def in the DRI tree is set up to be easily modified to build a debuggable server. Attached is a copy of a modified host.def file I used for debugging an i810 problem. You'll probably need to add the mach64 driver to these options.

> Meanwhile, here's a backtrace from the X server built from the branch. It
> looks like the ClipNotify wrapper is being called when pDRIPriv is null,
> though I'm not sure why I wouldn't have run into this before...
>
> Program received signal SIGSEGV, Segmentation fault.
> DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
> 1732        if(pDRIPriv->wrap.ClipNotify) {
> (gdb) bt
> #0  DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
> #1  0x080c9009 in MapWindow (pWin=0x85d3a60, client=0x81d56a8) at window.c:2864
> #2  0x080c5ee8 in InitRootWindow (pWin=0x85d3a60) at window.c:522
> #3  0x080bf39c in main (argc=4, argv=0xbffff9d4, envp=0xbffff9e8) at main.c:439
> #4  0x40072647 in __libc_start_main (main=0x80bee9c <main>, argc=4,
>     ubp_av=0xbffff9d4, init=0x806cc08 <_init>, fini=0x8174c80 <_fini>,
>     rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff9cc)
>     at ../sysdeps/generic/libc-start.c:129
> (gdb) info locals
> pWin = 0x85d3a60
> pScreen = 0x85d3748
> pDRIPriv = 0x0
> pDRIDrawablePriv = 0x0

Yes, it looks like the DRI initialization process was started, causing the DRI wrappers to be put in place; then, something caused DRI initialization to fail, but the failure handling code does not remove the wrappers.

I believe I need to unwrap the DRI routines in DRICloseScreen. I'd like to fix this case and ask you to test with what you've got, since it's hard to test these unusual failure cases when everything's working properly.

It's still curious no other drivers have had this problem. Either nobody else has gone down these failure cases, or I'm barking up the wrong tree.

Can you verify that we are indeed calling DRICloseScreen by putting a breakpoint at that routine and sending me a backtrace at that point?

Thanks,
Jens

-- /\ Jens Owen / \/\ _ je...@tu... / \ \ \ Steamboat Springs, Colorado |
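The failure mode described in this message can be sketched in plain C. This is a minimal illustration only, not the code in dri.c: the types DemoScreen and DemoDRIPriv and the functions demo_wrap and demo_close are hypothetical stand-ins for the server's screen record, the DRI private record, and the init/close paths. It shows a hook being wrapped during init and restored during close, which is the step that appears to be missing for ClipNotify.

```c
#include <stddef.h>

typedef struct DemoScreen DemoScreen;
typedef void (*ClipNotifyFn)(DemoScreen *scr, int dx, int dy);

/* Hypothetical stand-in for the DRI private record and its saved hooks. */
typedef struct {
    struct {
        ClipNotifyFn ClipNotify;  /* original hook, saved while wrapped */
    } wrap;
} DemoDRIPriv;

/* Hypothetical stand-in for the server's per-screen record. */
struct DemoScreen {
    ClipNotifyFn ClipNotify;      /* hook the server calls */
    DemoDRIPriv *driPriv;         /* NULL once DRI has been torn down */
    int calls;                    /* counts calls reaching the base hook */
};

void base_clip_notify(DemoScreen *scr, int dx, int dy)
{
    (void)dx; (void)dy;
    scr->calls++;
}

/* The wrapper: if it is still installed after driPriv has been cleared,
 * the dereference below is the kind of NULL access seen in the backtrace. */
void dri_clip_notify(DemoScreen *scr, int dx, int dy)
{
    DemoDRIPriv *priv = scr->driPriv;
    if (priv->wrap.ClipNotify)
        priv->wrap.ClipNotify(scr, dx, dy);
}

/* Init path: save the original hook and install the wrapper. */
void demo_wrap(DemoScreen *scr, DemoDRIPriv *priv)
{
    priv->wrap.ClipNotify = scr->ClipNotify;
    scr->ClipNotify = dri_clip_notify;
    scr->driPriv = priv;
}

/* Close path: restore the original hook before dropping the private. */
void demo_close(DemoScreen *scr)
{
    DemoDRIPriv *priv = scr->driPriv;
    if (priv == NULL)
        return;
    if (priv->wrap.ClipNotify) {
        scr->ClipNotify = priv->wrap.ClipNotify;
        priv->wrap.ClipNotify = NULL;
    }
    scr->driPriv = NULL;
}
```

If demo_close cleared driPriv without restoring scr->ClipNotify, the next call through the hook would land in dri_clip_notify with a NULL private, which matches the symptom reported in the thread.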
From: Leif D. <lde...@re...> - 2002-07-05 19:16:16
|
On Fri, 5 Jul 2002, Jens Owen wrote: > Leif Delgass wrote: > > > On Fri, 5 Jul 2002, Jens Owen wrote: > > > > [snip] > > > > > >>However, I think you may be tickling a latent bug in the DRI. It's > >>possible that all the other drives have just avoided this bug so far. > >> > >>I looked at DRICloseScreen and I don't see that the DRIClipNotify > >>wrapper is being removed. There are other unwraps missing as well. > >> > >>Can you send me a back trace from a static debuggable server? Let me > >>know if you need help building this. > >> > > > > Could you tell me how to build a static server or point me to a HOWTO? > > > The xc/config/cf/host.def in the DRI tree is setup to easily modified to > build a debuggable server. Attached is a copy of a modified host.def > file I used for debugging an i810 problem. You'll probably need to add > the mach64 driver to these options. OK, I'll try this. I think you're right that we need to add the GlxBuiltIn.. option for mach64. > > Meanwhile, here's a backtrace from the X server built from the branch. It > > looks like the ClipNotify wrapper is being called when pDRIPriv is null, > > though I'm not sure why I wouldn't have run into this before... > > > > Program received signal SIGSEGV, Segmentation fault. 
> > DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732 > > 1732 if(pDRIPriv->wrap.ClipNotify) { > > (gdb) bt > > #0 DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732 > > #1 0x080c9009 in MapWindow (pWin=0x85d3a60, client=0x81d56a8) at window.c:2864 > > #2 0x080c5ee8 in InitRootWindow (pWin=0x85d3a60) at window.c:522 > > #3 0x080bf39c in main (argc=4, argv=0xbffff9d4, envp=0xbffff9e8) at main.c:439 > > #4 0x40072647 in __libc_start_main (main=0x80bee9c <main>, argc=4, > > ubp_av=0xbffff9d4, init=0x806cc08 <_init>, fini=0x8174c80 <_fini>, > > rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff9cc) > > at ../sysdeps/generic/libc-start.c:129 > > (gdb) info locals > > pWin = 0x85d3a60 > > pScreen = 0x85d3748 > > pDRIPriv = 0x0 > > pDRIDrawablePriv = 0x0 > > > Yes, it looks like the DRI initialization process was started, causing > the DRI wrappers to be put in place; then, something caused DRI > initialization to fail, but the failure handling code does not remove > the wrappers. > > I believe I need to unwrap the DRI routines in DRICloseScreen. I'd like > to fix this case and ask you to test with what you've got since it's > hard to test these unusual failure cases when everythings working properly. > > It's still curious no other drivers have had this problem. Either > nobody else has gone done these failure cases, or I'm barking up the > wrong tree. It's pretty easy to test if you just change the return value of the driver's drm init function to return non-zero. For example, I tried this in the r128 driver in r128_do_init_cce (changed the last line to return -1), and it suffers the same problem (the backtrace was the same). > Can you verify that we are indeed calling DRICloseScreen by putting a > breakpoint at that routine and sending me a backtrace at that point? I know it's called because I see the messages in the X log about removing the signal handler, kernel context, SAREA, etc. 
It's called as part of the DRI driver specific CloseScreen (ATIDRICloseScreen) when the kernel init fails (which is after DRIFinishScreenInit is called). In fact, the entire X init seems to work without a hitch (I see all the normal messages in the X log after "Direct rendering disabled" up to XINPUT) until the root window is initialized. -- Leif Delgass http://www.retinalburn.net |
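The testing approach described above (making the init routine return non-zero so the teardown path runs) can be sketched as a small fault-injection switch. This is a user-space illustration under stated assumptions: demo_force_init_failure, demo_kernel_init, demo_close_screen and demo_screen_init are invented names, and the real test in the thread was done by editing the return statement of r128_do_init_cce by hand.

```c
#include <stdio.h>

/* Hypothetical test switch: set to non-zero to simulate a failed init. */
int demo_force_init_failure = 0;

/* Counts how often the teardown path ran, so a test can observe it. */
int demo_close_calls = 0;

/* Hypothetical stand-in for the driver's kernel init call. */
int demo_kernel_init(void)
{
    if (demo_force_init_failure)
        return -1;              /* injected failure */
    return 0;
}

/* Hypothetical stand-in for the driver-specific close routine. */
void demo_close_screen(void)
{
    demo_close_calls++;
}

/* Returns 1 if direct rendering stays enabled, 0 if it was disabled. */
int demo_screen_init(void)
{
    if (demo_kernel_init() != 0) {
        fprintf(stderr, "Direct rendering disabled\n");
        demo_close_screen();    /* the rarely exercised failure path */
        return 0;
    }
    return 1;
}
```

A switch like this keeps the failure path reachable without editing the init routine each time, though whether that is worth carrying in a real driver is a judgment call.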
From: Jens O. <je...@tu...> - 2002-07-05 22:39:01
Attachments:
dri_unwrap.patch
|
Leif Delgass wrote:
> On Fri, 5 Jul 2002, Jens Owen wrote:
>
>>Leif Delgass wrote:
>>
>>>On Fri, 5 Jul 2002, Jens Owen wrote:
>>>
>>>[snip]
>>>
>>>>However, I think you may be tickling a latent bug in the DRI. It's
>>>>possible that all the other drivers have just avoided this bug so far.
>>>>
>>>>I looked at DRICloseScreen and I don't see that the DRIClipNotify
>>>>wrapper is being removed. There are other unwraps missing as well.
>>>>
>>>>Can you send me a back trace from a static debuggable server? Let me
>>>>know if you need help building this.
>>>>
>>>Could you tell me how to build a static server or point me to a HOWTO?
>>>
>>
>>The xc/config/cf/host.def in the DRI tree is set up to be easily modified to
>>build a debuggable server. Attached is a copy of a modified host.def
>>file I used for debugging an i810 problem. You'll probably need to add
>>the mach64 driver to these options.
>>
>
> OK, I'll try this. I think you're right that we need to add the
> GlxBuiltIn.. option for mach64.

If my memory serves me, that's just for 3D clients, and it doesn't work anymore...so I wouldn't worry about that option. However, you will want to add mach64 to the other driver lists in this file.

>>>Meanwhile, here's a backtrace from the X server built from the branch. It
>>>looks like the ClipNotify wrapper is being called when pDRIPriv is null,
>>>though I'm not sure why I wouldn't have run into this before...
>>>
>>>Program received signal SIGSEGV, Segmentation fault.
>>>DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
>>>1732        if(pDRIPriv->wrap.ClipNotify) {
>>>(gdb) bt
>>>#0  DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
>>>#1  0x080c9009 in MapWindow (pWin=0x85d3a60, client=0x81d56a8) at window.c:2864
>>>#2  0x080c5ee8 in InitRootWindow (pWin=0x85d3a60) at window.c:522
>>>#3  0x080bf39c in main (argc=4, argv=0xbffff9d4, envp=0xbffff9e8) at main.c:439
>>>#4  0x40072647 in __libc_start_main (main=0x80bee9c <main>, argc=4,
>>>    ubp_av=0xbffff9d4, init=0x806cc08 <_init>, fini=0x8174c80 <_fini>,
>>>    rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff9cc)
>>>    at ../sysdeps/generic/libc-start.c:129
>>>(gdb) info locals
>>>pWin = 0x85d3a60
>>>pScreen = 0x85d3748
>>>pDRIPriv = 0x0
>>>pDRIDrawablePriv = 0x0
>>>
>>
>>Yes, it looks like the DRI initialization process was started, causing
>>the DRI wrappers to be put in place; then, something caused DRI
>>initialization to fail, but the failure handling code does not remove
>>the wrappers.
>>
>>I believe I need to unwrap the DRI routines in DRICloseScreen. I'd like
>>to fix this case and ask you to test with what you've got, since it's
>>hard to test these unusual failure cases when everything's working properly.
>>
>>It's still curious no other drivers have had this problem. Either
>>nobody else has gone down these failure cases, or I'm barking up the
>>wrong tree.
>>
>
> It's pretty easy to test if you just change the return value of the
> driver's drm init function to return non-zero. For example, I tried this
> in the r128 driver in r128_do_init_cce (changed the last line to return
> -1), and it suffers the same problem (the backtrace was the same).

Yes, it's easy to force specific failures; but I don't think developers and users have been hitting these cases in normal testing scenarios. Otherwise, we'd have caught this during the 3 years this extension's been in use.

>>Can you verify that we are indeed calling DRICloseScreen by putting a
>>breakpoint at that routine and sending me a backtrace at that point?
>>
>
> I know it's called because I see the messages in the X log about removing
> the signal handler, kernel context, SAREA, etc. It's called as part of
> the DRI driver specific CloseScreen (ATIDRICloseScreen) when the kernel
> init fails (which is after DRIFinishScreenInit is called). In fact, the
> entire X init seems to work without a hitch (I see all the normal messages
> in the X log after "Direct rendering disabled" up to XINPUT) until the
> root window is initialized.

Okay, try the attached patch. I think I'll do more than this, but it would be great if you could test just this, first.

-- /\ Jens Owen / \/\ _ je...@tu... / \ \ \ Steamboat Springs, Colorado |
From: Leif D. <lde...@re...> - 2002-07-05 23:07:25
|
On Fri, 5 Jul 2002, Jens Owen wrote:

[...]

> >>The xc/config/cf/host.def in the DRI tree is set up to be easily modified to
> >>build a debuggable server. Attached is a copy of a modified host.def
> >>file I used for debugging an i810 problem. You'll probably need to add
> >>the mach64 driver to these options.
> >>
> >
> > OK, I'll try this. I think you're right that we need to add the
> > GlxBuiltIn.. option for mach64.
>
> If my memory serves me, that's just for 3D clients, and it doesn't work
> anymore...so I wouldn't worry about that option. However, you will want
> to add mach64 to the other driver lists in this file.

You're right, it's for building a libGL with the driver statically linked. I did find where the build problem is, though. In xc/lib/GL/GL/Imakefile, when I added GlxBuiltInMach64, based on the r128 and Radeon, I was getting "No rule to make target ../../../lib/GL/mesa/dri/?*.o" in xc/lib/GL/GL. It looks like the xc/lib/GL/mesa/dri directory was removed and dri_util.c was added to xc/lib/GL/dri. I don't know if this is the right solution, but I took a guess and was able to get it to build with this change:

Index: Imakefile
===================================================================
RCS file: /cvsroot/dri/xc/xc/lib/GL/GL/Imakefile,v
retrieving revision 1.1.1.2.12.1
diff -u -r1.1.1.2.12.1 Imakefile
--- Imakefile 27 Jun 2002 22:04:03 -0000 1.1.1.2.12.1
+++ Imakefile 5 Jul 2002 23:01:58 -0000
@@ -65,10 +65,10 @@
 MESADOBJS = $(COREMESADOBJS) $(MESA_ASM_DOBJS)
 MESAPOBJS = $(COREMESAPOBJS) $(MESA_ASM_POBJS)
 
- DRIMESAOBJS = $(GLXLIBSRC)/mesa/dri/?*.o
-DRIMESAUOBJS = $(GLXLIBSRC)/mesa/dri/unshared/?*.o
-DRIMESADOBJS = $(GLXLIBSRC)/mesa/dri/debugger/?*.o
-DRIMESAPOBJS = $(GLXLIBSRC)/mesa/dri/profiled/?*.o
+ DRIMESAOBJS = $(GLXLIBSRC)/dri/dri_util.o
+DRIMESAUOBJS = $(GLXLIBSRC)/dri/unshared/dri_util.o
+DRIMESADOBJS = $(GLXLIBSRC)/dri/debugger/dri_util.o
+DRIMESAPOBJS = $(GLXLIBSRC)/dri/profiled/dri_util.o
 
 #if GlxUseBuiltInDRIDriver
 #include "../mesa/src/drv/common/Imakefile.inc"

> >>>Meanwhile, here's a backtrace from the X server built from the branch. It
> >>>looks like the ClipNotify wrapper is being called when pDRIPriv is null,
> >>>though I'm not sure why I wouldn't have run into this before...
> >>>
> >>>Program received signal SIGSEGV, Segmentation fault.
> >>>DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
> >>>1732        if(pDRIPriv->wrap.ClipNotify) {
> >>>(gdb) bt
> >>>#0  DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732
> >>>#1  0x080c9009 in MapWindow (pWin=0x85d3a60, client=0x81d56a8) at window.c:2864
> >>>#2  0x080c5ee8 in InitRootWindow (pWin=0x85d3a60) at window.c:522
> >>>#3  0x080bf39c in main (argc=4, argv=0xbffff9d4, envp=0xbffff9e8) at main.c:439
> >>>#4  0x40072647 in __libc_start_main (main=0x80bee9c <main>, argc=4,
> >>>    ubp_av=0xbffff9d4, init=0x806cc08 <_init>, fini=0x8174c80 <_fini>,
> >>>    rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff9cc)
> >>>    at ../sysdeps/generic/libc-start.c:129
> >>>(gdb) info locals
> >>>pWin = 0x85d3a60
> >>>pScreen = 0x85d3748
> >>>pDRIPriv = 0x0
> >>>pDRIDrawablePriv = 0x0

The backtrace from the static server was the same. BTW, this might help others trying to debug with a dynamic server: I removed 'Load "GLcore"' from my XF86Config, because I saw that it was being reloaded by the glx module anyway. Before I did that, I was getting a backtrace that was wrong -- it said something about mipmaps, so I was suspicious :)

> >>Yes, it looks like the DRI initialization process was started, causing
> >>the DRI wrappers to be put in place; then, something caused DRI
> >>initialization to fail, but the failure handling code does not remove
> >>the wrappers.
> >>
> >>I believe I need to unwrap the DRI routines in DRICloseScreen. I'd like
> >>to fix this case and ask you to test with what you've got, since it's
> >>hard to test these unusual failure cases when everything's working properly.
> >>
> >>It's still curious no other drivers have had this problem. Either
> >>nobody else has gone down these failure cases, or I'm barking up the
> >>wrong tree.
> >>
> >
> > It's pretty easy to test if you just change the return value of the
> > driver's drm init function to return non-zero. For example, I tried this
> > in the r128 driver in r128_do_init_cce (changed the last line to return
> > -1), and it suffers the same problem (the backtrace was the same).
>
> Yes, it's easy to force specific failures; but I don't think developers
> and users have been hitting these cases in normal testing scenarios.
> Otherwise, we'd have caught this during the 3 years this extension's been
> in use.

It's true that the more stable drivers wouldn't hit this very often, but this bug wasn't present before the merge of the trunk into the mach64 branch, so it's only been around for 4 months max. The kernel init would frequently fail when testing DMA, and the server never segfaulted before. That's why I thought it was odd, since the wrapper functions in question were left in place before as well. Maybe they weren't getting called before?

> >>Can you verify that we are indeed calling DRICloseScreen by putting a
> >>breakpoint at that routine and sending me a backtrace at that point?
> >>
> >
> > I know it's called because I see the messages in the X log about removing
> > the signal handler, kernel context, SAREA, etc. It's called as part of
> > the DRI driver specific CloseScreen (ATIDRICloseScreen) when the kernel
> > init fails (which is after DRIFinishScreenInit is called). In fact, the
> > entire X init seems to work without a hitch (I see all the normal messages
> > in the X log after "Direct rendering disabled" up to XINPUT) until the
> > root window is initialized.
>
> Okay, try the attached patch. I think I'll do more than this, but it
> would be great if you could test just this, first.

OK, thanks. I'll let you know how it goes.

--
Leif Delgass
http://www.retinalburn.net |
From: Jens O. <je...@tu...> - 2002-07-05 23:27:23
|
Leif Delgass wrote: > On Fri, 5 Jul 2002, Jens Owen wrote: > > [...] > >>>>The xc/config/cf/host.def in the DRI tree is setup to easily modified to >>>>build a debuggable server. Attached is a copy of a modified host.def >>>>file I used for debugging an i810 problem. You'll probably need to add >>>>the mach64 driver to these options. >>>> >>>> >>>OK, I'll try this. I think you're right that we need to add the >>>GlxBuiltIn.. option for mach64. >>> >> >>If my memory serves me, that's just for 3D clients, and it doesn't work >>anymore...so I wouldn't worry about that option. However, you will want >>to add mach64 to the other driver lists in this file. >> > > You're right, it's for building a libGL with the driver statically linked. > I did find where the build problem is, though. In xc/lib/GL/GL/Imakefile, > when I added GlxBuiltInMach64, based on the r128 and Radeon, I was getting > "No rule to make target ../../../lib/GL/mesa/dri/?*.o" in xc/lib/GL/GL. > It looks like the xc/lib/GL/mesa/dri directory was removed and dri_util.c > was added to xc/lib/GL/dri. 
I don't know if this is the right solution, > but I took a guess and was able to get it to build with this change: > > Index: Imakefile > =================================================================== > RCS file: /cvsroot/dri/xc/xc/lib/GL/GL/Imakefile,v > retrieving revision 1.1.1.2.12.1 > diff -u -r1.1.1.2.12.1 Imakefile > --- Imakefile 27 Jun 2002 22:04:03 -0000 1.1.1.2.12.1 > +++ Imakefile 5 Jul 2002 23:01:58 -0000 > @@ -65,10 +65,10 @@ > MESADOBJS = $(COREMESADOBJS) $(MESA_ASM_DOBJS) > MESAPOBJS = $(COREMESAPOBJS) $(MESA_ASM_POBJS) > > - DRIMESAOBJS = $(GLXLIBSRC)/mesa/dri/?*.o > -DRIMESAUOBJS = $(GLXLIBSRC)/mesa/dri/unshared/?*.o > -DRIMESADOBJS = $(GLXLIBSRC)/mesa/dri/debugger/?*.o > -DRIMESAPOBJS = $(GLXLIBSRC)/mesa/dri/profiled/?*.o > + DRIMESAOBJS = $(GLXLIBSRC)/dri/dri_util.o > +DRIMESAUOBJS = $(GLXLIBSRC)/dri/unshared/dri_util.o > +DRIMESADOBJS = $(GLXLIBSRC)/dri/debugger/dri_util.o > +DRIMESAPOBJS = $(GLXLIBSRC)/dri/profiled/dri_util.o > > #if GlxUseBuiltInDRIDriver > #include "../mesa/src/drv/common/Imakefile.inc" Check with Keith to see if this stuff is worth fixing. If so, great...if not, we ought to remove it as cruft. >>>>>Meanwhile, here's a backtrace from the X server built from the branch. It >>>>>looks like the ClipNotify wrapper is being called when pDRIPriv is null, >>>>>though I'm not sure why I wouldn't have run into this before... >>>>> >>>>>Program received signal SIGSEGV, Segmentation fault. 
>>>>>DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732 >>>>>1732 if(pDRIPriv->wrap.ClipNotify) { >>>>>(gdb) bt >>>>>#0 DRIClipNotify (pWin=0x85d3a60, dx=0, dy=0) at dri.c:1732 >>>>>#1 0x080c9009 in MapWindow (pWin=0x85d3a60, client=0x81d56a8) at window.c:2864 >>>>>#2 0x080c5ee8 in InitRootWindow (pWin=0x85d3a60) at window.c:522 >>>>>#3 0x080bf39c in main (argc=4, argv=0xbffff9d4, envp=0xbffff9e8) at main.c:439 >>>>>#4 0x40072647 in __libc_start_main (main=0x80bee9c <main>, argc=4, >>>>> ubp_av=0xbffff9d4, init=0x806cc08 <_init>, fini=0x8174c80 <_fini>, >>>>> rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff9cc) >>>>> at ../sysdeps/generic/libc-start.c:129 >>>>>(gdb) info locals >>>>>pWin = 0x85d3a60 >>>>>pScreen = 0x85d3748 >>>>>pDRIPriv = 0x0 >>>>>pDRIDrawablePriv = 0x0 >>>>> > > The backtrace from the static server was the same. BTW, this might help > others trying to debug with a dynamic server: I removed 'Load "GLcore"' > from my XF86Config, because I saw that it was being reloaded by the glx > module anyway. Before I did that, I was getting a backtrace that was > wrong -- it said something about mipmaps, so I was suspicious :) Hmmm, I was wondering how you got such nice line numbers from the back trace of a dynamic server. I'm also guessing you have the version of gdb with the XFree86 module support. >>>>Yes, it looks like the DRI initialization process was started, causing >>>>the DRI wrappers to be put in place; then, something caused DRI >>>>initialization to fail, but the failure handling code does not remove >>>>the wrappers. >>>> >>>>I believe I need to unwrap the DRI routines in DRICloseScreen. I'd like >>>>to fix this case and ask you to test with what you've got since it's >>>>hard to test these unusual failure cases when everythings working properly. >>>> >>>>It's still curious no other drivers have had this problem. Either >>>>nobody else has gone done these failure cases, or I'm barking up the >>>>wrong tree. 
>>>> >>>> >>>It's pretty easy to test if you just change the return value of the >>>driver's drm init function to return non-zero. For example, I tried this >>>in the r128 driver in r128_do_init_cce (changed the last line to return >>>-1), and it suffers the same problem (the backtrace was the same). >>> >> >>Yes, it's easy for force specific failures; but I don't think developers >>and users have been hitting these cases in normal testing scenarios. >>Otherwise, we'd have caught this during the 3 years this extensions been >>in use. >> > > It's true that the more stable drivers wouldn't hit this very often, > but this bug wasn't present before the merge of the trunk into the mach64 > branch, so it's only been around for 4 months max. The kernel init would > frequently fail when testing DMA, and the server never segfaulted before. > That's why I thought it was odd, since the wrapper functions in question > were left in place before as well. Maybe they weren't getting called > before? Exactly. > >>>>Can you verify that we are indeed calling DRICloseScreen by putting a >>>>breakpoint at that routine and sending me a backtrace at that point? >>>> >>>> >>>I know it's called because I see the messages in the X log about removing >>>the signal handler, kernel context, SAREA, etc. It's called as part of >>>the DRI driver specific CloseScreen (ATIDRICloseScreen) when the kernel >>>init fails (which is after DRIFinishScreenInit is called). In fact, the >>>entire X init seems to work without a hitch (I see all the normal messages >>>in the X log after "Direct rendering disabled" up to XINPUT) until the >>>root window is initialized. >>> >>Okay, try the attached patch. I think I'll do more than this, but it >>would be great if you could test just this, first. >> > > OK, thanks. I let you know how it goes. -- /\ Jens Owen / \/\ _ je...@tu... / \ \ \ Steamboat Springs, Colorado |
From: Leif D. <lde...@re...> - 2002-07-06 00:10:57
Attachments:
dri_unwrap2.patch
|
On Fri, 5 Jul 2002, Jens Owen wrote:

> Leif Delgass wrote:
>
> > The backtrace from the static server was the same. BTW, this might help
> > others trying to debug with a dynamic server: I removed 'Load "GLcore"'
> > from my XF86Config, because I saw that it was being reloaded by the glx
> > module anyway. Before I did that, I was getting a backtrace that was
> > wrong -- it said something about mipmaps, so I was suspicious :)
>
> Hmmm, I was wondering how you got such nice line numbers from the back
> trace of a dynamic server. I'm also guessing you have the version of
> gdb with the XFree86 module support.

Oh yeah, I keep forgetting that I installed that. ;)

[...]

> >>Okay, try the attached patch. I think I'll do more than this, but it
> >>would be great if you could test just this, first.
> >>
> >
> > OK, thanks. I'll let you know how it goes.

With one change, this fixes the problem. The AdjustFrame wrapper was already dealt with at the beginning of the function and pDRIPriv->wrap.AdjustFrame was set to NULL, so pScrn->AdjustFrame was getting NULL when the wrapper was removed the second time. I just removed the first bit of code and kept yours grouped with the other new "unwrappings." The modified patch is attached. I'll apply this to the mach64 branch, but I'll let you patch the trunk. Thanks for your help!

--
Leif Delgass
http://www.retinalburn.net |
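The double-unwrap problem described in this message can be reproduced in a few lines. The sketch below is an illustration only: DemoScrn and DemoWrapRec are hypothetical stand-ins for the ScrnInfoRec and pDRIPriv->wrap, and the attached patches are not shown in the thread, so whether the first patch's unwrap was guarded is an assumption. It contrasts an unguarded restore, which copies an already-cleared saved pointer into the live hook, with a restore that checks the saved pointer first.

```c
#include <stddef.h>

typedef void (*AdjustFrameFn)(int x, int y);

/* Hypothetical stand-ins for ScrnInfoRec and pDRIPriv->wrap. */
typedef struct { AdjustFrameFn AdjustFrame; } DemoScrn;
typedef struct { AdjustFrameFn AdjustFrame; } DemoWrapRec;

void demo_base_adjust_frame(int x, int y) { (void)x; (void)y; }
void demo_dri_adjust_frame(int x, int y)  { (void)x; (void)y; }

/* Unguarded restore: safe once, but a second call copies the
 * already-cleared saved pointer (NULL) into the live hook. */
void demo_unwrap_unguarded(DemoScrn *scrn, DemoWrapRec *wrap)
{
    scrn->AdjustFrame = wrap->AdjustFrame;
    wrap->AdjustFrame = NULL;
}

/* Guarded restore: only acts while a saved pointer exists, so
 * calling it twice leaves the restored hook alone. */
void demo_unwrap_guarded(DemoScrn *scrn, DemoWrapRec *wrap)
{
    if (wrap->AdjustFrame) {
        scrn->AdjustFrame = wrap->AdjustFrame;
        wrap->AdjustFrame = NULL;
    }
}
```

Removing the earlier unwrap, as done in the modified patch, avoids the second call entirely; the guard is a complementary safeguard rather than a replacement for that cleanup.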
From: Jens O. <je...@tu...> - 2002-07-06 15:33:56
Attachments:
dri_unwrap_2.patch
|
Leif Delgass wrote:
>>> On Fri, 5 Jul 2002, Jens Owen wrote:
>>>>Okay, try the attached patch. I think I'll do more than this, but it
>>>>would be great if you could test just this, first.

Okay, I have attached a more robust patch. Can you try this on your branch?

> I'll apply this to the
> mach64 branch, but I'll let you patch the trunk.

I'll apply this new patch to the trunk if it works okay for you.

-- /\ Jens Owen / \/\ _ je...@tu... / \ \ \ Steamboat Springs, Colorado |
From: Leif D. <lde...@re...> - 2002-07-06 17:10:58
|
Jens,

This works after fixing one thing in this section from DRICloseScreen:

    if (pDRIPriv->wrap.AdjustFrame) {
-       ScrnInfoPtr pScrn = xf86Screens[pScreen->myNum];
-       pScrn->AdjustFrame = pDRIPriv->wrap.AdjustFrame;
-       pDRIPriv->wrap.AdjustFrame = NULL;
+       ScrnInfoPtr pScrn = xf86Screens[pScreen->myNum];
+       pScrn->AdjustFrame = pDRIPriv->wrap.AdjustFrame;
+       pScrn->AdjustFrame = NULL;
        ^^^^^^^^^^^^^^^^^^
    }

That last line should change from:

    pScrn->AdjustFrame = NULL;

to:

    pDRIPriv->wrap.AdjustFrame = NULL;

This is line 452 in the patched dri.c. Other than that it looks good.

-Leif

On Sat, 6 Jul 2002, Jens Owen wrote:

> Leif Delgass wrote:
>
> >>> On Fri, 5 Jul 2002, Jens Owen wrote:
> >>>>Okay, try the attached patch. I think I'll do more than this, but it
> >>>>would be great if you could test just this, first.
>
> Okay, I have attached a more robust patch. Can you try this on your branch?
>
> > I'll apply this to the
> > mach64 branch, but I'll let you patch the trunk.
>
> I'll apply this new patch to the trunk if it works okay for you.

--
Leif Delgass
http://www.retinalburn.net |
From: Jens O. <je...@tu...> - 2002-07-06 19:39:38
|
Leif Delgass wrote:
> Jens,
>
> This works after fixing one thing in this section from DRICloseScreen:
>
>     if (pDRIPriv->wrap.AdjustFrame) {
> -       ScrnInfoPtr pScrn = xf86Screens[pScreen->myNum];
> -       pScrn->AdjustFrame = pDRIPriv->wrap.AdjustFrame;
> -       pDRIPriv->wrap.AdjustFrame = NULL;
> +       ScrnInfoPtr pScrn = xf86Screens[pScreen->myNum];
> +       pScrn->AdjustFrame = pDRIPriv->wrap.AdjustFrame;
> +       pScrn->AdjustFrame = NULL;
>         ^^^^^^^^^^^^^^^^^^
>     }
>
> That last line should change from:
>
>     pScrn->AdjustFrame = NULL;
>
> to:
>
>     pDRIPriv->wrap.AdjustFrame = NULL;
>
> This is line 452 in the patched dri.c. Other than that it looks good.

Leif,

As you can probably tell by now, I'm not able to test this path. I've tried on the Radeon driver, but it, like most drivers, has a lot of failure cases that don't appear to be working properly.

What does work is when the DRI comes up cleanly, or when hitting one of the early failure cases...lack of AGP, etc. However, if I force a failure in RADEONDRIFinishScreenInit, which is after most of the DRI resources have been set up, I can hang the system.

The fact that you are exercising these paths in the mach64 driver is a good thing. I made your change to my last patch and have checked it into the trunk.

Regards,
Jens

-- /\ Jens Owen / \/\ _ je...@tu... / \ \ \ Steamboat Springs, Colorado |