#153 mga lockup because of drmGetLock with flag=0

Jeff Hartmann
MGA OpenGL (56)
Iftikhar Rathore

Latest CVS and XFree86-4.0.1
X locks up when using DRI (an OpenGL application). Most of the time X can be restarted by logging in remotely, but getting the original VT back requires a reboot.

The lockup can be replicated by repeatedly (not concurrently) executing very simple code. I have put the code at:
(detailed stack trace for X is at the end of this report, trace for the application at lockup is at ftp://news.icns.com/pub/lesson2.app.out )

After the lockup I can always see that the last recognizable thing X did was:

#0 0x400e5f24 in __ioctl ()
#1 0xbffff5cc in ?? ()
#2 0x864758c in drmGetLock (fd=6, context=1, flags=0) at xf86drm.c:714

Notice that the flags argument (type drmLockFlags, defined in xf86drm.h) is 0. And instead of going to:
#1 xf86ioctl (fd=6, request=??, argp=??) at libc_wrapper.c:43
it ends up at

#1 0xbffff5cc in ?? ()

The stack trace of the Xserver

#0 0x400e5f24 in __ioctl ()
#1 0xbffff5cc in ?? ()
#2 0x864758c in drmGetLock (fd=6, context=1, flags=0) at xf86drm.c:714
#3 0x85e155e in DRILock (pScreen=0x8709db0, flags=0) at dri.c:1589
#4 0x854b4ec in MGAWakeupHandler (screenNum=0, wakeupData=0x0, result=4294967295, pReadmask=0x81b94c0) at mga_wrap.c:75
#5 0x85e0c5d in DRIWakeupHandler (wakeupData=0x0, result=-1, pReadmask=0x81b94c0) at dri.c:1081
#6 0x80ab17c in WakeupHandler (result=-1, pReadmask=0x81b94c0) at dixutils.c:459
#7 0x80c380f in WaitForSomething (pClientsReady=0xbffff8e0) at WaitFor.c:354
#8 0x80a51c4 in Dispatch () at dispatch.c:382
#9 0x80b49e8 in main (argc=4, argv=0xbffffd64) at main.c:429
#10 0x400531eb in __libc_start_main (main=0x80b4530 <main>, argc=4,
argv=0xbffffd64, init=0x806a99c <_init>, fini=0x815ff6c <_fini>,
rtld_fini=0x4000a610 <_dl_fini>, stack_end=0xbffffd5c)
at ../sysdeps/generic/libc-start.c:90


  • Daryll Strauss

    • assigned_to: nobody --> jhartmann
  • Hi
    Please tell me how I can help. I am trying to debug it a little more.
    The backtrace from the lesson2 program mentioned before (which can reproduce the bug) is given below. Killing the process now releases the X server lockup.

    #0 0x40275b54 in __ioctl ()
    #1 0x404978f8 in __DTOR_END__ ()
    #2 0x40484d43 in mga_get_buffer_ioctl (mmesa=0x8072038) at mgaioctl.c:112
    #3 0x40485a7d in mgaAllocVertexDwords (mmesa=0x8072038, dwords=24)
    at mgaioctl.c:532
    #4 0x4047dea0 in triangle (ctx=0x804cc48, e0=3, e1=4, e2=5, pv=3)
    at mgatritmp.h:7
    #5 0x40452a66 in render_vb_poly_raw (VB=0x805ec98, start=3, count=6, parity=0)
    at render_tmp.h:216
    #6 0x4045427c in gl_render_vb (VB=0x805ec98) at vbrender.c:696
    #7 0x403ec8ef in gl_run_pipeline (VB=0x805ec98) at pipeline.c:493
    #8 0x40455fea in gl_execute_cassette (ctx=0x804cc48, IM=0x8062f58)
    at vbxform.c:957
    #9 0x403940b0 in gl_cva_compile_cassette (ctx=0x804cc48, IM=0x8062f58)
    at cva.c:764
    #10 0x404544b3 in gl_maybe_transform_vb (IM=0x8062f58) at vbxform.c:74
    #11 0x404544fe in gl_flush_vb (ctx=0x804cc48, where=0x4048e7b0 "glTranslate")
    at vbxform.c:92
    #12 0x403ea902 in _mesa_Translatef (x=3, y=0, z=0) at matrix.c:1377
    #13 0x8048dcb in DrawGLScene ()
    #14 0x40110f80 in glutMainLoop ()
    #15 0x401e19cb in __libc_start_main (main=0x8048e6c <main>, argc=1,
    argv=0xbffffd14, init=0x8048934 <_init>, fini=0x8048f4c <_fini>,
    rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbffffd0c)
    at ../sysdeps/generic/libc-start.c:92

    Please tell me how I can help by debugging more

  • I used two client programs for this testing. One was the lesson2 program from my bug report, which draws a window containing a rectangle and a triangle (with or without glutFullScreen()); it is at
    I'll mention the second client program a bit later

    1) I started running lesson2 inside gdb from a telnet session and kept executing it until the lockup occurred.

    2) I hit ctrl-c and then continued it again. The lockup remains: the program stays in the same ioctl and X stays locked up.

    3) I exited gdb, killing the client, which released the X lockup. Now all 2D stuff works perfectly, but all OpenGL clients result in the same lockup (I will refer to this as the corrupted state).

    4) In the corrupted state I tried a second client program. This program is the first simple glut example (example 1.2) from the OpenGL Programming Guide (the Red Book), which puts a rectangle in a small window. For reference I have put the source for that example at:

    5) I was running ex1_2 through gdb in the corrupted state when I noticed from the stack trace that what it was trying to render seemed to be what was in the pipeline when the first lockup occurred, that is, the data from the lesson2 client.

    6) I kept executing ex1_2 repeatedly and finally saw the window with a triangle and a rectangle exactly the way lesson2 was supposed to render it (instead of just one small window with one rectangle from ex1_2), and then my machine crashed completely.

    7) Restarting X while in the corrupted state gets everything back to normal (uncorrupted state).

  • Apparently the lockup is not the bug. The lockup is just a result of corruption elsewhere (in the pipeline?).

  • Jan Sechser

    I'm only a user who frequently plays quake3.
    I found a way to recover the system from the lockup (from a remote machine):
    killall -TRAP quake3.x86
    killall -ILL quake3.x86
    This kills it with few side effects. konsole keeps working;
    only the X server doesn't reset the screen resolution and the
    mouse isn't working.
    Besides, in the quake3fortress mod it crashes rather frequently (approx. every 20 minutes), while in
    quake3UrbanTerror it practically never does.
    I hope this very vague information is of some help.

  • Hi
    James Matthews' patch submitted to the DRI-devel list fixes the crash for me. It might not be the real fix and may only be changing the timing (waiting for the texture download to complete), but it fixed the problem for me. I have been trying to crash it and I couldn't.
    The patch is simple and is as follows:

    --- mgaioctl.c~ Wed Aug 23 10:54:52 2000
    +++ mgaioctl.c Wed Aug 23 10:54:37 2000
    @@ -565,6 +565,8 @@
    fprintf(stderr, "mgaFireILoad idx %d ofs 0x%x length %d\n",
    mmesa->iload_buffer->idx, (int)offset, (int)length );

    + mgaUpdateLock( mmesa, DRM_LOCK_QUIESCENT | DRM_LOCK_FLUSH);
    mga_iload_dma_ioctl( mmesa, offset, length );

  • Correction! I meant that the patch makes the engine wait for completion of what it is doing before the texture is uploaded.
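
For reference, a minimal sketch of how the lock flag bits discussed in this report combine. It contrasts the flags=0 seen in the X server backtrace with the DRM_LOCK_QUIESCENT | DRM_LOCK_FLUSH value the patch passes to mgaUpdateLock. The enum below is a local copy whose values are assumed to match drmLockFlags in xf86drm.h, and describe_lock_flags is a hypothetical helper, not part of libdrm or the MGA driver.

```c
#include <stdio.h>
#include <string.h>

/* Local copy of the lock flag bits. ASSUMPTION: these values match the
 * drmLockFlags enum in xf86drm.h; check your own header before relying on them. */
enum lock_flag_bits {
    DRM_LOCK_READY      = 0x01, /* wait until hardware is ready for DMA */
    DRM_LOCK_QUIESCENT  = 0x02, /* wait until hardware is quiescent */
    DRM_LOCK_FLUSH      = 0x04, /* flush this context's DMA queue first */
    DRM_LOCK_FLUSH_ALL  = 0x08, /* flush all DMA queues first */
    DRM_HALT_ALL_QUEUES = 0x10, /* halt all current and future queues */
    DRM_HALT_CUR_QUEUES = 0x20  /* halt all current queues */
};

/* Hypothetical helper: write the names of the set bits into buf,
 * separated by " | ", or "(none)" when no bit is set. Returns buf. */
const char *describe_lock_flags(unsigned flags, char *buf, size_t len)
{
    static const struct { unsigned bit; const char *name; } table[] = {
        { DRM_LOCK_READY,      "DRM_LOCK_READY" },
        { DRM_LOCK_QUIESCENT,  "DRM_LOCK_QUIESCENT" },
        { DRM_LOCK_FLUSH,      "DRM_LOCK_FLUSH" },
        { DRM_LOCK_FLUSH_ALL,  "DRM_LOCK_FLUSH_ALL" },
        { DRM_HALT_ALL_QUEUES, "DRM_HALT_ALL_QUEUES" },
        { DRM_HALT_CUR_QUEUES, "DRM_HALT_CUR_QUEUES" }
    };
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (flags & table[i].bit) {
            if (buf[0] != '\0')
                strncat(buf, " | ", len - strlen(buf) - 1);
            strncat(buf, table[i].name, len - strlen(buf) - 1);
        }
    }
    if (buf[0] == '\0')
        snprintf(buf, len, "(none)");
    return buf;
}
```

Called with 0, as in the DRILock frame of the backtrace, the helper reports that no wait or flush behaviour is requested. Called with DRM_LOCK_QUIESCENT | DRM_LOCK_FLUSH, it names the two bits the patch adds before mga_iload_dma_ioctl.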

  • I'm getting this lockup too. Rather inconvenient if you think about it. Has anyone got any authoritative answers about what is causing this? I heard it was some kind of hardware issue. In any case, I think that bugs like this should have been fixed long ago, considering Precision Insight is actually being funded to develop DRI. What is the world coming to when a free project like GLX is stable and something like DRI is not?

  • Gareth Hughes

    • status: open --> closed-out-of-date