#364 Another MGA crash

closed-fixed
nobody
7
2001-07-25
2001-07-24
Han-Wen Nienhuys
No

The X server crashes when an OpenGL app initializes (or
on a virtual screen switch). This happens very often
with my own 3D program, and sporadically with
xdemos/glxgears. Running glxgears first and then my own
program seems to reduce the frequency of the crashes.
It is not 100% reproducible.

System:

- DELL precision 220
- i820 chipset
- Matrox G450 (/proc/pci says: G400 rev 130), 32 MB
- 1 GHz PIII
- Red Hat 7.1 (XFree86 4.0.3-5)
- Kernel 2.4.7

In the logs I find:

Jul 24 13:45:08 meddo kernel: [drm:mga_dma_quiescent] *ERROR* irqs: 2 wanted 0
Jul 24 13:45:08 meddo kernel: [drm:mga_dma_quiescent] *ERROR* lockup
Jul 24 13:45:11 meddo kernel: [drm:mga_dma_quiescent] *ERROR* irqs: 2 wanted 0
Jul 24 13:45:11 meddo kernel: [drm:mga_dma_quiescent] *ERROR* lockup
Jul 24 13:45:14 meddo kernel: [drm:mga_dma_quiescent] *ERROR* irqs: 2 wanted 0
Jul 24 13:45:14 meddo kernel: [drm:mga_dma_quiescent] *ERROR* lockup
**********

Jul 24 13:54:28 meddo kernel: [drm:mga_fire_primary] *ERROR* num_dwords == 0 when dispatched
Jul 24 13:54:28 meddo last message repeated 5 times
************

Jul 24 15:01:08 meddo kernel: [drm:mga_fire_primary] *ERROR* num_dwords == 0 when dispatched
Jul 24 15:01:08 meddo last message repeated 9 times

***********

Jul 24 15:18:07 meddo kernel: [drm:mga_fire_primary] *ERROR* irqs: 17838 wanted 0
Jul 24 15:18:07 meddo kernel: [drm:mga_fire_primary] *ERROR* lockup (wait)

***********

I believe these logs come from both the Matrox driver
(downloaded 23/7/01) and the driver shipped with
XFree86 4.0.3. The entries also say "Initialized mga
2.0.1 20000928 on minor 63".
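
To see at a glance which DRM functions report which errors, the log excerpts above can be tallied with a short script. This is only a sketch: it assumes each message sits on a single line, as it does in /var/log/messages, and the name `summarize_drm_errors` is made up for illustration. Lines such as "last message repeated 5 times" are not expanded, so the counts are lower bounds.

```python
import re
from collections import Counter

# Matches the message part of a kernel log line as quoted in this report, e.g.
#   "... kernel: [drm:mga_dma_quiescent] *ERROR* lockup"
DRM_ERROR = re.compile(r"\[drm:(?P<func>\w+)\]\s+\*ERROR\*\s+(?P<msg>.*)")


def summarize_drm_errors(lines):
    """Count DRM *ERROR* messages per (function, message) pair."""
    counts = Counter()
    for line in lines:
        match = DRM_ERROR.search(line)
        if match:
            counts[(match.group("func"), match.group("msg").strip())] += 1
    return counts


# A few lines copied from the report, one message per line.
sample = [
    "Jul 24 13:45:08 meddo kernel: [drm:mga_dma_quiescent] *ERROR* irqs: 2 wanted 0",
    "Jul 24 13:45:08 meddo kernel: [drm:mga_dma_quiescent] *ERROR* lockup",
    "Jul 24 13:54:28 meddo kernel: [drm:mga_fire_primary] *ERROR* num_dwords == 0 when dispatched",
    "Jul 24 13:54:28 meddo last message repeated 5 times",
    "Jul 24 15:18:07 meddo kernel: [drm:mga_fire_primary] *ERROR* lockup (wait)",
]

for (func, msg), n in sorted(summarize_drm_errors(sample).items()):
    print(f"{n:3d}  {func}: {msg}")
```

On a real system the input would be the open log file rather than the `sample` list.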

Discussion

    • labels: 101124 --> MGA X Server
     
  • Logged In: YES
    user_id=161998

    More information: the above log snippets are in
    /var/log/messages, after pressing the SAK key. They seem
    to be caused by the X server being killed.

    When I run the program under strace, it seems that both
    the app and the X server hang (but do not dump core).
    The last lines in the strace file are all ioctls, about
    6600 of them. They all look like

    ioctl(4, 0x40046441, 0xbffff144) = 0
    ioctl(4, 0x4008642a, 0xbfff13f0) = 0
    ioctl(4, 0x4008642a, 0xbfff1500) = 0
    ioctl(4, 0x4008642a, 0xbfffef40) = 0
    ioctl(4, 0x4008642b, 0xbfff1418) = 0
    ioctl(4, 0x4008642b, 0xbfff1528) = 0
    ioctl(4, 0x4008642b, 0xbfff7458) = 0
    ioctl(4, 0x40086445, 0xbfff1410) = 0
    ioctl(4, 0x40086445, 0xbfff1420) = 0
    ioctl(4, 0x40086445, 0xbfff1520) = 0
    ioctl(4, 0x40086445, 0xbfff1530) = 0
    ioctl(4, 0x40086445, 0xbfff7450) = 0
    ioctl(4, 0x40086445, 0xbfff7560) = 0
    ioctl(4, 0x40086445, 0xbffff170) = 0
    ioctl(4, 0x400c6444, 0xbfffee90) = 0
    ioctl(4, 0x400c6444, 0xbfffefa0) = 0
    ioctl(4, 0xc0286429, 0xbfffedf0) = 0
    ioctl(4, 0xc0286429, 0xbfffef00) = 0

    The last lines are

    ioctl(4, 0x40086445, 0xbfff1520) = 0
    ioctl(4, 0x40086445, 0xbfff1530) = 0
    ioctl(4, 0x40086445, 0xbfff7560) = 0
    ioctl(4, 0x40086445, 0xbffff170) = 0
    ioctl(4, 0x40046441 <unfinished ...>

    In between there are also

    ioctl(3, FIONREAD, [0]) = 0
    gettimeofday({995989936, 330949}, NULL) = 0

    (but those may be caused by my own application)

    Any more thoughts on how to solve/investigate this problem?
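
One way to make the raw request numbers in the strace output readable is to split them into the fields the Linux ioctl encoding packs into them. The sketch below assumes the x86 layout from asm/ioctl.h (8-bit command number, 8-bit type, 14-bit argument size, 2-bit direction); `decode_ioctl` is a hypothetical helper, not part of strace or DRM.

```python
# Field layout of a Linux ioctl request number on x86 (asm/ioctl.h):
# bits 0-7 command number, 8-15 type, 16-29 argument size, 30-31 direction.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split a 32-bit ioctl request number into its encoded fields."""
    return {
        "dir": DIRECTIONS[(request >> 30) & 0x3],
        "size": (request >> 16) & 0x3FFF,
        "type": chr((request >> 8) & 0xFF),
        "nr": request & 0xFF,
    }


# The distinct request numbers that appear in the strace excerpt.
for req in (0x40046441, 0x4008642A, 0x4008642B,
            0x40086445, 0x400C6444, 0xC0286429):
    d = decode_ioctl(req)
    print(f"{req:#010x}: dir={d['dir']:<10} size={d['size']:<3} "
          f"type={d['type']!r} nr={d['nr']:#04x}")
```

All six decode to type 'd', the letter the DRM headers use for their ioctls. As far as I can tell from drm.h, 0x29, 0x2a and 0x2b are the generic DMA, lock and unlock requests, and numbers from 0x40 upward are driver-specific; that mapping is an assumption and should be checked against the drm.h of the installed kernel.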

     
    • priority: 5 --> 7
     
  • Logged In: YES
    user_id=161998

    Upgraded to XFree86 4.1.0 and the latest DRM. (As an
    aside: how do you find out the version number of DRM?
    A /proc/dri/ entry would be nice.)
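
Until something like that exists, one workaround is to pull the version out of the kernel log. This is a sketch only: it assumes the module still prints a banner in the same form as the "Initialized mga 2.0.1 20000928 on minor 63" line quoted in the original report, and `drm_versions` is a name invented here.

```python
import re

# Banner format assumed from the line quoted in this report:
#   "Initialized mga 2.0.1 20000928 on minor 63"
BANNER = re.compile(
    r"Initialized (?P<driver>\S+) (?P<version>\d+\.\d+\.\d+) "
    r"(?P<date>\d{8}) on minor (?P<minor>\d+)"
)


def drm_versions(lines):
    """Return (driver, version, date) for every init banner found."""
    found = []
    for line in lines:
        match = BANNER.search(line)
        if match:
            found.append(
                (match.group("driver"), match.group("version"), match.group("date"))
            )
    return found


print(drm_versions(["Initialized mga 2.0.1 20000928 on minor 63"]))
# prints [('mga', '2.0.1', '20000928')]
```

Feeding it the lines of /var/log/messages would list one entry per module load, so the last entry reflects the DRM currently in use.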

     
    • status: open --> closed-fixed