#395 *_freelist_get returns NULL!

open
nobody
5
2001-11-20
2001-11-20
Dan Merillat
No

wolf2 test triggers it, and it completely locks either the card or the
driver... killing wolf2/X via ssh, then trying to start the X-server (via ssh)
hangs the kernel on the radeon probe. Commenting out the printk (as was suggested
in other lists) only cuts down on kernel CPU usage, it doesn't help the problem.

I looked into replacing the kernel DRI module with the ones from other branches,
no go. (Not that there's a significant difference between CVS and the stock
2.4.14 kernel)

I'm willing to help debug this, especially if we can find a quicker test-case
than loading wolf2test and starting a game. First guess would be that we're
very ungraceful when we run out of texture or framebuffer memory on-card,
and wolf2 is the first game complex enough to hit that limit instantly.
(quake3 hits it eventually while playing and locks)

I note previous bug reports here indicate they only had a few messages (fewer
than 200)... I'm scrolling thousands:

Nov 19 19:43:13 vulpine kernel: [drm:radeon_freelist_get] *ERROR* returning NULL!
Nov 19 19:43:44 vulpine last message repeated 1115 times
Nov 19 19:44:45 vulpine last message repeated 2264 times
Nov 19 19:45:46 vulpine last message repeated 2157 times
Nov 19 19:46:10 vulpine last message repeated 823 times
Nov 19 19:47:48 vulpine init: Switching to runlevel: 6

... as long as I don't try to touch the radeon after lockup, I can ssh in and
reboot.

Discussion

  • Dan Merillat
    2001-11-20

    Logged In: YES
    user_id=2110

    oh yea, this might help:

    Nov 19 19:40:10 vulpine kernel: radeonfb: ref_clk=2700, ref_div=60, xclk=15000 from BIOS
    Nov 19 19:40:10 vulpine kernel: CRTC_H_TOTAL_DISP = 0x4f0063, H_SYNC = 0x8c02a2
    Nov 19 19:40:10 vulpine kernel: CRTC_V_TOTAL_DISP = 0x1df020c, V_SYNC = 0x8201ea
    Nov 19 19:40:10 vulpine kernel: PPLL_DIV_3 = 0x301bf, PPLL_REF_DIV = 0x3c
    Nov 19 19:40:10 vulpine kernel: DDA_CONFIG = 0x10802fc, DDA_ON_OFF = 0x3f05390
    Nov 19 19:40:10 vulpine kernel: Console: switching to colour frame buffer device 80x30
    Nov 19 19:40:10 vulpine kernel: radeonfb: ATI Radeon QD DDR SGRAM 32 MB
    Nov 19 19:40:10 vulpine kernel: radeonfb: CRT port CRT monitor connected

    Nov 19 19:40:10 vulpine kernel: Linux agpgart interface v0.99 (c) Jeff Hartmann
    Nov 19 19:40:10 vulpine kernel: agpgart: Maximum main memory to use for agp memory: 203M
    Nov 19 19:40:10 vulpine kernel: agpgart: Detected Via Apollo Pro KT133 chipset
    Nov 19 19:40:10 vulpine kernel: agpgart: AGP aperture is 64M @ 0xe8000000
    Nov 19 19:40:10 vulpine kernel: [drm] AGP 0.99 on VIA Apollo KT133 @ 0xe8000000 64MB
    Nov 19 19:40:10 vulpine kernel: [drm] Initialized radeon 1.1.1 20010405 on minor 0

    I tried setting the AGPART to 32, no go. Any other suggested workarounds?

    --Dan

     
  • Logged In: YES
    user_id=29685

    I'm also seeing this with BX chipset, ATI Radeon VE QY,
    XFree86 from CVS (Nov.24th) playing with Q3A.
    I never lock though and only saw the message 10 times.

     
  • Logged In: NO

    I'm getting this exact thing also with wolf2 - radeon 7000
    (X says it's a VE QY) - on a BX board. AGP 1x or 2x and I
    tried the suggestion of turning off BIOS and video caching,
    nothing helps. Q3A will give me 10-20 of these messages
    during a game but no other problems. No other game gives
    any messages at all (oddly the first wolf test gives no
    messages at all - go figure...)

    I looked at it a little more closely, in my case wolf2 loops
    forever on EAGAIN during the time it is getting the drm:
    radeon_freelist_get error. So I put a printk in the places
    that return EAGAIN and the actual call to freelist_get
    that is returning NULL is in radeon_cp_get_buffers in
    radeon_cp.c (the bit of code that goes:)
        for ( i = d->granted_count ; i < d->request_count ; i++ ) {
            buf = radeon_freelist_get( dev );
            if ( !buf ) return -EAGAIN;

    I can't get my mind around why it runs out of buffers and
    can't then ever free any up again though...

    This is with X4.1 and "radeon 1.1.1 20010405" from the
    2.4.15-pre kernels and RH 2.4.7 and 2.4.9 (doesn't look like
    2.4.16 has changed anything here either...)

     
  • Eric Anholt
    2003-01-12

    Logged In: YES
    user_id=7685

    freelist_get returning NULL errors just means a hang has
    occurred (the used buffers aren't getting freed while the
    client keeps asking for more). Are you still having hangs
    with recent CVS?

     
  • Matt DeLuco
    2003-03-28

    Logged In: YES
    user_id=10331

    Anyone interested in fixing this? It's only ~3 years old.
    I'm not trying to be contemptuous here, but this is a pretty
    serious bug to have lasted this long. Especially to have
    gone unassigned or even unanswered.

    Anyhow, while playing Quake 3 (1.32b) my system locks up
    nearly every time, at no specific point during the game. To
    regain control of the system, a reboot is necessary. I have
    encountered several occasions where I was able to terminate
    X using ctrl-alt-backspace, and I'd end up back in the
    shell. Quake 3 would still be running, and I'd have to kill
    the process. However, I can't say if this is related to the
    same problem.

    After rebooting, I checked dmesg, XFree86.0.log, messages,
    and syslog. The first three showed nothing; the fourth,
    syslog, showed the following:

    mobius kernel: [drm:radeon_freelist_get] *ERROR* returning NULL!
    mobius last message repeated 1811 times
    mobius last message repeated 3604 times
    mobius last message repeated 3605 times

    My guess is it didn't stop logging until I rebooted. I
    don't know much about how the DRI drivers work, but I think
    it's safe to say that when radeon_freelist_get() returns
    NULL, something isn't handled properly and an infinite loop
    ensues.

    I found the code for the driver online, and it seems the
    function radeon_freelist_get() only has two other returns,
    nested in some for loops and if statements. They both
    return drm_buf_t *buf.
    (http://www.atomised.org/docs/XFree86-4.2.1/bsd_2drm_2kernel_2radeon_2radeon__cp_8c.html#a27)

    I'll state my system specs, but they're probably irrelevant,
    seeing as how long this problem has been around. Let me
    know if I've forgotten anything.

    Linux Kernel 2.4.20
    XFree86 4.2.1 (4.3 has the problem, see
    http://dri.sourceforge.net/faq/faq_display.phtml?id=49)
    GCC 2.95.3
    glibc 2.2.5
    ATI Radeon 64MB DDR VI/VO
    ABIT KG7-Raid mobo (AMD761 north bridge, VIA VT82C686B
    southbridge)
    AMD 1.4ghz 266fsb Thunderbird
    Lots of Crucial pc2100 ddr ram ;)

    From what I found using Google, this seems to be a general
    problem for all Radeon setups, and not specific to Quake 3,
    or even specific to games at all.

    That's about all I could find out.

     
  • Matt DeLuco
    2003-03-28

    Logged In: YES
    user_id=10331

    Sorry for the double post.. reloaded my browser.
    Sorry for the entire post, in fact. Somehow I didn't notice
    the comments. I assumed the "Add a Comment" would have been
    below all the comments.

    There is in fact work being done on this.