#4 e1000 tx hang debug code (all releases)

closed
None
e1000
9
2012-07-17
2006-03-29
No

This patch implements a ring dump to dmesg that is
printed just before the reset triggered by NETDEV_WATCHDOG. It
can help identify the exact state of the ring
when the failure occurred. Be careful: the dump is
sizeable and grows with the number of descriptors you have.
It's likely that you will have to increase your kernel
ring buffer size to hold the whole thing. Another
option is to decrease the descriptor count.
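
To get a rough feel for whether the dump will fit, the arithmetic can be sketched as below. This is only an estimate under stated assumptions: the figure of about 100 bytes per descriptor row is a guess based on the row width of the dump quoted in the discussion, the header allowance is arbitrary, and only a single TX ring is counted. The kernel's `log_buf_len=` boot parameter takes a power-of-two size, so the sketch rounds up accordingly. The function names are hypothetical.

```python
def dump_size_bytes(descriptors, bytes_per_row=100, header_bytes=200):
    """Estimate the dmesg space one ring dump needs.

    bytes_per_row and header_bytes are assumptions, not values taken
    from the patch; adjust them after looking at a real dump.
    """
    return header_bytes + descriptors * bytes_per_row


def min_log_buf_len(needed_bytes):
    """Smallest power of two that is at least needed_bytes."""
    size = 1
    while size < needed_bytes:
        size *= 2
    return size


if __name__ == "__main__":
    for count in (256, 1024, 4096):
        need = dump_size_bytes(count)
        print(f"{count} descriptors: ~{need} bytes, "
              f"log_buf_len >= {min_log_buf_len(need)}")
```

If the estimate exceeds the current ring buffer, either boot with a larger `log_buf_len=` or shrink the ring (for example with `ethtool -G <iface> tx <count>`) before reproducing the hang.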

Discussion

  • Jesse Brandeburg

    Logged In: YES
    user_id=631160

    In the future, please attach large messages.

    doublekm, this is interesting input; in particular, your
    first post shows some kind of data corruption.

    TX Desc ring0 dump
    T[desc] [address 63:0 ] [vl pt Sdcdt ln] [bi->dma ] leng ntw timestmp bi->skb
    T[0x000] 000000000000001F 0000000000000000 000000001F422DE2 002A 0 000000000009A12C df5d1780 NTC
    T[0x001] 000000001F422C62 000000008B00002A 000000001F422C62 002A 1 000000000009A52C dfc828e0
    T[0x002] 000000001F422B62 000000008B00002A 000000001F422B62 002A 2 000000000009A92C df67a400

    Looking at 0x000, the address located there is completely
    invalid.

    This is the obvious cause of the TX hang, and it probably
    caused a master abort when the adapter tried to access the
    data at location 0x1F.

    We expected to see 000000001F422DE2 there.

    What type of system do you have, and who manufactured it?
    Which e1000 hardware? Output of lspci -vvv would help, and
    output of dmidecode might also help.
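
The check described above, comparing each descriptor's address field against the bi->dma value the driver recorded, can be automated for long dumps. The following is a minimal sketch, assuming the column order shown in the dump quoted above (index, descriptor address, second descriptor word, bi->dma); the regular expression and the function name are illustrative and not part of the patch. A mismatch is only a hint that something is worth a closer look, not proof of corruption.

```python
import re

# One TX ring entry as printed in the dump: index, the two 64-bit
# descriptor words, then the driver's recorded DMA address (bi->dma).
ROW = re.compile(
    r"T\[0x(?P<idx>[0-9A-Fa-f]+)\]\s+"
    r"(?P<addr>[0-9A-Fa-f]{16})\s+"
    r"(?P<word2>[0-9A-Fa-f]{16})\s+"
    r"(?P<dma>[0-9A-Fa-f]{16})\s+"
)


def find_mismatches(dump_text):
    """Return (index, descriptor_address, bi_dma) for rows where they differ."""
    # Rows may arrive wrapped across lines, so collapse whitespace first.
    flat = " ".join(dump_text.split())
    bad = []
    for m in ROW.finditer(flat):
        addr = int(m.group("addr"), 16)
        dma = int(m.group("dma"), 16)
        if addr != dma:
            bad.append((int(m.group("idx"), 16), addr, dma))
    return bad


SAMPLE = """
T[0x000] 000000000000001F 0000000000000000
000000001F422DE2 002A 0 000000000009A12C df5d1780 NTC
T[0x001] 000000001F422C62 000000008B00002A
000000001F422C62 002A 1 000000000009A52C dfc828e0
T[0x002] 000000001F422B62 000000008B00002A
000000001F422B62 002A 2 000000000009A92C df67a400
"""

if __name__ == "__main__":
    for idx, addr, dma in find_mismatches(SAMPLE):
        print(f"desc 0x{idx:03x}: address {addr:#x} != bi->dma {dma:#x}")
```

Run on the three rows quoted here, it flags only descriptor 0x000, whose address field reads 0x1F instead of 0x1F422DE2.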

     
  • k2m

    k2m - 2006-05-12

    Logged In: YES
    user_id=1515584

    I found where the problem comes from. The problem was the PCI
    bridge, not the 82546EB. The PCI bridge, an MV64260, was set to
    use "aggressive prefetch mode" to enhance performance.
    Without that feature the tx hang did not occur, though
    performance is degraded by 20~30%.

    Anyway, thank you for your patch. It was very helpful for
    debugging.

     
  • Linas Vepstas

    Linas Vepstas - 2006-09-15

    Logged In: YES
    user_id=11545

    How about including this patch in the mainline kernel,
    but keeping it disabled (e.g. with ifdef CONFIG_E1000_DUMP)?

    --linas

     
  • Auke Kok

    Auke Kok - 2006-09-18

    Logged In: YES
    user_id=126698

    The patch already lives in -mm. We were hoping that
    would be enough.

     
  • Jesse Brandeburg

    Logged In: YES
    user_id=631160
    Originator: YES

    File Added: e1000_7320_dump_ring.patch

     
  • Jesse Brandeburg

    Logged In: YES
    user_id=631160
    Originator: YES

    File Added: e1000_7435_dump.patch

     
  • Jesse Brandeburg

    Logged In: YES
    user_id=631160
    Originator: YES

    File Added: e1000_765_dump.patch

     
  • david graham

    david graham - 2008-01-09

    Logged In: YES
    user_id=573181
    Originator: NO

    File Added: e1000-7612_dump.patch

     
  • david graham

    david graham - 2008-01-09

    updated dump ring for 7.6.12

     
  • Jesse Brandeburg

    Logged In: YES
    user_id=631160
    Originator: YES

    File Added: e1000_7.6.15_debug_dump.patch

     
  • david graham

    david graham - 2009-06-15

    File Added : e1000-8-0-13.debug_dump.patch.

     
  • Jesse Brandeburg

    • status: open --> closed
     
