Menu

skbuff_head_cache poison

Help
2010-01-26
2012-09-28
  • Andrew Frezell

    Andrew Frezell - 2010-01-26

    I'm still investigating this issue, but it appears to be a socket buffer
    corruption after a free inside the linx kernel module on the when
    sending/receiving a linx message. The free happens in linx_ioctl_receive. I'll
    post more information as I find it. The one thing I would like to do is try
    and map the corrupted buffer to some structure (to see what is writing the 02
    00 00 00). What structure would this be?

    [ 2694.693409] =============================================================================
    [ 2694.694402] BUG skbuff_head_cache: Poison overwritten 
    [ 2694.694402] -----------------------------------------------------------------------------
    [ 2694.694402]
    [ 2694.694402] INFO: 0xed78983c-0xed78983f. First byte 0x2 instead of 0x6b 
    [ 2694.694402] INFO: Allocated in __alloc_skb+0x29/0x114 age=1085 cpu=1 pid=2961 
    [ 2694.694402] INFO: Freed in __kfree_skb+0x67/0x6a age=947 cpu=0 pid=2012 
    [ 2694.694402] INFO: Slab 0xc1b9fd40 objects=32 used=20 fp=0xed789800 flags=0x400040c3 
    [ 2694.694402] INFO: Object 0xed789800 @offset=6144 fp=0xed789e00 
    [ 2694.694402] 
    [ 2694.694402] Bytes b4 0xed7897f0:  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
    [ 2694.694402]   Object 0xed789800:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789810:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789820:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789830:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 02 00 00 00 kkkkkkkkkkkk....
    [ 2694.694402]   Object 0xed789840:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789850:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789860:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789870:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789880:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed789890:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    [ 2694.694402]   Object 0xed7898a0:  6b 6b 6b 6b 6b 6b 6b a5                         kkkkkkk?        
    [ 2694.694402]  Redzone 0xed7898a8:  bb bb bb bb                                     >>>>>>>>            
    [ 2694.694402]  Padding 0xed7898d0:  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ 
    [ 2694.694402]  Padding 0xed7898e0:  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ 
    [ 2694.694402]  Padding 0xed7898f0:  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
    [ 2694.694402] Pid: 4, comm: ksoftirqd/0 Tainted: P           2.6.30.4 #1
    [ 2694.694402] Call Trace:
    [ 2694.694402]  [<c017c2cb>] print_trailer+0xe1/0xe9 
    [ 2694.694402]  [<c017c354>] check_bytes_and_report+0x81/0xa4 
    [ 2694.694402]  [<c017c44b>] check_object+0xa4/0x18b 
    [ 2694.694402]  [<c017cc5c>] alloc_debug_processing+0x81/0x144 
    [ 2694.694402]  [<c017db75>] __slab_alloc+0x174/0x1c8 
    [ 2694.694402]  [<c0470d08>] ? __alloc_skb+0x29/0x114 
    [ 2694.694402]  [<c017dda0>] kmem_cache_alloc+0x5b/0x90 
    [ 2694.694402]  [<c0470d08>] ? __alloc_skb+0x29/0x114 
    [ 2694.694402]  [<c0470d08>] ? __alloc_skb+0x29/0x114 
    [ 2694.694402]  [<c0470d08>] __alloc_skb+0x29/0x114 
    [ 2694.694402]  [<c04714c4>] skb_copy+0x2e/0x83 
    [ 2694.694402]  [<c047d6ef>] neigh_timer_handler+0x23f/0x2a0 
    [ 2694.694402]  [<c0130525>] run_timer_softirq+0x13b/0x19b 
    [ 2694.694402]  [<c047d4b0>] ? neigh_timer_handler+0x0/0x2a0 
    [ 2694.694402]  [<c047d4b0>] ? neigh_timer_handler+0x0/0x2a0 
    [ 2694.694402]  [<c012cfa2>] __do_softirq+0x8f/0x12d 
    [ 2694.694402]  [<c012d066>] do_softirq+0x26/0x2b 
    [ 2694.694402]  [<c012d0b8>] ksoftirqd+0x4d/0xb7 
    [ 2694.694402]  [<c012d06b>] ? ksoftirqd+0x0/0xb7 
    [ 2694.694402]  [<c0138d32>] kthread+0x45/0x6b 
    [ 2694.694402]  [<c0138ced>] ? kthread+0x0/0x6b 
    [ 2694.694402]  [<c01033ef>] kernel_thread_helper+0x7/0x10 
    [ 2694.694402] FIX skbuff_head_cache: Restoring 0xed78983c-0xed78983f=0x6b 
    [ 2694.694402] 
    [ 2694.694402] FIX skbuff_head_cache: Marking all objects used
    
     
  • Andrew Frezell

    Andrew Frezell - 2010-01-26

    From what I can tell, the corrupted field lies in the sb_buff->cb (control
    buffer) which maps to the linx_skb_cb structure. I'm not sure if my dummied up
    sk_buff is the right size, but if it is, then the corrupted field seems to be
    linx_skb_cb->type being set to LINX_TYPE_REMOTE after the free.

     
  • Andrew Frezell

    Andrew Frezell - 2010-01-29

    I added some asserts to check if linx_skb_cb->type value matches 0x6b6b6b6b
    (fill value after object freed). I then re-ran the test and got the following
    assert:

    [ 4594.143927] LINX_ASSERT@linx_do_sendmsg(1290)
    [ 4594.144836] ------------[ cut here ]------------
    [ 4594.145789] kernel BUG at /localdata/afrezell/oneos/base/build/sim5430/linux/chassis/i486-sim/sim5430/modules/linx/net/linx/af_linx.c:1290!
    [ 4594.145789] invalid opcode: 0000 [#1] SMP
    [ 4594.145789] LTT NESTING LEVEL : 0
    [ 4594.145789] last sysfs file: /sys/class/i2c-dev/i2c-11/dev
    [ 4594.145789] Modules linked in: driver_common w1com(P) bctrl(P) iofpga(P) linx linx_eth_cm chassdrv(P) smdrv ctm_stsxdrv(P) dunedrv(P) gnepd(P) tgsdrv(P) cmfandrv(P) [last unloaded: driver_common]
    [ 4594.145789]
    [ 4594.145789] Pid: 0, comm: swapper Tainted: P           (2.6.30.4 #1)
    [ 4594.145789] EIP: 0060:[<f8b92221>] EFLAGS: 00010292 CPU: 1
    [ 4594.145789] EIP is at linx_do_sendmsg+0x2d5/0x37a [linx]
    [ 4594.145789] EAX: 00000034 EBX: e5f41220 ECX: ffffffff EDX: 016e5000
    [ 4594.145789] ESI: 000000cc EDI: 00000000 EBP: f707fdf8 ESP: f707fdd4
    [ 4594.145789]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
    [ 4594.145789] Process swapper (pid: 0, ti=f707e000 task=f7042900 task.ti=f707e000)
    [ 4594.145789] Stack:
    [ 4594.145789]  f8ba0650 f8b9fafc 0000050a 000000cc 00000002 e5f41200 f640d580 f6408980
    [ 4594.145789]  00010009 f707fe2c f8b8f84d f6408980 00010009 f640d580 00010028 00000002
    [ 4594.145789]  000000cc e5f41200 00000001 00010028 00000002 00010009 f707fe7c f8b98218
    [ 4594.145789] Call Trace:
    [ 4594.145789]  [<f8b8f84d>] ? ipc_send_signal+0x170/0x229 [linx]
    [ 4594.145789]  [<f8b98218>] ? lnhcb_deliver+0xe8c/0xeaa [linx]
    [ 4594.145789]  [<c01224f9>] ? default_wake_function+0xb/0xd
    [ 4594.145789]  [<f8be5570>] ? rx_tasklet_recv+0x121/0x12c [linx_eth_cm]
    [ 4594.145789]  [<f8be572e>] ? rx_tasklet+0x8f/0xc6 [linx_eth_cm]
    [ 4594.145789]  [<c012c844>] ? tasklet_action+0x8c/0xee
    [ 4594.145789]  [<c012cfa2>] ? __do_softirq+0x8f/0x12d
    [ 4594.145789]  [<c012d066>] ? do_softirq+0x26/0x2b
    [ 4594.145789]  [<c012d14b>] ? irq_exit+0x29/0x57
    [ 4594.145789]  [<c0103e69>] ? do_IRQ+0x6d/0x83
    [ 4594.145789]  [<c0103009>] ? common_interrupt+0x29/0x30
    [ 4594.145789]  [<c0107c89>] ? default_idle+0x5b/0x94
    [ 4594.145789]  [<c0101c31>] ? cpu_idle+0x44/0x5f
    [ 4594.145789]  [<c05129af>] ? start_secondary+0xc7/0xc9
    [ 4594.145789] Code: 01 81 2c 02 00 00 89 d7 11 91 30 02 00 00 81 7b 1c 6b 6b 6b 6b 75 1b 68 0a 05 00 00 68 fc fa b9 f8 68 50 06 ba f8 e8 ec 28 98 c7 <0f> 0b 83 c4 0c eb fe 8b 55 10 8b 82 c4 01 00 00 89 43 1c 83 ba
    [ 4594.145789] EIP: [<f8b92221>] linx_do_sendmsg+0x2d5/0x37a [linx] SS:ESP 0068:f707fdd4
    [ 4594.190828] ---[ end trace a9e7fc199d9f7469 ]---
    

    The line of code that trigged this is in af_linx.c:

    /* This is so linx_recvmsg() knows if the sender was a local or
    
     * remote socket when counting local/remote received bytes. */
    LINX_ASSERT(cb->type != 0x6b6b6b6b);
    cb->type = linx_sk(from)->type;
    

    I'm not sure how the message is being freed, but I'm guessing it has something
    to do with the message being added to the skb_queue_tail and then we write
    into the message...which doesn't make too much sense.

    Thoughts? Suggestions?

    Thanks in advance,
    Drew

     

Log in to post a comment.

MongoDB Logo MongoDB