
MPD crashes FreeBSD 7.2 and 8.1 after update

Help
Egor
2012-05-23
2013-03-27
  • Egor

    Egor - 2012-05-23

    Hello!
    I have a few servers running FreeBSD 7.2 or 8.1 with MPD 5.5 for PPPoE connections. I then updated MPD to version 5.6 (only the MPD port was updated), applying this patch plus a patch to support the CoA RAD_CLASS attribute:

    --- ../mpd-5.6/src/radsrv.c     2011-12-21 23:58:49.000000000 +0900
    +++ ./src/radsrv.c      2012-04-02 19:02:26.106800017 +0900
    @@ -94,6 +94,7 @@
         Bund       B;
         Link       L;
         char        *tmpval;
    +       u_char  *rad_class = NULL;
         char       *username = NULL, *called = NULL, *calling = NULL, *sesid = NULL;
         char       *msesid = NULL, *link = NULL, *bundle = NULL, *iface = NULL;
         int                nasport = -1, serv_type = 0, ifindex = -1, i;
    @@ -163,6 +164,13 @@
                    Log(LG_RADIUS2, ("radsrv: Got RAD_USER_NAME: %s",
                        username));
                    break;
    +               case RAD_CLASS:
    +               tmpval = Bin2Hex(data, len);
    +               Log(LG_RADIUS2, ("radsrv: Got RAD_CLASS: %s",
    +                       tmpval));
    +               Freee(tmpval);
    +               rad_class = Mdup(MB_AUTH, data, len);
    +               break;
                case RAD_NAS_IP_ADDRESS:
                    nas_ip = rad_cvt_addr(data);
                    Log(LG_RADIUS2, ("radsrv: Got RAD_NAS_IP_ADDRESS: %s ",
    @@ -509,6 +517,8 @@
                    ACLCopy(acl_queue, &L->lcp.auth.params.acl_queue);
                    ACLCopy(acl_table, &L->lcp.auth.params.acl_table);
     #endif /* USE_IPFW */
    +               if (rad_class)
    +                       L->lcp.auth.params.class=rad_class;
     #ifdef USE_NG_BPF
                    for (i = 0; i < ACL_FILTERS; i++) {
                        ACLDestroy(L->lcp.auth.params.acl_filters[i]);
    

    After this update the servers started to panic and reboot periodically, about once a week. The causes vary, but usually it looks like this:

    kgdb /boot/kernel/kernel /var/crash/vmcore.2 
    ...
    Fatal trap 18: integer divide fault while in kernel mode
    cpuid = 0; apic id = 00
    instruction pointer = 0x20:0xc4de1d73
    stack pointer           = 0x28:0xc3f92670
    frame pointer           = 0x28:0xc3f926c0
    code segment        = base 0x0, limit 0xfffff, type 0x1b
                = DPL 0, pres 1, def32 1, gran 1
    processor eflags    = interrupt enabled, resume, IOPL = 0
    current process     = 26 (em1 taskq)
    trap number     = 18
    ...
    (kgdb) list *0xc4de1d73
    0xc4de1d73 is in bpf_filter (/usr/src/sys/modules/netgraph/bpf/../../../net/bpf_filter.c:461).
    456         case BPF_ALU|BPF_MUL|BPF_K:
    457             A *= pc->k;
    458             continue;
    459 
    460         case BPF_ALU|BPF_DIV|BPF_K:
    461             A /= pc->k;
    462             continue;
    463 
    464         case BPF_ALU|BPF_AND|BPF_K:
    465             A &= pc->k;
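
The faulting line, `A /= pc->k`, raises an integer divide fault when the constant operand `k` is zero. One way that can happen is a filter program containing such an instruction reaching the interpreter unchecked; the thread does not establish that this is what happened here. Below is a minimal user-space sketch of a check that rejects such programs. The opcode values match <net/bpf.h>, but `struct insn` and `find_div_by_zero()` are hypothetical names made up for illustration, not kernel or MPD code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Classic BPF opcode bits, same values as in <net/bpf.h>. */
#define BPF_ALU 0x04
#define BPF_RET 0x06
#define BPF_DIV 0x30
#define BPF_K   0x00

/* Hypothetical stand-in for struct bpf_insn so the sketch builds anywhere. */
struct insn {
    uint16_t code;   /* opcode */
    uint8_t  jt;     /* jump offset if true */
    uint8_t  jf;     /* jump offset if false */
    uint32_t k;      /* constant operand */
};

/*
 * Return the index of the first "A /= k" instruction whose constant k is 0,
 * or -1 if the program contains none.
 */
static int
find_div_by_zero(const struct insn *prog, size_t len)
{
    assert(prog != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (prog[i].code == (BPF_ALU | BPF_DIV | BPF_K) && prog[i].k == 0)
            return (int)i;
    }
    return -1;
}
```

A caller would run a check like this on every program before installing it and refuse the program when the result is not -1. It only covers division by a constant; division by the X register cannot be checked ahead of time and has to be guarded at run time.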
    

    Sometimes the errors are different, but bpf_filter always appears in the output of the gdb "where" command. All my kernels are built with these additional options:

    options         IPFIREWALL
    options         IPDIVERT
    options         IPFIREWALL_FORWARD
    options         NETGRAPH
    options         NETGRAPH_IPFW
    options         NETGRAPH_PPPOE
    options         NETGRAPH_IFACE
    options         DEVICE_POLLING
    options         HZ=1000
    

    And I've changed these sysctl variables:

    net.inet.icmp.icmplim=800
    net.inet.flowtable.enable=0
    net.isr.direct=1
    kern.random.sys.harvest.ethernet=0
    kern.random.sys.harvest.point_to_point=0
    kern.random.sys.harvest.interrupt=0
    net.inet.ip.fastforwarding=1
    vm.pmap.shpgperproc=2048
    net.isr.maxthreads=2
    net.isr.bindthreads=1
    

    There are about 200 users on each server, and pppoe-delay is set to 3 or 4 (see this patch).

    What could be the cause of the kernel panics?

     
  • Dmitry S. Lukhtionov

    1. Update your system to 8-STABLE (this updates your em driver to version 7.3.2)
    2. Add the missing lines, like this:
    if (rad_class)
        free(rad_class);
    in RadsrvEvent()
    3. Remove "options FLOWTABLE" from your kernel
    4. Try setting net.inet.ip.fastforwarding to 0
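
Point 2 is about who owns the buffer that the patch allocates for the Class attribute. A minimal sketch of one safe arrangement follows, using plain malloc()/free() and a hypothetical `struct auth_params` with `cls`/`cls_len` fields instead of MPD's real types and allocator wrappers. The handler keeps a temporary copy, gives every matched link its own private copy, and always frees the temporary. Note that if a link stored the temporary pointer itself, freeing it afterwards would leave that link with a dangling pointer, which is why the sketch copies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-link auth parameters. */
struct auth_params {
    unsigned char *cls;      /* owned copy of the Class attribute */
    size_t         cls_len;  /* length of cls in bytes */
};

/*
 * Replace the stored Class value with a private copy of data[0..len).
 * Returns 0 on success, -1 on allocation failure (old value is kept).
 */
static int
set_class(struct auth_params *p, const unsigned char *data, size_t len)
{
    unsigned char *copy = NULL;

    if (len > 0) {
        copy = malloc(len);
        if (copy == NULL)
            return -1;
        memcpy(copy, data, len);
    }
    free(p->cls);            /* release the previous value; free(NULL) is a no-op */
    p->cls = copy;
    p->cls_len = len;
    return 0;
}

/*
 * Apply a received Class attribute to every matched link.
 * Returns the number of links updated, or -1 if the temporary
 * buffer could not be allocated.
 */
static int
apply_class(struct auth_params *links, size_t nlinks,
    const unsigned char *data, size_t len)
{
    unsigned char *rad_class = NULL;
    int applied = 0;

    if (len > 0) {
        rad_class = malloc(len);     /* temporary, in the role of Mdup() in the patch */
        if (rad_class == NULL)
            return -1;
        memcpy(rad_class, data, len);
    }
    for (size_t i = 0; i < nlinks; i++) {
        if (rad_class != NULL && set_class(&links[i], rad_class, len) == 0)
            applied++;
    }
    free(rad_class);                 /* always released, whether or not a link matched */
    return applied;
}
```

With this layout the temporary never outlives the request, and each link frees its own copy when the value is replaced or the link goes away.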

     
  • Dmitry S. Lukhtionov

    See my last commit and try it.

     
  • Egor

    Egor - 2012-05-28

    Dmitry, thank you for the answer!

    1. Update your system to 8-STABLE (this updates your em driver to version 7.3.2)

    I'm updating FreeBSD on the servers now, but it will take about a month, and I want to understand what is causing the crashes: the new mpd, the CoA patch, or the delay patch? Everything was fine before mpd was updated.

    2. Add the missing lines, like this:
    if (rad_class)
        free(rad_class);
    in RadsrvEvent()

    Thanks, I'll fix it. But I've checked the MPD memory size on the servers, and it is not increasing! Also, there is more than 1.5G of free RAM and 2G of free swap right before a panic. It doesn't look like this bug is the cause of the crashes.

    3. Remove "options FLOWTABLE" from your kernel

    This option is already removed.

    4. Try to set net.inet.ip.fastforwarding to 0

    I'll try it. I know that there is a problem with net.isr.direct=0 and mpd. Are there any known problems with fastforwarding?

    See my last commit and try it.

    Which commit are you talking about? Please give me a link!

     
  • Dmitry S. Lukhtionov

    Fetch the latest sources from CVS. They always contain the latest fixes.

     
