#15 Racoon Freezing

closed
None
3
2009-01-16
2005-04-15
Anonymous
No

On kernel 2.6.8 with ipsec-tools 0.3, and on kernel 2.6.11 with ipsec-tools 0.5,
racoon freezes at random with over 250 VPN connections. Existing
connections keep working until their lifetimes expire.

Killing racoon with kill -9 is the only way to stop it. After a restart,
all is well until it freezes again.

Nothing in the log indicates the reason for the failure.

Any idea how a watchdog would work with racoon?
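
One possible shape for such a watchdog: run a probe command under a timeout, and if the probe hangs, kill and restart the daemon. The Python sketch below is only an illustration and is not part of ipsec-tools. It assumes that a control query such as `racoonctl show-sa isakmp` hangs when racoon is frozen; the names `probe_responds` and `watchdog_step`, and the restart script mentioned in the comment, are hypothetical.

```python
import subprocess


def probe_responds(probe_cmd, timeout_s):
    """Return True if probe_cmd exits within timeout_s seconds, else False."""
    try:
        subprocess.run(
            probe_cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the hung probe before raising.
        return False


def watchdog_step(probe_cmd, restart_cmd, timeout_s=10.0):
    """Probe once; run restart_cmd if the probe hung.

    Returns True if a restart was triggered.
    Assumed example arguments (not verified against any particular setup):
      probe_cmd   = ["racoonctl", "show-sa", "isakmp"]
      restart_cmd = ["/usr/local/sbin/restart-racoon.sh"]  # hypothetical script
    """
    if probe_responds(probe_cmd, timeout_s):
        return False
    subprocess.run(restart_cmd, check=False)
    return True
```

Calling `watchdog_step` from cron or from a loop with a sleep would give a periodic check. The restart script itself would need to do the kill -9 and start racoon again, since the frozen process does not react to a normal stop.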

Discussion

(Page 1 of 2)
  • Mike Robinson
    2005-05-12

    Logged In: YES
    user_id=854356

    So, the racoon server goes dead and takes down all those
    connections with it? That kind of problem would be
    extremely difficult to diagnose under any circumstances.

    The first thing I'd try is to use a system-monitor (like
    'top') to see why the racoon process is frozen... is it
    churning away CPU-time in a 100%-busy loop, or is it
    waiting, and if so, for what?
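
On Linux, the distinction that 'top' shows can also be read directly from /proc. The Python sketch below is a minimal example under that assumption; `parse_stat_state` and `describe_process` are illustrative names. A state of R that persists suggests a busy loop, while S or D means the process is waiting, and /proc/<pid>/wchan (where the kernel exposes it) names the kernel function it is waiting in.

```python
def parse_stat_state(stat_text):
    """Extract the one-letter process state from /proc/<pid>/stat contents.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so the state is the first field after the
    LAST closing parenthesis.
    """
    after_comm = stat_text[stat_text.rindex(")") + 1:]
    return after_comm.split()[0]


def describe_process(pid):
    """Return (state, wchan) for a pid; wchan is '' if it cannot be read."""
    with open(f"/proc/{pid}/stat") as f:
        state = parse_stat_state(f.read())
    try:
        with open(f"/proc/{pid}/wchan") as f:
            wchan = f.read().strip()
    except OSError:
        wchan = ""
    return state, wchan
```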

    The next thing I'd try is to see if you can make any sort
    of correlation between the number-of-clients and the
    freezes. Notice also if any particular client, group of
    clients, part-of-the-building and so forth appear to be
    the ones who did it. Look to see who logged-in most
    recently. Things like that.

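    You can get a core-dump when you kill a process, provided
    you use a signal that dumps core (SIGABRT or SIGQUIT rather
    than kill -9) and core dumps are enabled, to see exactly
    what the process was doing pre-mortem.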

    Anyhow, this kind of problem is definitely going to take
    some forensics before the cause becomes apparent. The
    mere fact that it is freezing-at-random, by itself, is
    just not sufficient to point at a solution.

     
  • Logged In: NO

    Extra Info.

    It is currently running, and racoon stops accepting requests
    (including requests from racoonctl).

    The last log event is 'call pfkey_send_dump'; each time that
    appears, racoon stops accepting requests.

     
  • Aidas Kasparas
    2005-06-15

    Logged In: YES
    user_id=39627

    Can you try to attach to that process with gdb (gdb -p
    process-number) and get a backtrace of what it is doing
    (bt command)?

    Also, could you please provide not just the last log entry but
    more, at least a few seconds' worth.
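
On a production box it may help to grab the backtrace non-interactively before restarting. The Python sketch below wraps the gdb invocation in batch mode; `gdb_backtrace_cmd` and `capture_backtrace` are hypothetical helper names, and it assumes gdb is installed and the caller is allowed to attach to the process (typically root).

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command that attaches, prints a backtrace, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]


def capture_backtrace(pid, timeout_s=60):
    """Run gdb against pid and return whatever it printed on stdout."""
    result = subprocess.run(
        gdb_backtrace_cmd(pid),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.stdout
```

The returned text can be written to a file and attached to the ticket.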

     
  • Logged In: NO

    The system is in production, so I had to reboot (that stops
    the problem for a few weeks).

    A log file is available at
    http://dev.waveworks.co.uk/racoon.log.gz
    which contains 10 seconds' worth (at debug2), 22MB in size
    (gzipped to 822k).

     
  • Aidas Kasparas
    2005-06-15

    • priority: 5 --> 3
    • assigned_to: nobody --> monas
     
  • Aidas Kasparas
    2005-06-15

    Logged In: YES
    user_id=39627

    FIRST, I'm sorry to say that by providing a level 2 debug
    log you also disclosed your confidential PSK. Please treat
    that PSK as compromised. In future, please do not use level 2
    debug without an explicit request, or remove the DEBUG2 lines
    from the log before providing it to anyone.
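
Removing DEBUG2 output from a log before sharing it can be scripted. The Python sketch below is a minimal filter, assuming each entry starts with a prefix of the form "YYYY-MM-DD HH:MM:SS: LEVEL:" and that lines without such a prefix (for example hex dumps) belong to the entry above them. That format is an assumption and should be checked against the actual log; `strip_debug2` is an illustrative name.

```python
import re

# Assumed entry prefix: "YYYY-MM-DD HH:MM:SS: LEVEL: message".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}: (\w+):")


def strip_debug2(lines):
    """Yield log lines, omitting DEBUG2 entries and their continuation lines."""
    dropping = False
    for line in lines:
        match = ENTRY_START.match(line)
        if match:
            # A new entry starts here; decide whether to drop it.
            dropping = match.group(1) == "DEBUG2"
        # Lines without a prefix inherit the decision made for their entry.
        if not dropping:
            yield line
```

The filtered output should still be reviewed by hand, since nothing here guarantees that secrets appear only at DEBUG2.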

    What I have spotted in your log is this:

    When you were restarting racoon very rapidly, one of the first
    things racoon did was request the SPD from the kernel. It
    receives this information as a series of pfkey messages, one
    policy at a time. Meanwhile, it also receives a "delete" payload
    in an informational message over the ISAKMP socket. To handle
    this, it calls pfkey_dump_sadb, which opens **another** pfkey
    socket and, **in blocking** mode, sends the kernel a request to
    dump the SADB over this socket. It then waits for this request
    to be fulfilled while the kernel is still trying to dump the
    remaining SPD policies over the other socket.

    If my hypothesis holds, that racoon waits on the wrong socket
    and therefore freezes, then it could be fixed by rearranging
    the code to use a single select for all pfkey communications
    (not easy to do, but doable). But then the only explanation for
    racoon freezing at times other than startup would be that it
    gets a delete message from the peer while an acquire (or
    similar) message is already queued on the main pfkey socket.
    Does that sound feasible?
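
The "single select" idea can be illustrated outside racoon. The Python sketch below is not racoon code: it uses ordinary socket pairs as stand-ins for the two pfkey sockets, and `service_sockets` and `demo` are hypothetical names. The point it shows is that when every socket is non-blocking and served from one select() loop, data pending on one socket cannot stall the handling of the other.

```python
import select
import socket


def service_sockets(handlers, idle_timeout=0.5):
    """Serve every socket in `handlers` from one select() loop.

    `handlers` maps a socket to a callback taking the received bytes.
    Returns once no socket has been readable for `idle_timeout` seconds
    or every socket has reached end-of-file.
    """
    active = list(handlers)
    for sock in active:
        sock.setblocking(False)
    while active:
        readable, _, _ = select.select(active, [], [], idle_timeout)
        if not readable:
            return
        for sock in readable:
            try:
                data = sock.recv(65536)
            except BlockingIOError:
                continue  # spurious wakeup; retry on the next pass
            if data:
                handlers[sock](data)
            else:
                active.remove(sock)  # peer closed; stop watching it


def demo():
    """Queue data on two socket pairs, then drain both from one loop."""
    spd_rx, spd_tx = socket.socketpair()  # stand-in for the main pfkey socket
    sad_rx, sad_tx = socket.socketpair()  # stand-in for the second socket
    spd_got, sad_got = [], []
    spd_tx.sendall(b"policy-1;policy-2;")
    sad_tx.sendall(b"sadb-dump;")
    service_sockets({spd_rx: spd_got.append, sad_rx: sad_got.append})
    for s in (spd_rx, spd_tx, sad_rx, sad_tx):
        s.close()
    return b"".join(spd_got), b"".join(sad_got)
```

Real PF_KEY sockets are message-oriented and the actual fix would be in racoon's C code; the stream pairs here only demonstrate the dispatch pattern.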

    Maybe you have the last few hundred lines of a racoon log in
    debug mode from when it froze after a much longer period of
    time? That would help to see if my hypothesis is right.

     
  • Aidas Kasparas
    2005-06-15

    Logged In: YES
    user_id=39627

    Oops, an old log won't help -- it will not show what was on
    the pfkey queue at the time of the freeze... :-//

     
  • Logged In: NO

    Not to worry about the PSK, I know it was visible in the DEBUG2.

    Looking at 'top' at the time that racoon stops responding
    shows that racoon is doing nothing, i.e. as if it were waiting
    for a response.

    In the log, the point where it says 'pfkey_send_dump' is where
    it stops. There is nothing more after that, just more of the
    same till the next pfkey_send_dump.

    Since it's in a production environment I have to keep it
    running, so I can't leave it blocking.

     
  • Logged In: NO

    Extra Info

    On restart of racoon there is only one entry in /proc/net/pfkey:

    sk        RefCnt  Rmem    Wmem  User  Inode
    f7c0a800  2       170476  0     0     245763

    Then, when racoon stops responding again, there are two:

    sk        RefCnt  Rmem    Wmem  User  Inode
    f7c0a680  2       0       0     0     245807
    f7c0a800  2       165244  0     0     245763
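
Listings like these can be checked automatically, for example from a monitoring script. The Python sketch below parses the /proc/net/pfkey table as shown above and reports sockets with a non-empty receive queue; `parse_pfkey_table` and `sockets_with_backlog` are illustrative names, and it assumes the column layout matches the listings in this post. In the second listing, the extra socket has Rmem 0 while the original socket holds about 165 KB of unread data, which would be consistent with racoon waiting on the new socket while messages pile up on the old one.

```python
NUMERIC_COLUMNS = ("RefCnt", "Rmem", "Wmem", "User", "Inode")


def parse_pfkey_table(text):
    """Parse /proc/net/pfkey text into a list of dicts, one per socket."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header, body = lines[0], lines[1:]
    rows = []
    for fields in body:
        row = dict(zip(header, fields))
        for name in NUMERIC_COLUMNS:
            row[name] = int(row[name])
        rows.append(row)
    return rows


def sockets_with_backlog(rows):
    """Return the rows whose receive queue (Rmem) is not empty."""
    return [row for row in rows if row["Rmem"] > 0]
```

A watchdog could treat "more than one pfkey socket, with a backlog on one of them, for longer than some interval" as a hint that racoon is stuck, though that threshold is a guess and would need tuning.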

    It would be really great to solve this, as I'll be setting up
    another machine with 350 connections in the next few weeks.. :-(

     
  • Logged In: NO

    Hi,
    I am another user facing a similar problem with IPsec tools
    0.99.0 and Linux kernel 2.6.9.

    In my case, racoon freezes exactly after 235 SAs (actually
    470 SAs, considering each SA is unidirectional). It does
    not respond to any signal (like Ctrl+C). This case is
    reproducible.
    I am trying to establish ESP transport between 2 machines. One
    machine has multiple virtual IPs and the other has only one IP.
    I am trying to have a different IPsec SA for each virtual
    IP.

    I think my problem is the same as the problem explained in
    this thread.
    Please let me know if you need any further info.

    Thanks and Regards,
    Akhilesh
    akhi_guatm@yahoo.com

     