From: JL <ips...@rr...> - 2013-11-07 16:22:33
I have a problem I have been unable to get to the bottom of.
I have a VPN tunnel where my end is racoon on Linux. I suspect the other
end of the VPN tunnel is interpreting all soft timeouts as hard timeouts,
but I don't know what to do about it.
So, for example, if I have a VPN which has, among the usual other settings:
remote ZZZZZ {
    exchange_mode main;
    lifetime time 15 minute;
    proposal { ... }
}
sainfo address XXXXX/XX any address YYYYY/YY any {
    lifetime time 5 minute;
    ...
}
sainfo address YYYYY/YY any address XXXXX/XX any {
    lifetime time 5 minute;
    ...
}
Then what happens is:
* 00:00 - Phase 1 negotiates, Phase 2 negotiates SA1, Ping succeeds
* 00:04 - SA1 hits softlimit; new Phase 2 negotiates new SA2
* 00:05 - SA1 hits hardlimit; is deleted
* 00:08 - SA2 hits softlimit; new Phase 2 negotiates new SA3
* 00:09 - SA2 hits hardlimit; is deleted
* 00:12 - SA3 hits softlimit; new Phase 2 FAILS to negotiate & Pings fail
* 00:13 - Multiple failed Phase 2 attempts
* 00:14 - Multiple failed Phase 2 attempts
* 00:15 - Phase 1 negotiates, Phase 2 negotiates SA4, Ping succeeds
* 00:19 - SA4 hits softlimit; new Phase 2 negotiates new SA5
* 00:20 - SA4 hits hardlimit; is deleted
* 00:23 - SA5 hits softlimit; new Phase 2 negotiates new SA6
* 00:24 - SA5 hits hardlimit; is deleted
* 00:27 - SA6 hits softlimit; new Phase 2 FAILS to negotiate & Pings fail
* 00:28 - Multiple failed Phase 2 attempts
* 00:29 - Multiple failed Phase 2 attempts
* 00:30 - Phase 1 negotiates, Phase 2 negotiates SA7, Ping succeeds
... And so on. The linked PDF shows this in a graphical form. (The timings
are slightly fudged; by the time we get to 00:30 it is really 00:30:20-ish;
but for clarity I am rounding everything back to minute boundaries).
VPN Trace.pdf<https://docs.google.com/file/d/0B1FWjXAdhWdabkZJeElkeEhtZ1E/edit?usp=drive_web>
I know these timings are silly - I am specifically using them to shrink the
amount of time I needed to discover what was going on. This same pattern
holds true even with realistic lifetimes, like 8hr/1hr. It also holds true
if I set the Phase 1 and SA lifetimes to the same value.
I have no control over or ability to debug the far end. I don't even know
what hardware/software it is running.
It appears to me that there are two problems.
1) We cannot negotiate a Phase 2 if the lifetime of the Phase 2 would be
longer than the remaining time on the current Phase 1, and
2) On failing to negotiate a Phase 2, the dying Phase 2 is not accepted
by the other end.
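
To make hypothesis 1 concrete, here is a minimal Python sketch of that rule. It is only a model of my guess, not of how racoon or the far end actually works. The function name, the one-minute soft margin (taken from the 4-minute soft / 5-minute hard pattern in the trace) and the retry-every-minute behaviour are all assumptions made for illustration.

```python
def simulate(phase1_life=15, phase2_life=5, soft_margin=1, until=30):
    """Return (minute, event) tuples under the ASSUMED rule that a Phase 2
    negotiation fails when its lifetime would outlast the current Phase 1."""
    events = [(0, "phase1+phase2 ok")]
    p1_start = 0
    # First renegotiation is attempted when the first SA hits its soft limit.
    next_attempt = phase2_life - soft_margin
    while next_attempt <= until:
        t = next_attempt
        remaining_p1 = (p1_start + phase1_life) - t
        if remaining_p1 <= 0:
            # Phase 1 has expired: a fresh Phase 1 is negotiated, then Phase 2.
            p1_start = t
            events.append((t, "phase1+phase2 ok"))
            next_attempt = t + phase2_life - soft_margin
        elif remaining_p1 >= phase2_life:
            # The new SA would end before the Phase 1 does: accepted.
            events.append((t, "phase2 ok"))
            next_attempt = t + phase2_life - soft_margin
        else:
            # The new SA would outlive the Phase 1: rejected (hypothesis 1).
            events.append((t, "phase2 FAIL"))
            next_attempt = t + 1  # assumed: retry once a minute
    return events


if __name__ == "__main__":
    for minute, event in simulate():
        print(f"00:{minute:02d} - {event}")
```

With the defaults this yields failures at minutes 12, 13, 14, 27, 28 and 29 and a fresh Phase 1 at minutes 15 and 30, which is the same pattern as the trace above. That agreement is consistent with hypothesis 1 but does not prove it.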
Whenever I am in this failed state, I can restart racoon while deleting the
SAD entries. This forces a new Phase 1, followed by a new Phase 2, and I am
fine - until, of course, the system needs to negotiate the Phase 2 which
will overlap the end of the Phase 1 lifetime. This is a fine work-around,
but it is not a solution - I need something that will not result in dropped
packets or manual intervention.
The trouble is, I have no idea what to do about this. Is there a way to
convince racoon to lie during the ISAKMP negotiation, so that racoon will
always expire the Phase 1 while the far end thinks it is still valid, and
early enough that the Phase 2 that would fail will initiate a new Phase 1
instead? Or is there another solution anyone can think of?
Thanks,
--
Jarrod Lowe