From: Ico <ra...@ze...> - 2011-10-21 11:51:40
|
Hello, I'm having some trouble using IPsec in anonymous mode with DPD and peers with dynamic addresses: when a peer reconnects from a different address, the IPsec connection is not successful. The situation is as follows:

- Racoon runs in anonymous mode, DPD enabled. See configuration below: [1,2].

- A peer connects, phase 1 and phase 2 are established successfully, and the IPsec connection is functional.

- The remote peer disconnects from the internet without first bringing down the tunnel (for example, a DSL connection retrains).

- The remote peer reconnects to the internet, but now has a different IP address (a common case with DSL in Germany).

- The peer connects; phase 1 and phase 2 are established successfully.

- DPD kicks in, decides the old connection is stale, and goes to delete the old policies.

- After this the new connection is no longer functional. setkey -D -P no longer shows the proper entries.

My suspicion is that DPD is cleaning up too much: instead of only cleaning up the old entries, it also removes the new ones. I have followed the trace for deleting DPD state from racoon through libipsec into the kernel, and from what I understand of the code so far, it seems that any policy matching the source and destination addresses is deleted. Since the new and old policies are identical (it is basically the same peer connecting with the same local addresses), the policy is deleted while it should be preserved for the new connection.

Can anybody confirm any of the above? Is this a known issue, and do my conclusions make any sense? Most important: is there a solution to this problem?

Thank you,

Ico

-[ 1. racoon.conf ]---------------------------------------------------------------

path pre_shared_key "/var/psk.txt";
path certificate "/var/cert";

log info;

listen {
        adminsock "/var/racoon.sock" "root" "root" 0660;
}

remote anonymous {
        nat_traversal on;
        exchange_mode main;
        lifetime time 50000 sec;
        proposal_check obey;
        dpd_retry 5;
        dpd_maxfail 4;
        dpd_delay 20;
        generate_policy on;
        passive on;
        verify_cert on;
        my_identifier asn1dn;
        peers_identifier asn1dn;
        certificate_type x509 "XXXXXXXXXXX" "XXXXXXXXXXX";
        proposal {
                encryption_algorithm aes256;
                hash_algorithm sha1;
                authentication_method rsasig;
                dh_group modp1024;
        }
}

sainfo anonymous {
        pfs_group modp1024;
        lifetime time 50000 sec;
        encryption_algorithm aes256;
        authentication_algorithm hmac_sha1;
        compression_algorithm deflate;
}

timer {
        natt_keepalive 20sec;
}

-[ 2. system info ]---------------------------------------------------------------

# cat /proc/version
Linux version 2.6.8.1 (xxx@xxxx) (gcc version 3.4.2) #1 Tue Aug 30 16:53:23 CEST 2011

# racoon -V
@(#)ipsec-tools 0.8.0 (http://ipsec-tools.sourceforge.net)

Compiled with:
- OpenSSL 0.9.7f 22 Mar 2005 (http://www.openssl.org/)
- Dead Peer Detection
- NAT Traversal
- Admin port
- Monotonic clock

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Wolfgang S. <wol...@di...> - 2011-10-22 10:15:08
|
Hello Ico,

I tried your scenario in a simulated environment with multiple virtual machines. I see slightly different behaviour:

I disabled the remote site router's ethernet port (which simulates your internet disconnection). After the DPD of the remote peer had removed the security policy, I enabled the router's eth again (with the same IP address, as a first try). The peer still thought that the connection was established. With tcpdump I could also see the tunnelled pings appearing at the remote peer, but they are dropped, because there is no policy anymore.

I solved this by also enabling DPD at the peer's site (not only the remote peer's site). That initiated a new phase 1 negotiation, and the tunnel was re-established successfully.

Don't know if this info helps?

Regards
Wolfgang

-----Original Message-----
From: Ico [mailto:ra...@ze...]
Sent: Friday, 21 October 2011 13:51
To: ipsec-tools-devel
Subject: [!! SPAM] [Ipsec-tools-devel] racoon, DPD and peers with dynamic addresses

[...]
|
From: Ico D. <ra...@ze...> - 2011-10-23 18:03:41
|
Hi Wolfgang,

* On Sat Oct 22 12:13:31 +0200 2011, Wolfgang Schmieder wrote:

> I did try your scenario in a simulated environment with multiple virtual
> machines. I see a little different behaviour:
>
> I disabled the remote site router's ethernet port (which simulates your
> internet disconnection). After the DPD of the remote peer had removed the
> security policy, I enabled the router's eth again (with the same IP address
> as a first try). The peer was still thinking, that the connection was
> established. With tcpdump I could also see the tunnelled ping's appearing at
> the remote peer, but they are being dropped, because there is no policy
> anymore.
>
> I solved this by enabling also the dpd at the peer's site (not only the
> Remote peer's site). That initiated a new phase 1 negotiation, and the
> tunnel was re-established successfully.

From the above I'd say your scenario is different from ours, which is why it shows somewhat different behaviour. We already have DPD enabled on both sides, but that does not prevent our problem from happening.

In our setup, the problem seems to be caused by dead peer detection at the local end: the remote peer disappears, reconnects, and establishes a new connection *before* DPD triggers. The local peer now sees two connections at the same time. After a short while, the older of the two times out because of DPD, so its policies are removed. At this point we see that no policies are left at all, not even those of the recently connected new tunnel. Our guess is that the policy removal triggered by DPD cleans up too much, also removing the new policies.

The problem is that the remote peer disconnects from the internet, reconnects within a handful of seconds, and establishes a new tunnel right away. This all happens *before* the DPD on the local peer detects that the tunnel has gone dead. As a workaround we decreased the DPD timeout to a very short interval, so that stale connections are cleaned up within ten seconds or so. In that case the policies are removed *before* the new tunnel is established, so the problem effectively goes away.

But the situation still makes me wonder whether the DPD cleanup functions as designed: is the cleanup for tunnel A supposed to remove policies for tunnel B? I think not. From what I traced from racoon into the kernel, there is no way for the kernel to distinguish the two policies, because they are identified only by the local and remote addresses. Because both the old and new tunnels have the same addresses assigned, the kernel will clean up all matching policies. I see no easy fix for this.

Thanks for your input,

Ico

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Wolfgang S. <wol...@di...> - 2011-10-24 17:39:03
|
Hello Ico,

I suppose that there will be no second policy established when the second tunnel is built up. I believe the existing one is updated with the new global IP address of the tunnel endpoint. When DPD deletes the first tunnel, I guess the phase 1 or phase 2 handler deletes the updated policy.

A fix could be that the phase 1 or phase 2 handler first checks whether the policy is needed by any other existing tunnel, and skips the deletion if so.

These are all assumptions. I am not 100% sure I understand your configuration setup. If you send me the config files of both peers, I could try to debug the scenario and see whether my assumptions are correct and whether the fix mentioned above could be implemented.

Regards
Wolfgang

-----Original Message-----
From: Ico Doornekamp [mailto:ra...@ze...]
Sent: Sunday, 23 October 2011 20:04
To: ipsec-tools-devel
Subject: Re: [Ipsec-tools-devel] [!! SPAM] racoon, DPD and peers with dynamic addresses

[...]
|
From: Rainer W. <rwe...@mo...> - 2011-10-24 14:41:48
|
Ico <ra...@ze...> writes:

[...]

> - Racoon runs in anonymous mode, dpd enabled. See configuration below:
> [1,2].
>
> - A peer connects, phase1 and phase2 established successfully, ipsec
> connection is functional.
>
> - The remote peer disconnects from the internet without first bringing
> down the tunnel. (for example, DSL connection retrains)
>
> - The remote peer reconnects to the internet, but now has a different IP
> address. (common case with DSL in germany).
>
> - The peer connects, successful phase1 and phase2 established.
>
> - DPD kicks in, and decides the old connection is stale, and goes to
> delete the old policies.
>
> - After this the new connection is no longer functional. setkey -D -P no
> longer shows the proper entries.
>
> My sucpicion is that the DPD is cleaning up too much, and instead of
> only cleaning up the old entries, also removes the new ones.

[...]

> Can anybody confirm any of the above ? Is this a known issue and do my
> conclusions make any sens ? The most important, is there a solution to
> this problem ?

I had to deal with this problem in the past, specifically with iPhone clients switching from WiFi to 3G and vice versa. What I did was make the server do a purge_remote of the old iph1 whenever it got a request to establish a new one for the same user and device, delay establishment of the new SA until the stale one had been removed, and restart the procedure afterwards.

This was, of course, a code change, and not one which could be published easily, because it depends on other changes that are part of the racoon variant my employer uses.
|
From: Ico D. <ra...@ze...> - 2011-10-24 17:09:07
|
* On Mon Oct 24 16:41:32 +0200 2011, Rainer Weikusat wrote:

> Ico <ra...@ze...> writes:
>
> [...]
>
> I had to deal with this problem in the past, specifically, with iPhone
> clients switching from WiFi to 3G and vice versa.

Yes, that sounds like the typical situation in which one would expect this to happen.

> What I did was make the server to a purge_remote of the old iph1
> whenever I got a request to establish a new one for the same user and
> device, delay establishment of the new SA until the stale one had been
> removed and restart the procedure afterwards.

That's pretty crude, but it sounds like a feasible solution. I guess you tried other options before settling on this approach, so it might be our safest path as well.

> This was, of course, a code change and none which could be published
> easily because it depends on other changes being part of the racoon
> variant my employer uses.

No problem; we have our own build with local patches anyway, so we will probably be able to figure out how to implement this.

Thanks for your feedback,

Ico

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Ico D. <ra...@ze...> - 2011-10-24 19:08:38
|
Hi Wolfgang,

* On Mon Oct 24 19:37:23 +0200 2011, Wolfgang Schmieder wrote:

> I suppose that there will no second policy established when the second
> tunnel is built up. I believe that the existing one will be updated with the
> new global ip address of the tunnel endpoint. When the DPD deletes the first
> tunnel, I guess that the phase 1 or phase 2 handler deletes the updated
> policy.

Yes, this could be the case. I must admit I did not check the state in between the setup of the new tunnel and the DPD cleanup, so you could be right that there is only one policy configured at that time. Sounds plausible.

> A fix could be, that the phase 1 or phase 2 handler first look if the
> policy is needed by any other existing tunnels, and skips the deletion
> then.

That might indeed be the right thing to do.

> These are all assumptions. I am not 100% sure whether I understand
> your configuration setup. If you send me the config files of both
> peers, I could try to debug the scenario and see if my assumptions are
> correct and if my above mentioned fix could be implemented.

Below are the setups of the two machines. The first is what I called the 'local peer' in my earlier description; this machine plays the 'server' role in our setup. The second configuration is from the 'remote peer': a pretty straightforward configuration, no surprises here.

To replay the scenario:

- Establish a tunnel from 'remote' to 'local'.

- Disconnect 'remote' from the network, stop racoon, change its address, and have it reconnect by restarting racoon. This should be done within a few seconds, faster than the DPD timeout.

- Watch the logs on the 'local' machine. You will see the new tunnel being set up from 'remote' *before* the old tunnel is cleaned up by DPD. A few seconds later the DPD timeout expires, and the policies are cleaned up.

Thank you,

Ico

Configuration of anonymous peer:
=============================================

path pre_shared_key "/var/psk.txt";
path certificate "/var/cert";

log info;

listen {
        adminsock "/var/racoon.sock" "root" "root" 0660;
}

remote anonymous {
        nat_traversal off;
        exchange_mode main;
        lifetime time 3600 sec;
        proposal_check obey;
        dpd_retry 5;
        dpd_maxfail 4;
        dpd_delay 300;
        generate_policy on;
        passive on;
        verify_cert on;
        my_identifier asn1dn;
        peers_identifier asn1dn;
        certificate_type x509 "XXXX.cert" "XXXX.priv";
        proposal {
                encryption_algorithm 3des;
                hash_algorithm md5;
                authentication_method rsasig;
                dh_group modp1024;
        }
}

sainfo anonymous {
        lifetime time 3600 sec;
        encryption_algorithm 3des;
        authentication_algorithm hmac_md5;
        compression_algorithm deflate;
}

Configuration of remote peer:
=============================================

path pre_shared_key "/var/psk.txt";
path certificate "/var/cert";

log info;

listen {
        adminsock "/var/racoon.sock" "root" "root" 0660;
}

remote XX.XX.XX.XX {
        nat_traversal off;
        exchange_mode main;
        lifetime time 3600 sec;
        proposal_check obey;
        dpd_retry 5;
        dpd_maxfail 4;
        dpd_delay 20;
        verify_cert on;
        my_identifier asn1dn;
        peers_identifier asn1dn;
        certificate_type x509 "XXXX.cert" "XXXX.priv";
        proposal {
                encryption_algorithm 3des;
                hash_algorithm md5;
                authentication_method rsasig;
                dh_group modp1024;
        }
}

sainfo address 192.168.1.9/32 any address 192.168.1.2/32 any {
        lifetime time 3600 sec;
        encryption_algorithm 3des;
        authentication_algorithm hmac_md5;
        compression_algorithm deflate;
}

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Ico D. <ra...@ze...> - 2011-10-25 05:52:03
|
Hello Wolfgang,

* On Mon Oct 24 19:37:23 +0200 2011, Wolfgang Schmieder wrote:

> I suppose that there will no second policy established when the second
> tunnel is built up. I believe that the existing one will be updated with the
> new global ip address of the tunnel endpoint.

Yes, we re-tested this, and this is indeed the case. See the logging below.

IPsec link established from 11.11.11.11:
----------------------------------------------------------

Oct 24 21:53:59 racoon: INFO: respond new phase 1 negotiation: 80.101.121.60[500]<=>11.11.11.11[500]
Oct 24 21:53:59 racoon: INFO: begin Identity Protection mode.
Oct 24 21:53:59 racoon: INFO: received Vendor ID: DPD
Oct 24 21:54:00 racoon: INFO: ISAKMP-SA established 80.101.121.60[500]-11.11.11.11[500] spi:61708486451431ae:3c3c686d5c58d351
Oct 24 21:54:01 racoon: INFO: respond new phase 2 negotiation: 80.101.121.60[500]<=>11.11.11.11[500]
Oct 24 21:54:01 racoon: INFO: no policy found, try to generate the policy : 192.168.1.9/32[0] 192.168.1.2/32[0] proto=any dir=in
Oct 24 21:54:01 racoon: INFO: IPsec-SA established: ESP/Tunnel 11.11.11.11[500]->80.101.121.60[500] spi=63046120(0x3c201e8)
Oct 24 21:54:01 racoon: INFO: IPsec-SA established: ESP/Tunnel 80.101.121.60[500]->11.11.11.11[500] spi=203299113(0xc1e1929)
Oct 24 21:54:03 openvpn: 0.0.0.0:1194 [29149]: Peer Connection Initiated with 192.168.1.9:1194
Oct 24 21:54:04 openvpn: 0.0.0.0:1194 [29149]: Initialization Sequence Completed

Policies with 11.11.11.11 as peer address:
-------------------------------------------------------------

192.168.1.9[any] 192.168.1.2[any] 255 in prio def ipsec
        esp/tunnel/11.11.11.11-80.101.121.60/require
etc. etc.

Remote racoon killed with -9 (peer does not notice) and remote link restarted with a new IP address.

IPsec link established again with 22.22.22.22 as peer address:
--------------------------------------------------------------------------------------

Oct 24 21:56:12 racoon: INFO: respond new phase 1 negotiation: 80.101.121.60[500]<=>22.22.22.22[500]
Oct 24 21:56:12 racoon: INFO: begin Identity Protection mode.
Oct 24 21:56:12 racoon: INFO: received Vendor ID: DPD
Oct 24 21:56:13 racoon: INFO: ISAKMP-SA established 80.101.121.60[500]-22.22.22.22[500] spi:10e93032b81b82d2:d7e72bb68b5ba7e1
Oct 24 21:56:13 racoon: [22.22.22.22] INFO: received INITIAL-CONTACT
Oct 24 21:56:14 racoon: INFO: respond new phase 2 negotiation: 80.101.121.60[500]<=>22.22.22.22[500]
Oct 24 21:56:14 racoon: INFO: Update the generated policy : 192.168.1.9/32[0] 192.168.1.2/32[0] proto=any dir=in
Oct 24 21:56:14 racoon: INFO: IPsec-SA established: ESP/Tunnel 22.22.22.22[500]->80.101.121.60[500] spi=232612799(0xddd63bf)
Oct 24 21:56:14 racoon: INFO: IPsec-SA established: ESP/Tunnel 80.101.121.60[500]->22.22.22.22[500] spi=66343000(0x3f45058)

Now there are no more policies with 11.11.11.11 as peer address; only policies with 22.22.22.22 as peer address:
--------------------------------------------------------------------------

192.168.1.9[any] 192.168.1.2[any] 255 in prio def ipsec
        esp/tunnel/22.22.22.22-80.101.121.60/require
        created: Oct 24 21:56:14 2011  lastused: Oct 24 21:56:46 2011
        lifetime: 3600(s) validtime: 0(s)
        spid=480 seq=18 pid=29933
        refcnt=4
192.168.1.2[any] 192.168.1.9[any] 255 out prio def ipsec
        esp/tunnel/80.101.121.60-22.22.22.22/require
        created: Oct 24 21:56:14 2011  lastused: Oct 24 21:56:39 2011
        lifetime: 3600(s) validtime: 0(s)
        spid=497 seq=17 pid=29933
        refcnt=4
192.168.1.9[any] 192.168.1.2[any] 255 fwd prio def ipsec
        esp/tunnel/22.22.22.22-80.101.121.60/require
        created: Oct 24 21:56:14 2011  lastused:
        lifetime: 3600(s) validtime: 0(s)
        spid=490 seq=16 pid=29933
        refcnt=2

Now the old connection with 11.11.11.11 is removed as a result of the DPD time-out:
-----------------------------------------------------------------------------------------------------------

Oct 24 21:59:20 racoon: [11.11.11.11] INFO: DPD: remote (ISAKMP-SA spi=61708486451431ae:3c3c686d5c58d351) seems to be dead.
Oct 24 21:59:20 racoon: INFO: purging ISAKMP-SA spi=61708486451431ae:3c3c686d5c58d351.
Oct 24 21:59:20 racoon: INFO: deleting a generated policy.
Oct 24 21:59:20 racoon: INFO: purged IPsec-SA spi=203299113.
Oct 24 21:59:20 racoon: INFO: purged IPsec-SA spi=63046120.
Oct 24 21:59:20 racoon: INFO: purged ISAKMP-SA spi=61708486451431ae:3c3c686d5c58d351.
Oct 24 21:59:20 racoon: INFO: ISAKMP-SA deleted 80.101.121.60[500]-11.11.11.11[500] spi:61708486451431ae:3c3c686d5c58d351

Now there are no more policies at all (but phase 1 and phase 2 with 22.22.22.22 are still up):
----------------------------------------------------------------------------------------------------------

(per-socket policy)
        in none
        created: Oct 24 21:53:49 2011  lastused:
        lifetime: 0(s) validtime: 0(s)
        spid=467 seq=15 pid=30603
        refcnt=1
(per-socket policy)
        in none
        created: Oct 24 21:53:49 2011  lastused:
        lifetime: 0(s) validtime: 0(s)
        spid=451 seq=14 pid=30603
        refcnt=1
(per-socket policy)
        in none
        created: Oct 24 21:53:49 2011  lastused:
        lifetime: 0(s) validtime: 0(s)
        spid=435 seq=13 pid=30603
        refcnt=1

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Rainer W. <rwe...@mo...> - 2011-10-25 13:56:48
|
Ico Doornekamp <ra...@ze...> writes:

[...]

> --------------------------------------------------------------------------------------
> Oct 24 21:56:12 racoon: INFO: respond new phase 1 negotiation:
> 80.101.121.60[500]<=>22.22.22.22[500]
> Oct 24 21:56:12 racoon: INFO: begin Identity Protection mode.
> Oct 24 21:56:12 racoon: INFO: received Vendor ID: DPD
> Oct 24 21:56:13 racoon: INFO: ISAKMP-SA established
> 80.101.121.60[500]-22.22.22.22[500] spi:10e93032b81b82d2:d7e72bb68b5ba7e1
> Oct 24 21:56:13 racoon: [22.22.22.22] INFO: received INITIAL-CONTACT
> Oct 24 21:56:14 racoon: INFO: respond new phase 2 negotiation:
> 80.101.121.60[500]<=>22.22.22.22[500]
> Oct 24 21:56:14 racoon: INFO: Update the generated policy :
> 192.168.1.9/32[0] 192.168.1.2/32[0] proto=any dir=in
> Oct 24 21:56:14 racoon: INFO: IPsec-SA established: ESP/Tunnel
> 22.22.22.22[500]->80.101.121.60[500] spi=232612799(0xddd63bf)
> Oct 24 21:56:14 racoon: INFO: IPsec-SA established: ESP/Tunnel
> 80.101.121.60[500]->22.22.22.22[500] spi=66343000(0x3f45058)

This is actually fine.

[...]

> -----------------------------------------------------------------------------------------------------------
> Oct 24 21:59:20 racoon: [11.11.11.11] INFO: DPD: remote
> (ISAKMP-SA spi=61708486451431ae:3c3c686d5c58d351) seems to be dead.
> Oct 24 21:59:20 racoon: INFO: purging ISAKMP-SA
> spi=61708486451431ae:3c3c686d5c58d351.
> Oct 24 21:59:20 racoon: INFO: deleting a generated policy.

But it shouldn't delete the policy here, cf

	/* Check if the generated SPD has the same timestamp as the SA.
	 * If timestamps are different, this means that the SPD entry has been
	 * refreshed by another SA, and should NOT be deleted with the current SA.
	 */
	if( created ){
		struct secpolicy *p;

		p = getsp(&spidx);
		if(p != NULL){
			/* just do no test if p is NULL, because this probably just means
			 * that the policy has already be deleted for some reason.
			 */
			if(p->spidx.created != created)
				goto purge;
		}
	}

[delete_spd / isakmp.c]

And this leads to something I consider to be the actual bug here. When looking at the src/racoon/pfkey.c file,

	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
	if(lt != NULL)
		created = lt->sadb_lifetime_addtime;
	else
		created = 0;

#ifdef HAVE_PFKEY_POLICY_PRIORITY
	KEY_SETSECSPIDX(xpl->sadb_x_policy_dir,
			saddr + 1,
			daddr + 1,
			saddr->sadb_address_prefixlen,
			daddr->sadb_address_prefixlen,
			saddr->sadb_address_proto,
			xpl->sadb_x_policy_priority,
			created,
			&spidx);
#else
	KEY_SETSECSPIDX(xpl->sadb_x_policy_dir,
			saddr + 1,
			daddr + 1,
			saddr->sadb_address_prefixlen,
			daddr->sadb_address_prefixlen,
			saddr->sadb_address_proto,
			created,
			&spidx);
#endif

[pk_recvspdupdate]

all functions dealing with 'created' timestamps use the addtime from the so-called 'hard lifetime extension'. According to RFC 2367, this is wrong, cf

	sadb_lifetime_addtime
		For CURRENT, the time, in seconds, when the
		association was created. For HARD and SOFT, the
		number of seconds after the creation of the
		association until it expires.

and the 'current' lifetime extension should be used instead.

You could try the following patch against 0.8.0 to see if this helps:

-------------------------------------------

diff -prNu ipsec-tools-0.8.0/src/racoon/pfkey.c ipsec-tools-0.8.0.patched/src/racoon/pfkey.c
--- ipsec-tools-0.8.0/src/racoon/pfkey.c	2011-10-25 14:53:16.124628501 +0100
+++ ipsec-tools-0.8.0.patched/src/racoon/pfkey.c	2011-10-25 14:52:26.000000000 +0100
@@ -2303,7 +2303,7 @@ pk_recvspdupdate(mhp)
 	saddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_SRC];
 	daddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_DST];
 	xpl = (struct sadb_x_policy *)mhp[SADB_X_EXT_POLICY];
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 	if(lt != NULL)
 		created = lt->sadb_lifetime_addtime;
 	else
@@ -2441,7 +2441,7 @@ pk_recvspdadd(mhp)
 	saddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_SRC];
 	daddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_DST];
 	xpl = (struct sadb_x_policy *)mhp[SADB_X_EXT_POLICY];
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 	if(lt != NULL)
 		created = lt->sadb_lifetime_addtime;
 	else
@@ -2573,7 +2573,7 @@ pk_recvspddelete(mhp)
 	saddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_SRC];
 	daddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_DST];
 	xpl = (struct sadb_x_policy *)mhp[SADB_X_EXT_POLICY];
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 	if(lt != NULL)
 		created = lt->sadb_lifetime_addtime;
 	else
@@ -2649,7 +2649,7 @@ pk_recvspdexpire(mhp)
 	saddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_SRC];
 	daddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_DST];
 	xpl = (struct sadb_x_policy *)mhp[SADB_X_EXT_POLICY];
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 	if(lt != NULL)
 		created = lt->sadb_lifetime_addtime;
 	else
@@ -2740,7 +2740,7 @@ pk_recvspddump(mhp)
 	saddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_SRC];
 	daddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_DST];
 	xpl = (struct sadb_x_policy *)mhp[SADB_X_EXT_POLICY];
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 	if(lt != NULL)
 		created = lt->sadb_lifetime_addtime;
 	else
@@ -3386,7 +3386,7 @@ pk_recvmigrate(mhp)
 	saddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_SRC];
 	daddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_DST];
 	xpl = (struct sadb_x_policy *)mhp[SADB_X_EXT_POLICY];
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 	if (lt != NULL)
 		created = lt->sadb_lifetime_addtime;
 	else
@@ -3731,12 +3731,7 @@ addnewsp(mhp, local, remote)
 	saddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_SRC];
 	daddr = (struct sadb_address *)mhp[SADB_EXT_ADDRESS_DST];
 	xpl = (struct sadb_x_policy *)mhp[SADB_X_EXT_POLICY];
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
-	if(lt != NULL)
-		created = lt->sadb_lifetime_addtime;
-	else
-		created = 0;
-	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+	lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 	if(lt != NULL)
 		created = lt->sadb_lifetime_addtime;
 	else
|
From: Ico D. <ra...@ze...> - 2011-10-25 15:03:38
|
Hi Wolfgang,

> [delete_spd/ isakmp.c]
>
> And this leads to something I consider to be the actual bug here. When
> looking at the src/racoon/pfkey.c file,
> [...]
> all functions dealing with 'create' timestamps use the addtime from
> the so-called 'hard lifetime extension'. According to RFC2367, this is
> wrong, cf
>
>     sadb_lifetime_addtime
>         For CURRENT, the time, in seconds, when the
>         association was created. For HARD and SOFT, the
>         number of seconds after the creation of the
>         association until it expires.
>
> and the 'current' lifetime extension should be used instead. You could
> try the following patch against 0.8.0 to see if this helps:

Thanks so much for your help. Patch applied, we'll go test shortly!

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Ico D. <ra...@ze...> - 2011-10-25 16:41:30
|
Hi Rainer,

(I mixed you up with Wolfgang in my last reply, my apologies)

> all functions dealing with 'create' timestamps use the addtime from
> the so-called 'hard lifetime extension'. According to RFC2367, this is
> wrong, cf
>
>     sadb_lifetime_addtime
>         For CURRENT, the time, in seconds, when the
>         association was created. For HARD and SOFT, the
>         number of seconds after the creation of the
>         association until it expires.
>
> and the 'current' lifetime extension should be used instead. You could
> try the following patch against 0.8.0 to see if this helps:

Unfortunately, no improvement. Exactly the same result as before: the
updated policy gets deleted when the old policies are cleared by DPD.

I must admit I do not quite understand why using the 'current' lifetime
instead of the 'hard' one would prevent this from happening, but then
again there is a lot to the gritty details of ipsec/racoon I don't grasp
yet.

Thanks!

Ico

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Rainer W. <rwe...@mo...> - 2011-10-25 17:51:24
|
Ico Doornekamp <ra...@ze...> writes:

>> all functions dealing with 'create' timestamps use the addtime from
>> the so-called 'hard lifetime extension'. According to RFC2367, this is
>> wrong, cf
>>
>>     sadb_lifetime_addtime
>>         For CURRENT, the time, in seconds, when the
>>         association was created. For HARD and SOFT, the
>>         number of seconds after the creation of the
>>         association until it expires.
>>
>> and the 'current' lifetime extension should be used instead. You could
>> try the following patch against 0.8.0 to see if this helps:
>
> Unfortunately, no improvement. Exactly the same result as before: the
> updated policy gets deleted when the old policies are cleared by DPD.
>
> I must admit I do not quite understand why using the 'current' lifetime
> instead of the 'hard' would prevent this from happening,

In theory, because racoon tries to determine if a generated policy
should be deleted by comparing the 'created' timestamp in the policy
with the created timestamp from the corresponding ph2 SA. If the policy
was updated since the ph2 SA which is about to be deleted had been
created, eg, because a 'new VPN' was established in the mean time, the
generated policy is supposed to be kept. But for this to work, the
policy created timestamp actually needs to be updated.

Since the addtime value from the 'hard' lifetime extension is

    the number of seconds after the creation of the association
    until it expires

[http://www.faqs.org/rfcs/rfc2367.html], updating the policy timestamp
by copying this value into it cannot possibly work. The value from the
'current' lifetime extension needs to be used instead, because that is
actually the time when the corresponding 'IPsec thing' was created:

    For CURRENT, the time, in seconds, when the
    association was created.

[ditto]

I've since managed to reproduce this phenomenon here and I think I also
know how to solve it (basically, DPD failures cause purge_remote to be
called, and purge_remote calls delete_spd with a ph2 create timestamp of
0, causing the timestamp comparison to be skipped).

I'm going to do another round of tests here, though, in order to be
certain about that (the bug also affects me, at least theoretically, and
because of this, I'm going to fix it at least in the racoon code I'm
using).
|
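[Editorial note: the deletion decision described above can be condensed into a small sketch. This is a simplified model of the behaviour discussed in the thread, not racoon's actual delete_spd function; the function name and the 0-means-skip convention are taken from the discussion.]

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of the policy-deletion decision: a generated
 * policy should survive the purge of a stale ph2 SA if the policy
 * was (re)created after that SA was established. A ph2 'created'
 * value of 0 means "unknown"; the comparison is then skipped and
 * the policy is deleted unconditionally -- which is what happens
 * on the DPD path via purge_remote. Returns nonzero if the policy
 * gets deleted. */
static int policy_gets_deleted(uint64_t policy_created, uint64_t ph2_created)
{
	if (ph2_created == 0)
		return 1;	/* comparison skipped: always delete */

	/* keep the policy only if it is newer than the SA being purged */
	return policy_created <= ph2_created;
}
```

So a policy refreshed at t=200 survives purging an SA created at t=100, but on the DPD path before the follow-up patch, purge_remote passes 0 for the ph2 timestamp and the refreshed policy is deleted anyway.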
From: Rainer W. <rwe...@mo...> - 2011-10-25 19:19:26
|
Rainer Weikusat <rwe...@mo...> writes:

[...]

> I've since managed to reproduce this phenomenon here and I think I
> also know how to solve it (basically, DPD failures cause purge_remote
> to be called and purge_remote calls delete_spd with a ph2 create
> timestamp of 0, causing the timestamp comparison to be skipped).
>
> I'm going to do another round of tests here, though, in order to be
> certain about that

The following two additional changes fix this for me (in the sense that
an iPhone going from WiFi to 3G and establishing a new VPN using the
same ph2 ID as the now stale one, without tearing the old tunnel down,
doesn't have its updated policy deleted when DPD kicks in for the stale
tunnel, provided the 'CURRENT' lifetime is used; with the 'HARD'
lifetime, the updated policy is still deleted).

These are two changes: the first adds support for 'created' timestamp
comparison to purge_remote, and the second replaces another use of the
HARD lifetime, which should very likely have been CURRENT, with that.

NB: This has been tested with 0.7.3 and the scenario above. I've tested
that the 0.8.0 change compiles. While I'm reasonably confident that this
is correct, I've been wrong in the past, and if this causes the daemon
to impregnate someone else's underage daughter, I'm going to disclaim
any responsibility for that.

-----------------------------
--- ipsec-tools-0.8.0.patched/src/racoon/isakmp.c	2011-10-25 14:37:28.070736240 +0100
+++ ipsec-tools-0.8.0.patched-more//src/racoon/isakmp.c	2011-10-25 19:48:31.000000000 +0100
@@ -3252,10 +3252,12 @@ purge_remote(iph1)
 	vchar_t *buf = NULL;
 	struct sadb_msg *msg, *next, *end;
 	struct sadb_sa *sa;
+	struct sadb_lifetime *lt;
 	struct sockaddr *src, *dst;
 	caddr_t mhp[SADB_EXT_MAX + 1];
 	u_int proto_id;
 	struct ph2handle *iph2;
+	u_int64_t created;
 	struct ph1handle *new_iph1;
 
 	plog(LLV_INFO, LOCATION, NULL,
@@ -3308,6 +3310,11 @@ purge_remote(iph1)
 		pk_fixup_sa_addresses(mhp);
 		src = PFKEY_ADDR_SADDR(mhp[SADB_EXT_ADDRESS_SRC]);
 		dst = PFKEY_ADDR_SADDR(mhp[SADB_EXT_ADDRESS_DST]);
+		lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
+		if(lt != NULL)
+			created = lt->sadb_lifetime_addtime;
+		else
+			created = 0;
 
 		if (sa->sadb_sa_state != SADB_SASTATE_LARVAL &&
 		    sa->sadb_sa_state != SADB_SASTATE_MATURE &&
@@ -3373,7 +3380,7 @@ purge_remote(iph1)
 
 		/* delete a relative phase 2 handle. */
 		if (iph2 != NULL) {
-			delete_spd(iph2, 0);
+			delete_spd(iph2, created);
 			remph2(iph2);
 			delph2(iph2);
 		}
diff -prNu ipsec-tools-0.8.0.patched/src/racoon/isakmp_inf.c ipsec-tools-0.8.0.patched-more//src/racoon/isakmp_inf.c
--- ipsec-tools-0.8.0.patched/src/racoon/isakmp_inf.c	2011-10-25 14:37:27.334827826 +0100
+++ ipsec-tools-0.8.0.patched-more//src/racoon/isakmp_inf.c	2011-10-25 19:46:11.000000000 +0100
@@ -1158,7 +1158,7 @@ purge_ipsec_spi(dst0, proto, spi, n)
 		pk_fixup_sa_addresses(mhp);
 		src = PFKEY_ADDR_SADDR(mhp[SADB_EXT_ADDRESS_SRC]);
 		dst = PFKEY_ADDR_SADDR(mhp[SADB_EXT_ADDRESS_DST]);
-		lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_HARD];
+		lt = (struct sadb_lifetime*)mhp[SADB_EXT_LIFETIME_CURRENT];
 		if(lt != NULL)
 			created = lt->sadb_lifetime_addtime;
 		else
|
From: Ico D. <ra...@ze...> - 2011-10-25 18:44:05
|
* On Tue Oct 25 19:51:01 +0200 2011, Rainer Weikusat wrote:

> Ico Doornekamp <ra...@ze...> writes:
> >> all functions dealing with 'create' timestamps use the addtime from
> >> the so-called 'hard lifetime extension'. According to RFC2367, this is
> >> wrong, cf
> >>
> >> and the 'current' lifetime extension should be used instead. You could
> >> try the following patch against 0.8.0 to see if this helps:
> >
> > I must admit I do not quite understand why using the 'current' lifetime
> > instead of the 'hard' would prevent this from happening,
>
> In theory, because racoon tries to determine if a generated policy
> should be deleted by comparing the 'created' timestamp in the policy
> with the created timestamp from the corresponding ph2 SA. If the
> policy was updated since the ph2 SA which is about to be deleted had
> been created, eg, because a 'new VPN' was established in the mean
> time, the generated policy is supposed to be kept. But for this to
> work, the policy created timestamp actually needs to be updated.

Ok, that makes sense, thanks for providing the details.

> I've since managed to reproduce this phenomenon here

Great

> and I think I also know how to solve it

Even greater!

> (basically, DPD failures cause purge_remote to be called and
> purge_remote calls delete_spd with a ph2 create timestamp of 0,
> causing the timestamp comparison to be skipped).
>
> I'm going to do another round of tests here, though, in order to be
> certain about that (bug also affects me, at least theoretically, and
> because of this, I'm going to fix it at least in the racoon code I'm
> using).

Ok, thanks again for your time,

Ico

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: Ico D. <ra...@ze...> - 2011-10-26 12:03:47
|
Hi Rainer,

* On Tue Oct 25 21:19:10 +0200 2011, Rainer Weikusat wrote:

> NB: This has been tested with 0.7.3 and the scenario above. I've
> tested that the 0.8.0 change compiles. While I'm reasonably confident
> that this is correct, I've been wrong in the past, and if this causes
> the daemon to impregnate someone else's underage daughter, I'm going
> to disclaim any responsibility for that.

Well, my daughter is still perfectly fine - I'm very happy she is at her
age - and my DPD troubles are gone, what else could I ask for!

I've successfully applied your patch to 0.8.0, and the first few tests
show that the behaviour is as expected now: the policies are kept alive
as they should be, and the new connection is functional after DPD.

Is there anything I can do to propose this patch for inclusion in
racoon, or will someone pick this up from our discussion?

Thanks again for your time.

--
:wq ^X^Cy^K^X^C^C^C^C
|
From: VANHULLEBUS Y. <va...@fr...> - 2011-10-26 13:09:22
|
On Wed, Oct 26, 2011 at 02:03:38PM +0200, Ico Doornekamp wrote:

> Hi Rainer,

Hi all.

> * On Tue Oct 25 21:19:10 +0200 2011, Rainer Weikusat wrote:
>
> > NB: This has been tested with 0.7.3 and the scenario above. I've
> > tested that the 0.8.0 change compiles. While I'm reasonably confident
> > that this is correct, I've been wrong in the past, and if this causes
> > the daemon to impregnate someone else's underage daughter, I'm going
> > to disclaim any responsibility for that.
>
> Well, my daughter is still perfectly fine - I'm very happy she is at her
> age - and my DPD troubles are gone, what else could I ask for!
>
> I've successfully applied your patch to 0.8.0, and the first few tests
> show that the behaviour is as expected now: the policies are kept alive
> as they should, and the new connection is functional after DPD.
>
> Is there anything I can do to propose this patch for inclusion in
> racoon, or will someone pick this up from our discussion ?

Just send me a reminder if you don't see a commit or further discussion
on the subject in the next few weeks :-)

Yvan.
|