From: Guennadi L. <gl...@ds...> - 2007-02-23 16:04:38
|
Hello all I am testing fast IrDA communication with a PXA270 CPU using IrNET + ppp stack with a verified reliable partner. PXA270 is running under 2.6.18-rc2 kernel. Normally I can establish a connection exchange data. In most cases I can also break the visible connection, see how the stack switches down to 9600 baud (the PXA270 is doing discovery), then I open the connection again, they negotiate to 4mbps again, re-establish ppp, and everything works again. However, sometimes this doesn't happen. I.e., they see each other at 9600 baud, I can see their records appear in /proc/net/irda/discovery, but they never switch to 4mbps. Another possible scenario is that they do switch to 4mbps, but the connection cannot be esrablished, i.e., no pings or any other IP packets can be exchanged. Interesting also, if I "kill -HUP" the pppd on ANY of the 2 sides, it switches to 4mbps and either re-establishes the connections or hangs at 4mbps further... Here is irdadump output on PXA270 side: 1) normal 9600 discovery: 15:11:41.049381 xid:cmd 1e91de63 > ffffffff S=6 s=2 (14) 15:11:41.169422 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) 15:11:41.250715 xid:rsp 1e91de63 < 39705d32 S=6 s=3 testhost hint=4400 [ Computer LAN Access ] (25) 15:11:41.289461 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) 15:11:41.409389 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) 15:11:41.490720 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) 15:11:41.529394 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) that's all what happens when they hang. 2) good case when they re-establish the connection: 15:11:43.839447 xid:cmd 1e91de63 > ffffffff S=6 s=0 (14) 15:11:43.959380 xid:cmd 1e91de63 > ffffffff S=6 s=1 (14) 15:11:44.079441 xid:cmd 1e91de63 > ffffffff S=6 s=2 (14) 15:11:44.199381 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) 15:11:44.280725 xid:rsp 1e91de63 < 39705d32 S=6 s=3 testhost hint=4400 [ Computer LAN Access ] (25) 15:11:44.319454 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) 15:11:44.439449 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) 15:11:44.520809 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) 15:11:44.559390 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) 15:11:44.753903 snrm:cmd ca=fe pf=1 1e91de63 < 39705d32 new-ca=a8 (33) 15:11:44.754050 ua:rsp ca=a8 pf=1 1e91de63 > 39705d32 (32) 15:11:44.819610 rr:cmd < ca=a8 pf=1 nr=0 (2) 15:11:44.819709 rr:rsp > ca=a8 pf=1 nr=0 (2) 15:11:44.822145 i:cmd < ca=a8 pf=1 nr=0 ns=0 LM slsap=2a dlsap=00 CONN_CMD (6) 15:11:44.822305 i:rsp > ca=a8 pf=1 nr=1 ns=0 LM slsap=00 dlsap=2a CONN_RSP (6) 15:11:44.824754 i:cmd < ca=a8 pf=1 nr=1 ns=1 LM slsap=2a dlsap=00 GET_VALUE_BY_CLASS: "IrNetv1" "IrDA:TinyTP:LsapSel" (33) 15:11:44.824867 i:rsp > ca=a8 pf=1 nr=2 ns=1 LM slsap=00 dlsap=2a GET_VALUE_BY_CLASS: Success Integer: 10 (15) 15:11:44.827461 i:cmd < ca=a8 pf=0 nr=2 ns=2 LM slsap=2a dlsap=00 DISC (6) 15:11:44.827831 i:cmd < ca=a8 pf=1 nr=2 ns=3 LM slsap=2c dlsap=10 CONN_CMD TTP credits=0(12) 15:11:44.827978 i:rsp > ca=a8 pf=1 nr=4 ns=2 LM slsap=10 dlsap=2c CONN_RSP TTP credits=0(12) 15:11:44.830523 i:cmd < ca=a8 pf=1 nr=3 ns=4 LM slsap=2c dlsap=10 TTP credits=0 (25) 15:11:44.830619 rr:rsp > ca=a8 pf=1 nr=5 (2) 15:11:44.832990 rr:cmd < ca=a8 pf=1 nr=3 (2) 3) killing pppd on the "testhost" side: 15:12:53.049399 xid:cmd 1e91de63 > ffffffff S=6 s=0 (14) 15:12:53.169443 xid:cmd 1e91de63 > ffffffff S=6 s=1 (14) 15:12:53.289383 xid:cmd 1e91de63 > ffffffff S=6 s=2 (14) 15:12:53.409437 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) 15:12:53.529441 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) 15:12:53.610855 xid:rsp 1e91de63 < 39705d32 S=6 s=4 testhost hint=4400 [ Computer LAN Access ] (25) 15:12:53.649450 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) 15:12:53.730788 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) 15:12:53.769386 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) (kill -HUP `pidof pppd`) 15:12:55.990454 xid:cmd ffffffff < 39705d32 S=6 s=0 (14) 15:12:56.073363 xid:cmd ffffffff < 39705d32 S=6 s=1 (14) 15:12:56.163314 xid:cmd ffffffff < 39705d32 S=6 s=2 (14) 15:12:56.163412 xid:rsp 1e91de63 > 39705d32 S=6 s=2 testhost hint=4400 [ Computer LAN Access ] (24) 15:12:56.253270 xid:cmd ffffffff < 39705d32 S=6 s=3 (14) 15:12:56.343319 xid:cmd ffffffff < 39705d32 S=6 s=4 (14) 15:12:56.433313 xid:cmd ffffffff < 39705d32 S=6 s=5 (14) 15:12:56.554506 xid:cmd ffffffff < 39705d32 S=6 s=* testhost hint=4400 [ Computer LAN Access ] (25) 15:12:56.585843 snrm:cmd ca=fe pf=1 1e91de63 < 39705d32 new-ca=a8 (33) 15:12:56.585988 ua:rsp ca=a8 pf=1 1e91de63 > 39705d32 (32) 15:13:00.162530 snrm:cmd ca=fe pf=1 1e91de63 < 39705d32 new-ca=a8 (33) 15:13:00.162681 ua:rsp ca=a8 pf=1 1e91de63 > 39705d32 (32) 15:13:00.228203 rr:cmd < ca=a8 pf=1 nr=0 (2) 15:13:00.228303 rr:rsp > ca=a8 pf=1 nr=0 (2) 15:13:00.230692 i:cmd < ca=a8 pf=1 nr=0 ns=0 LM slsap=39 dlsap=00 CONN_CMD (6) 15:13:00.230859 i:rsp > ca=a8 pf=1 nr=1 ns=0 LM slsap=00 dlsap=39 CONN_RSP (6) 15:13:00.233216 i:cmd < ca=a8 pf=1 nr=1 ns=1 LM slsap=39 dlsap=00 GET_VALUE_BY_CLASS: "IrNetv1" "IrDA:TinyTP:LsapSel" (33) 15:13:00.233360 i:rsp > ca=a8 pf=1 nr=2 ns=1 LM slsap=00 dlsap=39 GET_VALUE_BY_CLASS: Success Integer: 10 (15) 15:13:00.235850 i:cmd < ca=a8 pf=0 nr=2 ns=2 LM slsap=39 dlsap=00 DISC (6) 15:13:00.236221 i:cmd < ca=a8 pf=1 nr=2 ns=3 LM slsap=3b dlsap=10 CONN_CMD TTP credits=0(12) 15:13:00.236395 i:rsp > ca=a8 pf=1 nr=4 ns=2 LM slsap=10 dlsap=3b CONN_RSP TTP credits=0(12) 15:13:00.238885 i:cmd < ca=a8 pf=0 nr=3 ns=4 LM slsap=3b dlsap=10 TTP credits=0 (25) 15:13:00.239283 i:cmd < ca=a8 pf=1 nr=3 ns=5 LM slsap=3b dlsap=10 TTP credits=0 (25) 15:13:00.239443 rr:rsp > ca=a8 pf=1 nr=6 (2) 15:13:00.241686 rr:cmd < ca=a8 pf=1 nr=3 (2) 4) killing pppd on the pxa270 side: (kill -HUP `pidof pppd`) 15:22:45.091392 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) 15:22:45.211404 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) 15:22:45.292828 xid:rsp 1e91de63 < 39705d32 S=6 s=4 testhost hint=4400 [ Computer LAN Access ] (25) 15:22:45.331514 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) 15:22:45.412893 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) 15:22:45.451519 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) 15:22:45.975735 snrm:cmd ca=fe pf=1 1e91de63 > 39705d32 new-ca=a8 (33) 15:22:46.084195 ua:rsp ca=a8 pf=1 1e91de63 < 39705d32 (32) 15:22:46.096866 rr:cmd > ca=a8 pf=1 nr=0 (2) 15:22:46.098165 rr:rsp < ca=a8 pf=1 nr=0 (2) 15:22:46.098266 i:cmd > ca=a8 pf=1 nr=0 ns=0 LM slsap=14 dlsap=00 CONN_CMD (6) 15:22:46.100781 i:rsp < ca=a8 pf=1 nr=1 ns=0 LM slsap=00 dlsap=14 CONN_RSP (6) 15:22:46.100909 i:cmd > ca=a8 pf=1 nr=1 ns=1 LM slsap=14 dlsap=00 GET_VALUE_BY_CLASS: "IrNetv1" "IrDA:TinyTP:LsapSel" (33) 15:22:46.103456 i:rsp < ca=a8 pf=1 nr=2 ns=1 LM slsap=00 dlsap=14 GET_VALUE_BY_CLASS: Success Integer: 10 (15) 15:22:46.103662 i:cmd > ca=a8 pf=0 nr=2 ns=2 LM slsap=14 dlsap=00 DISC (6) 15:22:46.104883 i:cmd > ca=a8 pf=1 nr=2 ns=3 LM slsap=16 dlsap=10 CONN_CMD TTP credits=0(12) 15:22:46.106456 i:rsp < ca=a8 pf=1 nr=4 ns=2 LM slsap=10 dlsap=16 CONN_RSP TTP credits=0(12) 15:22:46.106636 i:cmd > ca=a8 pf=1 nr=3 ns=4 LM slsap=16 dlsap=10 TTP credits=0 (25) 15:22:46.109019 rr:rsp < ca=a8 pf=1 nr=5 (2) 15:22:46.109112 rr:cmd > ca=a8 pf=1 nr=3 (2) 15:22:46.111557 i:rsp < ca=a8 pf=1 nr=5 ns=3 LM slsap=10 dlsap=16 TTP credits=1 (13) 15:22:46.111630 rr:cmd > ca=a8 pf=1 nr=4 (2) So, looks like in problem cases nobody just cares to send the "snrm" command like 15:22:45.975735 snrm:cmd ca=fe pf=1 1e91de63 > 39705d32 new-ca=a8 (33) which, probably, is a parameter update command, right? And killing pppd makes the respective host send such a command, which, however, is also not quite successful. What can be the problem? Last kernel known to work was 2.4.21:-) Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany |
From: Guennadi L. <gl...@ds...> - 2007-02-27 13:26:55
|
Hi all, (Sorry, Jean, for adding you to the cc:, I think, most changes to IrDA protocol layers under 2.6 happened as you were the maintainer of the code, if I am wrong, just ignore the email.) The problem I described in the original email is still there and digging through 2.4 and 2.6 IrDA protocol layers will take me alone a LONG time, hence another email. Meanwhile I verified that the problem doesn't exist when two 2.4.21 (heavily patched with backported from 2.5 irda patches from Jean) hosts are tested against each other. The easiest test case is if you put a 2.6 FIr host against a 2.4 partner (2.6 will be the same, I guess, and FIr is not necessary either) start irnet and ppp on both with ppp "persist" option, and "passive" on one of the hosts (2.4, if you are using it) but don't let them see each other for a few (~20) seconds. Let irdadump run on one of the hosts. If they then see each other, the discovery is successful, but no "snrm" frame is sent and, hence, no ppp. If, however, they see each other "soon enough" after ppp startup, the connection is established. Not directly related - shouldn't an event be generated on successful discovery for udev? Is one generated on ethernet plug-in?... Thanks Guennadi On Fri, 23 Feb 2007, Guennadi Liakhovetski wrote: > Hello all > > I am testing fast IrDA communication with a PXA270 CPU using IrNET + ppp > stack with a verified reliable partner. PXA270 is running under 2.6.18-rc2 > kernel. Normally I can establish a connection exchange data. In most cases > I can also break the visible connection, see how the stack switches down > to 9600 baud (the PXA270 is doing discovery), then I open the connection > again, they negotiate to 4mbps again, re-establish ppp, and everything > works again. > > However, sometimes this doesn't happen. I.e., they see each other at 9600 > baud, I can see their records appear in /proc/net/irda/discovery, but they > never switch to 4mbps. > > Another possible scenario is that they do switch to 4mbps, but the > connection cannot be established, i.e., no pings or any other IP packets > can be exchanged. > > Interesting also, if I "kill -HUP" the pppd on ANY of the 2 sides, it > switches to 4mbps and either re-establishes the connections or hangs at > 4mbps further... > > Here is irdadump output on PXA270 side: > > 1) normal 9600 discovery: > > 15:11:41.049381 xid:cmd 1e91de63 > ffffffff S=6 s=2 (14) > 15:11:41.169422 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) > 15:11:41.250715 xid:rsp 1e91de63 < 39705d32 S=6 s=3 testhost hint=4400 [ Computer LAN Access ] (25) > 15:11:41.289461 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) > 15:11:41.409389 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) > 15:11:41.490720 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) > 15:11:41.529394 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) > > that's all what happens when they hang. > > 2) good case when they re-establish the connection: > > 15:11:43.839447 xid:cmd 1e91de63 > ffffffff S=6 s=0 (14) > 15:11:43.959380 xid:cmd 1e91de63 > ffffffff S=6 s=1 (14) > 15:11:44.079441 xid:cmd 1e91de63 > ffffffff S=6 s=2 (14) > 15:11:44.199381 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) > 15:11:44.280725 xid:rsp 1e91de63 < 39705d32 S=6 s=3 testhost hint=4400 [ Computer LAN Access ] (25) > 15:11:44.319454 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) > 15:11:44.439449 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) > 15:11:44.520809 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) > 15:11:44.559390 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) > 15:11:44.753903 snrm:cmd ca=fe pf=1 1e91de63 < 39705d32 new-ca=a8 (33) > 15:11:44.754050 ua:rsp ca=a8 pf=1 1e91de63 > 39705d32 (32) > 15:11:44.819610 rr:cmd < ca=a8 pf=1 nr=0 (2) > 15:11:44.819709 rr:rsp > ca=a8 pf=1 nr=0 (2) > 15:11:44.822145 i:cmd < ca=a8 pf=1 nr=0 ns=0 LM slsap=2a dlsap=00 CONN_CMD (6) > 15:11:44.822305 i:rsp > ca=a8 pf=1 nr=1 ns=0 LM slsap=00 dlsap=2a CONN_RSP (6) > 15:11:44.824754 i:cmd < ca=a8 pf=1 nr=1 ns=1 LM slsap=2a dlsap=00 GET_VALUE_BY_CLASS: "IrNetv1" "IrDA:TinyTP:LsapSel" (33) > 15:11:44.824867 i:rsp > ca=a8 pf=1 nr=2 ns=1 LM slsap=00 dlsap=2a GET_VALUE_BY_CLASS: Success Integer: 10 (15) > 15:11:44.827461 i:cmd < ca=a8 pf=0 nr=2 ns=2 LM slsap=2a dlsap=00 DISC (6) > 15:11:44.827831 i:cmd < ca=a8 pf=1 nr=2 ns=3 LM slsap=2c dlsap=10 CONN_CMD TTP credits=0(12) > 15:11:44.827978 i:rsp > ca=a8 pf=1 nr=4 ns=2 LM slsap=10 dlsap=2c CONN_RSP TTP credits=0(12) > 15:11:44.830523 i:cmd < ca=a8 pf=1 nr=3 ns=4 LM slsap=2c dlsap=10 TTP credits=0 (25) > 15:11:44.830619 rr:rsp > ca=a8 pf=1 nr=5 (2) > 15:11:44.832990 rr:cmd < ca=a8 pf=1 nr=3 (2) > > 3) killing pppd on the "testhost" side: > > 15:12:53.049399 xid:cmd 1e91de63 > ffffffff S=6 s=0 (14) > 15:12:53.169443 xid:cmd 1e91de63 > ffffffff S=6 s=1 (14) > 15:12:53.289383 xid:cmd 1e91de63 > ffffffff S=6 s=2 (14) > 15:12:53.409437 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) > 15:12:53.529441 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) > 15:12:53.610855 xid:rsp 1e91de63 < 39705d32 S=6 s=4 testhost hint=4400 [ Computer LAN Access ] (25) > 15:12:53.649450 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) > 15:12:53.730788 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) > 15:12:53.769386 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) > > (kill -HUP `pidof pppd`) > > 15:12:55.990454 xid:cmd ffffffff < 39705d32 S=6 s=0 (14) > 15:12:56.073363 xid:cmd ffffffff < 39705d32 S=6 s=1 (14) > 15:12:56.163314 xid:cmd ffffffff < 39705d32 S=6 s=2 (14) > 15:12:56.163412 xid:rsp 1e91de63 > 39705d32 S=6 s=2 testhost hint=4400 [ Computer LAN Access ] (24) > 15:12:56.253270 xid:cmd ffffffff < 39705d32 S=6 s=3 (14) > 15:12:56.343319 xid:cmd ffffffff < 39705d32 S=6 s=4 (14) > 15:12:56.433313 xid:cmd ffffffff < 39705d32 S=6 s=5 (14) > 15:12:56.554506 xid:cmd ffffffff < 39705d32 S=6 s=* testhost hint=4400 [ Computer LAN Access ] (25) > 15:12:56.585843 snrm:cmd ca=fe pf=1 1e91de63 < 39705d32 new-ca=a8 (33) > 15:12:56.585988 ua:rsp ca=a8 pf=1 1e91de63 > 39705d32 (32) > 15:13:00.162530 snrm:cmd ca=fe pf=1 1e91de63 < 39705d32 new-ca=a8 (33) > 15:13:00.162681 ua:rsp ca=a8 pf=1 1e91de63 > 39705d32 (32) > 15:13:00.228203 rr:cmd < ca=a8 pf=1 nr=0 (2) > 15:13:00.228303 rr:rsp > ca=a8 pf=1 nr=0 (2) > 15:13:00.230692 i:cmd < ca=a8 pf=1 nr=0 ns=0 LM slsap=39 dlsap=00 CONN_CMD (6) > 15:13:00.230859 i:rsp > ca=a8 pf=1 nr=1 ns=0 LM slsap=00 dlsap=39 CONN_RSP (6) > 15:13:00.233216 i:cmd < ca=a8 pf=1 nr=1 ns=1 LM slsap=39 dlsap=00 GET_VALUE_BY_CLASS: "IrNetv1" "IrDA:TinyTP:LsapSel" (33) > 15:13:00.233360 i:rsp > ca=a8 pf=1 nr=2 ns=1 LM slsap=00 dlsap=39 GET_VALUE_BY_CLASS: Success Integer: 10 (15) > 15:13:00.235850 i:cmd < ca=a8 pf=0 nr=2 ns=2 LM slsap=39 dlsap=00 DISC (6) > 15:13:00.236221 i:cmd < ca=a8 pf=1 nr=2 ns=3 LM slsap=3b dlsap=10 CONN_CMD TTP credits=0(12) > 15:13:00.236395 i:rsp > ca=a8 pf=1 nr=4 ns=2 LM slsap=10 dlsap=3b CONN_RSP TTP credits=0(12) > 15:13:00.238885 i:cmd < ca=a8 pf=0 nr=3 ns=4 LM slsap=3b dlsap=10 TTP credits=0 (25) > 15:13:00.239283 i:cmd < ca=a8 pf=1 nr=3 ns=5 LM slsap=3b dlsap=10 TTP credits=0 (25) > 15:13:00.239443 rr:rsp > ca=a8 pf=1 nr=6 (2) > 15:13:00.241686 rr:cmd < ca=a8 pf=1 nr=3 (2) > > 4) killing pppd on the pxa270 side: > > (kill -HUP `pidof pppd`) > > 15:22:45.091392 xid:cmd 1e91de63 > ffffffff S=6 s=3 (14) > 15:22:45.211404 xid:cmd 1e91de63 > ffffffff S=6 s=4 (14) > 15:22:45.292828 xid:rsp 1e91de63 < 39705d32 S=6 s=4 testhost hint=4400 [ Computer LAN Access ] (25) > 15:22:45.331514 xid:cmd 1e91de63 > ffffffff S=6 s=5 (14) > 15:22:45.412893 xid:rsp 1e91de63 < 39705d32 S=6 s=5 testhost hint=4400 [ Computer LAN Access ] (25) > 15:22:45.451519 xid:cmd 1e91de63 > ffffffff S=6 s=* pxa270 hint=4400 [ Computer LAN Access ] (24) > 15:22:45.975735 snrm:cmd ca=fe pf=1 1e91de63 > 39705d32 new-ca=a8 (33) > 15:22:46.084195 ua:rsp ca=a8 pf=1 1e91de63 < 39705d32 (32) > 15:22:46.096866 rr:cmd > ca=a8 pf=1 nr=0 (2) > 15:22:46.098165 rr:rsp < ca=a8 pf=1 nr=0 (2) > 15:22:46.098266 i:cmd > ca=a8 pf=1 nr=0 ns=0 LM slsap=14 dlsap=00 CONN_CMD (6) > 15:22:46.100781 i:rsp < ca=a8 pf=1 nr=1 ns=0 LM slsap=00 dlsap=14 CONN_RSP (6) > 15:22:46.100909 i:cmd > ca=a8 pf=1 nr=1 ns=1 LM slsap=14 dlsap=00 GET_VALUE_BY_CLASS: "IrNetv1" "IrDA:TinyTP:LsapSel" (33) > 15:22:46.103456 i:rsp < ca=a8 pf=1 nr=2 ns=1 LM slsap=00 dlsap=14 GET_VALUE_BY_CLASS: Success Integer: 10 (15) > 15:22:46.103662 i:cmd > ca=a8 pf=0 nr=2 ns=2 LM slsap=14 dlsap=00 DISC (6) > 15:22:46.104883 i:cmd > ca=a8 pf=1 nr=2 ns=3 LM slsap=16 dlsap=10 CONN_CMD TTP credits=0(12) > 15:22:46.106456 i:rsp < ca=a8 pf=1 nr=4 ns=2 LM slsap=10 dlsap=16 CONN_RSP TTP credits=0(12) > 15:22:46.106636 i:cmd > ca=a8 pf=1 nr=3 ns=4 LM slsap=16 dlsap=10 TTP credits=0 (25) > 15:22:46.109019 rr:rsp < ca=a8 pf=1 nr=5 (2) > 15:22:46.109112 rr:cmd > ca=a8 pf=1 nr=3 (2) > 15:22:46.111557 i:rsp < ca=a8 pf=1 nr=5 ns=3 LM slsap=10 dlsap=16 TTP credits=1 (13) > 15:22:46.111630 rr:cmd > ca=a8 pf=1 nr=4 (2) > > So, looks like in problem cases nobody just cares to send the "snrm" > command like > > 15:22:45.975735 snrm:cmd ca=fe pf=1 1e91de63 > 39705d32 new-ca=a8 (33) > > which, probably, is a parameter update command, right? And killing pppd > makes the respective host send such a command, which, however, is also not > quite successful. > > What can be the problem? > Last kernel known to work was 2.4.21:-) > > Thanks > Guennadi > --------------------------------- > Guennadi Liakhovetski, Ph.D. > DSA Daten- und Systemtechnik GmbH > Pascalstr. 28 > D-52076 Aachen > Germany > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys-and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > irda-users mailing list > ird...@li... > http://lists.sourceforge.net/lists/listinfo/irda-users > > --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany |
From: Samuel O. <sa...@so...> - 2007-02-27 14:10:12
|
Hi Guennadi, On Fri, Feb 23, 2007 at 05:04:24PM +0100, Guennadi Liakhovetski wrote: > > So, looks like in problem cases nobody just cares to send the "snrm" > command like > > 15:22:45.975735 snrm:cmd ca=fe pf=1 1e91de63 > 39705d32 new-ca=a8 (33) > > which, probably, is a parameter update command, right? It's an SNRM frame without negociation parameters, which means someone is trying to reset the connection. Makes sense. > And killing pppd > makes the respective host send such a command, which, however, is also not > quite successful. Most likely, yes. > What can be the problem? Hard to tell with only the irdadumps. The usual request: could you send the dmesg or the /var/log/kern.log output with IrDA debug set to 5, during one of those connection reset failures ? Also, irdadump -b would be nice in order to get the raw packet output and avoid any IrDA bug. > Last kernel known to work was 2.4.21:-) Nice, this is going to be fun :-) Cheers, Samuel. > Thanks > Guennadi > --------------------------------- > Guennadi Liakhovetski, Ph.D. > DSA Daten- und Systemtechnik GmbH > Pascalstr. 28 > D-52076 Aachen > Germany > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys-and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > irda-users mailing list > ird...@li... > http://lists.sourceforge.net/lists/listinfo/irda-users |
From: Samuel O. <sa...@so...> - 2007-02-27 14:10:12
|
Hi Guennadi, On Fri, Feb 23, 2007 at 05:04:24PM +0100, Guennadi Liakhovetski wrote: > > So, looks like in problem cases nobody just cares to send the "snrm" > command like > > 15:22:45.975735 snrm:cmd ca=fe pf=1 1e91de63 > 39705d32 new-ca=a8 (33) > > which, probably, is a parameter update command, right? It's an SNRM frame without negociation parameters, which means someone is trying to reset the connection. Makes sense. > And killing pppd > makes the respective host send such a command, which, however, is also not > quite successful. Most likely, yes. > What can be the problem? Hard to tell with only the irdadumps. The usual request: could you send the dmesg or the /var/log/kern.log output with IrDA debug set to 5, during one of those connection reset failures ? Also, irdadump -b would be nice in order to get the raw packet output and avoid any IrDA bug. > Last kernel known to work was 2.4.21:-) Nice, this is going to be fun :-) Cheers, Samuel. |
From: Guennadi L. <gl...@ds...> - 2007-02-27 15:11:13
|
On Tue, 27 Feb 2007, Samuel Ortiz wrote: > Hi Guennadi, Hi Samuel, > On Fri, Feb 23, 2007 at 05:04:24PM +0100, Guennadi Liakhovetski wrote: >> >> What can be the problem? > Hard to tell with only the irdadumps. > The usual request: could you send the dmesg or the /var/log/kern.log output > with IrDA debug set to 5, during one of those connection reset failures ? > Also, irdadump -b would be nice in order to get the raw packet output and > avoid any IrDA bug. Attached below. Although, I think, it will disappoint you - it looks quite normal, just that the packets one would expect (SNRM and following connection establishment) is missing. What I also have to add, /etc/ppp/options on the 2.6 side has the following in it: persist lcp-restart 1 # Set the LCP restart interval to N sec lcp-max-configure 60 # send max configs which permanently sends LCP requests and re-opens /dev/irnet every minute. Ok, I think, I know what the problem is. pppd doesn't resume sending LCP requests under 2.6. And the reason is - it sleeps in select on /dev/ppp. Under 2.4 when the connection is broken it gets EOF, wakes up and resumes sending LCPs. Under 2.6 it keeps sleeping in select - it is not woken up. So, I think, irdadump and messages are irrelevant now, still, enclosing below. Single packets broken up for readability, irdadump -x is enclosed instead of requested "-b", as this version doesn't know "-b". As you see, just normal discovery. Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany # cat /var/log/messages user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=2, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=3, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=4, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=5, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=255, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irda_device_set_media_busy(FALSE) user.debug kernel: irlap_do_event(), event = MEDIA_BUSY_TIMER_EXPIRED, state = LAP_NDM user.debug kernel: irlmp_link_discovery_confirm() user.debug kernel: irlmp_add_discovery_log() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: irlmp_discovery_confirm() user.debug kernel: irlmp_discovery_timer_expired() user.debug kernel: irlmp_expire_discoveries() user.debug kernel: irlmp_do_lap_event(), EVENT = LM_LAP_DISCOVERY_REQUEST, STATE = LAP_STANDBY user.debug kernel: irlmp_state_standby() user.debug kernel: irlap_discovery_request(), nslots = 6 user.debug kernel: irlap_do_event(), event = DISCOVERY_REQUEST, state = LAP_NDM user.debug kernel: irlap_send_discovery_xid_frame(), s=0, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=1, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=2, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=3, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=4, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=5, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=255, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irda_device_set_media_busy(FALSE) user.debug kernel: irlap_do_event(), event = MEDIA_BUSY_TIMER_EXPIRED, state = LAP_NDM user.debug kernel: irlmp_link_discovery_confirm() user.debug kernel: irlmp_add_discovery_log() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: irlmp_discovery_confirm() user.debug kernel: irlmp_discovery_timer_expired() user.debug kernel: irlmp_expire_discoveries() user.debug kernel: irlmp_do_lap_event(), EVENT = LM_LAP_DISCOVERY_REQUEST, STATE = LAP_STANDBY user.debug kernel: irlmp_state_standby() user.debug kernel: irlap_discovery_request(), nslots = 6 user.debug kernel: irlap_do_event(), event = DISCOVERY_REQUEST, state = LAP_NDM user.debug kernel: irlap_send_discovery_xid_frame(), s=0, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=1, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=2, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=3, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=4, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=5, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irlap_do_event(), event = SLOT_TIMER_EXPIRED, state = LAP_QUERY user.debug kernel: irda_device_is_receiving() user.debug kernel: irlap_send_discovery_xid_frame(), s=255, S=6, command=1 user.debug kernel: async_wrap_skb(), xbofs=12 user.debug kernel: irda_device_set_media_busy(FALSE) user.debug kernel: irlap_do_event(), event = MEDIA_BUSY_TIMER_EXPIRED, state = LAP_NDM user.debug kernel: irlmp_link_discovery_confirm() user.debug kernel: irlmp_add_discovery_log() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: dequeue_first() user.debug kernel: irlmp_discovery_confirm() # irdadump -x 14:20:22.191893 xid:cmd ffffffff < b216357e S=6 s=0 (14) ff3f017e3516b2ffffffff010000 . ? . ~ 5 . . . . . . . . . 14:20:22.311841 xid:cmd ffffffff < b216357e S=6 s=1 (14) ff3f017e3516b2ffffffff010100 . ? . ~ 5 . . . . . . . . . 14:20:22.431890 xid:cmd ffffffff < b216357e S=6 s=2 (14) ff3f017e3516b2ffffffff010200 . ? . ~ 5 . . . . . . . . . 14:20:22.551839 xid:cmd ffffffff < b216357e S=6 s=3 (14) ff3f017e3516b2ffffffff010300 . ? . ~ 5 . . . . . . . . . 14:20:22.671863 xid:cmd ffffffff < b216357e S=6 s=4 (14) ff3f017e3516b2ffffffff010400 . ? . ~ 5 . . . . . . . . . 14:20:22.671934 xid:rsp 13948dba > b216357e S=6 s=4 FIRAP-223 hint=4400 [ Computer LAN Access ] (25) febf01ba8d94137e3516b2010400440046495241502d323233 . . . . . . . ~ 5 . . . . . D . F I R A P - 2 2 3 14:20:22.791912 xid:cmd ffffffff < b216357e S=6 s=5 (14) ff3f017e3516b2ffffffff010500 . ? . ~ 5 . . . . . . . . . 14:20:22.791989 xid:rsp 13948dba > b216357e S=6 s=5 FIRAP-223 hint=4400 [ Computer LAN Access ] (25) febf01ba8d94137e3516b2010500440046495241502d323233 . . . . . . . ~ 5 . . . . . D . F I R A P - 2 2 3 14:20:22.919177 xid:cmd ffffffff < b216357e S=6 s=* RCU2-214 hint=4400 [ Computer LAN Access ] (24) ff3f017e3516b2ffffffff01ff004400524355322d323134 . ? . ~ 5 . . . . . . . . . D . R C U 2 - 2 1 4 14:20:23.222062 xid:cmd ffffffff < b216357e S=6 s=0 (14) ff3f017e3516b2ffffffff010000 . ? . ~ 5 . . . . . . . . . 14:20:23.342003 xid:cmd ffffffff < b216357e S=6 s=1 (14) ff3f017e3516b2ffffffff010100 . ? . ~ 5 . . . . . . . . . 14:20:23.461944 xid:cmd ffffffff < b216357e S=6 s=2 (14) ff3f017e3516b2ffffffff010200 . ? . ~ 5 . . . . . . . . . 14:20:23.462025 xid:rsp 13948dba > b216357e S=6 s=2 FIRAP-223 hint=4400 [ Computer LAN Access ] (25) febf01ba8d94137e3516b2010200440046495241502d323233 . . . . . . . ~ 5 . . . . . D . F I R A P - 2 2 3 14:20:23.582013 xid:cmd ffffffff < b216357e S=6 s=3 (14) ff3f017e3516b2ffffffff010300 . ? . ~ 5 . . . . . . . . . 14:20:23.702058 xid:cmd ffffffff < b216357e S=6 s=4 (14) ff3f017e3516b2ffffffff010400 . ? . ~ 5 . . . . . . . . . 14:20:23.822103 xid:cmd ffffffff < b216357e S=6 s=5 (14) ff3f017e3516b2ffffffff010500 . ? . ~ 5 . . . . . . . . . 14:20:23.822189 xid:rsp 13948dba > b216357e S=6 s=5 FIRAP-223 hint=4400 [ Computer LAN Access ] (25) febf01ba8d94137e3516b2010500440046495241502d323233 . . . . . . . ~ 5 . . . . . D . F I R A P - 2 2 3 14:20:23.949256 xid:cmd ffffffff < b216357e S=6 s=* RCU2-214 hint=4400 [ Computer LAN Access ] (24) ff3f017e3516b2ffffffff01ff004400524355322d323134 . ? . ~ 5 . . . . . . . . . D . R C U 2 - 2 1 4 14:20:24.252133 xid:cmd ffffffff < b216357e S=6 s=0 (14) ff3f017e3516b2ffffffff010000 . ? . ~ 5 . . . . . . . . . 14:20:24.252218 xid:rsp 13948dba > b216357e S=6 s=0 FIRAP-223 hint=4400 [ Computer LAN Access ] (25) febf01ba8d94137e3516b2010000440046495241502d323233 . . . . . . . ~ 5 . . . . . D . F I R A P - 2 2 3 14:20:24.372075 xid:cmd ffffffff < b216357e S=6 s=1 (14) ff3f017e3516b2ffffffff010100 . ? . ~ 5 . . . . . . . . . 14:20:24.492128 xid:cmd ffffffff < b216357e S=6 s=2 (14) ff3f017e3516b2ffffffff010200 . ? . ~ 5 . . . . . . . . . 14:20:24.612170 xid:cmd ffffffff < b216357e S=6 s=3 (14) ff3f017e3516b2ffffffff010300 . ? . ~ 5 . . . . . . . . . 14:20:24.732216 xid:cmd ffffffff < b216357e S=6 s=4 (14) ff3f017e3516b2ffffffff010400 . ? . ~ 5 . . . . . . . . . 14:20:24.852167 xid:cmd ffffffff < b216357e S=6 s=5 (14) ff3f017e3516b2ffffffff010500 . ? . ~ 5 . . . . . . . . . 14:20:24.852248 xid:rsp 13948dba > b216357e S=6 s=5 FIRAP-223 hint=4400 [ Computer LAN Access ] (25) febf01ba8d94137e3516b2010500440046495241502d323233 . . . . . . . ~ 5 . . . . . D . F I R A P - 2 2 3 14:20:24.979441 xid:cmd ffffffff < b216357e S=6 s=* RCU2-214 hint=4400 [ Computer LAN Access ] (24) ff3f017e3516b2ffffffff01ff004400524355322d323134 . ? . ~ 5 . . . . . . . . . D . R C U 2 - 2 1 4 |
From: Guennadi L. <gl...@ds...> - 2007-02-27 16:39:51
|
On Tue, 27 Feb 2007, Guennadi Liakhovetski wrote: > Ok, I think, I know what the problem is. pppd doesn't resume sending LCP > requests under 2.6. And the reason is - it sleeps in select on /dev/ppp. > Under 2.4 when the connection is broken it gets EOF, wakes up and resumes > sending LCPs. Under 2.6 it keeps sleeping in select - it is not woken up. Yes, I was right. And here's how (diff 2.4.21 and 2.6.18): --- /tmp/irnet_irda.c-2.4.21 2007-02-27 17:34:47.000000000 +0100 +++ /tmp/irnet_irda.c-2.6.18 2007-02-27 17:35:07.000000000 +0100 @@ -1132,10 +1134,15 @@ { if(test_open) { +#ifdef MISSING_PPP_API + /* ppp_unregister_channel() wants a user context, which we + * are guaranteed to NOT have here. What are we supposed + * to do here ? Jean II */ /* If we were connected, cleanup & close the PPP channel, * which will kill pppd (hangup) and the rest */ ppp_unregister_channel(&self->chan); self->ppp_open = 0; +#endif } else { Now, Jean, why did you disable it and how can we _correctly_ fix it? It's 17:38 here, so, will continue tomorrow. Any ideas would be appreciated. Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany |
From: Samuel O. <sa...@so...> - 2007-02-27 17:03:08
|
Hi Guennadi, On Tue, Feb 27, 2007 at 05:39:41PM +0100, Guennadi Liakhovetski wrote: > On Tue, 27 Feb 2007, Guennadi Liakhovetski wrote: > > >Ok, I think, I know what the problem is. pppd doesn't resume sending LCP > >requests under 2.6. And the reason is - it sleeps in select on /dev/ppp. > >Under 2.4 when the connection is broken it gets EOF, wakes up and resumes > >sending LCPs. Under 2.6 it keeps sleeping in select - it is not woken up. > > Yes, I was right. And here's how (diff 2.4.21 and 2.6.18): > > --- /tmp/irnet_irda.c-2.4.21 2007-02-27 17:34:47.000000000 +0100 > +++ /tmp/irnet_irda.c-2.6.18 2007-02-27 17:35:07.000000000 +0100 > @@ -1132,10 +1134,15 @@ > { > if(test_open) > { > +#ifdef MISSING_PPP_API > + /* ppp_unregister_channel() wants a user context, which we > + * are guaranteed to NOT have here. What are we supposed > + * to do here ? Jean II */ > /* If we were connected, cleanup & close the PPP channel, > * which will kill pppd (hangup) and the rest */ > ppp_unregister_channel(&self->chan); > self->ppp_open = 0; > +#endif > } > else > { > > Now, Jean, why did you disable it and how can we _correctly_ fix it? It seems, as the comments says, that it's disabled because ppp_unregister_channel() must be called from process context, while the notify callback will most likely be called from interrupt context. I need to understand if there is a cleaner solution, but scheduling a BH could be helful here. > It's > 17:38 here, so, will continue tomorrow. Any ideas would be appreciated. Same here, I'll try to look at that this evening. Cheers, Samuel. > Thanks > Guennadi > --------------------------------- > Guennadi Liakhovetski, Ph.D. > DSA Daten- und Systemtechnik GmbH > Pascalstr. 28 > D-52076 Aachen > Germany |
From: Jean T. <jt...@hp...> - 2007-02-27 19:10:47
|
On Tue, Feb 27, 2007 at 05:39:41PM +0100, Guennadi Liakhovetski wrote: > On Tue, 27 Feb 2007, Guennadi Liakhovetski wrote: > > >Ok, I think, I know what the problem is. pppd doesn't resume sending LCP > >requests under 2.6. And the reason is - it sleeps in select on /dev/ppp. > >Under 2.4 when the connection is broken it gets EOF, wakes up and resumes > >sending LCPs. Under 2.6 it keeps sleeping in select - it is not woken up. > > Yes, I was right. And here's how (diff 2.4.21 and 2.6.18): > > --- /tmp/irnet_irda.c-2.4.21 2007-02-27 17:34:47.000000000 +0100 > +++ /tmp/irnet_irda.c-2.6.18 2007-02-27 17:35:07.000000000 +0100 > @@ -1132,10 +1134,15 @@ > { > if(test_open) > { > +#ifdef MISSING_PPP_API > + /* ppp_unregister_channel() wants a user context, which we > + * are guaranteed to NOT have here. What are we supposed > + * to do here ? Jean II */ > /* If we were connected, cleanup & close the PPP channel, > * which will kill pppd (hangup) and the rest */ > ppp_unregister_channel(&self->chan); > self->ppp_open = 0; > +#endif > } > else > { > > Now, Jean, why did you disable it and how can we _correctly_ fix it? It's > 17:38 here, so, will continue tomorrow. Any ideas would be appreciated. > > Thanks > Guennadi Hi, Since I had to do this hack, the overall kernel APIs have improved. Now, for such thing, you would use tasklets. This is a rough sketch of how it would work. There is a potential race condition, but I believe that in practice it would never happen (irda is quiet after disconnection, userspace won't do anything unless it receives EOF). static void irnet_ppp_disconnect_process(unsigned long data) { irnet_socket * self = (irnet_socket *) data; if(self->ppp_open && !self->ttp_open && !self->ttp_connect) { ppp_unregister_channel(&self->chan); self->ppp_open = 0; } } static DECLARE_TASKLET(irnet_ppp_disconnect_tasklet, irnet_ppp_disconnect_process, 0); [...] /* Deal with the ppp instance if it's still alive */ if(self->ppp_open) { if(test_open) { /* ppp_unregister_channel() wants a user context, which we * are guaranteed to NOT have here. That's why we use * a tasklet here... Jean II */ irnet_ppp_disconnect_tasklet->data = &self->chan; tasklet_schedule(&wireless_ppp_disconnect_tasklet); } else { Obviously, I will let you deal with this uncompiled/untested code. Good luck ;-) Now, the real question is : will it fix the original problem you are seeing ? I don't know. I would need to look deeper in the code, but as you seem to have a good handle on things, I will let you try... Have fun... Jean |
From: Samuel O. <sa...@so...> - 2007-02-28 00:38:54
|
Hi Jean, On Tue, Feb 27, 2007 at 11:08:50AM -0800, Jean Tourrilhes wrote: > Hi, > > Since I had to do this hack, the overall kernel APIs have > improved. Now, for such thing, you would use tasklets. Yes, as I said, a bottom half could be a solution. As far as I know, tasklets are running from the timer interrupt, and thus don't provide us with a process context. Please correct me if I'm wrong. I think we should use a work queue if we want to be in process context, see my patch below. > This is a rough sketch of how it would work. There is a > potential race condition, but I believe that in practice it would > never happen (irda is quiet after disconnection, userspace won't do > anything unless it receives EOF). I agree, it's theoretically racy, but shouldn't happen in practice. Some locking here could help, but I'd rather stay lockless. > static void irnet_ppp_disconnect_process(unsigned long data) > { > irnet_socket * self = (irnet_socket *) data; > if(self->ppp_open && !self->ttp_open && !self->ttp_connect) > { > ppp_unregister_channel(&self->chan); > self->ppp_open = 0; > } > } So, this one would run in atomic context while ppp_unregister_channel may sleep. So, if Jean agrees with me, Guennadi, could you please try this patch instead: --- diff --git a/net/irda/irnet/irnet.h b/net/irda/irnet/irnet.h index 873ae18..2396d46 100644 --- a/net/irda/irnet/irnet.h +++ b/net/irda/irnet/irnet.h @@ -249,6 +249,7 @@ #include <linux/poll.h> #include <linux/capability.h> #include <linux/ctype.h> /* isspace() */ +#include <linux/workqueue.h> #include <asm/uaccess.h> #include <linux/init.h> @@ -419,6 +420,7 @@ typedef struct irnet_socket u32 raccm; /* to please pppd - dummy) */ unsigned int flags; /* PPP flags (compression, ...) */ unsigned int rbits; /* Unused receive flags ??? */ + struct work_struct disconnect_work; /* ------------------------ IrTTP part ------------------------ */ /* We create a pseudo "socket" over the IrDA tranport */ diff --git a/net/irda/irnet/irnet_irda.c b/net/irda/irnet/irnet_irda.c index c378e66..0e3825d 100644 --- a/net/irda/irnet/irnet_irda.c +++ b/net/irda/irnet/irnet_irda.c @@ -10,6 +10,20 @@ #include "irnet_irda.h" /* Private header */ +static void irnet_ppp_disconnect(struct work_struct *work) +{ + irnet_socket * self = + container_of(work, irnet_socket, disconnect_work); + + if (self == NULL) + return; + + if(self->ppp_open && !self->ttp_open && !self->ttp_connect) { + ppp_unregister_channel(&self->chan); + self->ppp_open = 0; + } +} + /************************* CONTROL CHANNEL *************************/ /* * When ppp is not active, /dev/irnet act as a control channel. @@ -499,6 +513,8 @@ irda_irnet_create(irnet_socket * self) #endif /* DISCOVERY_NOMASK */ self->tx_flow = FLOW_START; /* Flow control from IrTTP */ + INIT_WORK(&self->disconnect_work, irnet_ppp_disconnect); + DEXIT(IRDA_SOCK_TRACE, "\n"); return(0); } @@ -1134,6 +1150,7 @@ irnet_disconnect_indication(void * instance, { if(test_open) { + schedule_work(&self->disconnect_work); #ifdef MISSING_PPP_API /* ppp_unregister_channel() wants a user context, which we * are guaranteed to NOT have here. What are we supposed -- This compiles on my machine, but it's now 1:40 AM, so I'll let you test it on your machines ;-) Cheers, Samuel. |
From: Jean T. <jt...@hp...> - 2007-02-28 01:27:57
|
On Wed, Feb 28, 2007 at 02:38:44AM +0200, Samuel Ortiz wrote: > Hi Jean, > > On Tue, Feb 27, 2007 at 11:08:50AM -0800, Jean Tourrilhes wrote: > > Hi, > > > > Since I had to do this hack, the overall kernel APIs have > > improved. Now, for such thing, you would use tasklets. > > Yes, as I said, a bottom half could be a solution. > As far as I know, tasklets are running from the timer interrupt, and thus > don't provide us with a process context. Please correct me if I'm wrong. > I think we should use a work queue if we want to be in process context, see > my patch below. Actually, you probably know better. Both represent about the same amount of code. > > This is a rough sketch of how it would work. There is a > > potential race condition, but I believe that in practice it would > > never happen (irda is quiet after disconnection, userspace won't do > > anything unless it receives EOF). > > I agree, it's theoretically racy, but shouldn't happen in practice. Some > locking here could help, but I'd rather stay lockless. And that's why I added the test on the various flags. > So, this one would run in atomic context while ppp_unregister_channel may > sleep. So, if Jean agrees with me, Guennadi, could you please try this patch > instead: Look good. Please carry the /* comments */ over from my patch to keep trace of why we did it this way. Have fun... Jean |
From: Guennadi L. <gl...@ds...> - 2007-02-28 08:42:57
|
Hi On Wed, 28 Feb 2007, Samuel Ortiz wrote: > As far as I know, tasklets are running from the timer interrupt, and thus > don't provide us with a process context. Please correct me if I'm wrong. > I think we should use a work queue if we want to be in process context, see > my patch below. Thanks to both of you for patches. And, yes, Samuel, your patch fixes the problem! To be more precise, it would fix it for me if I were using a modern kernel. For my 2.6.18 I had to add 1 more parameter to INIT_WORK. So, I think, you can push it upstream with the following minor nitpick: > + if(self->ppp_open && !self->ttp_open && !self->ttp_connect) { Please, add a space after if:-) And, as Jean already pointed out, please remove this #ifdefed code: > #ifdef MISSING_PPP_API > /* ppp_unregister_channel() wants a user context, which we > * are guaranteed to NOT have here. What are we supposed [...] And fix the comment. > This compiles on my machine, but it's now 1:40 AM, so I'll let you test it > on your machines ;-) Good work and thanks for a lightning-fast help! And I hope the "theretical" race you were talking about is indeed only theoretical - also on -rt kernels... What about the case, for example, if there is a connection between host "A" with ppp option "passive" and no discovery and host "B" without "passive" and active discovery the connection is disturbed, then for some time host "A" sends discovery packets, and host "B" wakes up its pppd, which starts sending "LCP ConfReq" packets. So, you can get an SNRM from host "A" at the same time as you run your workqueue, being preempted after the "if (self->ppp_open && !self->ttp_open && !self->ttp_connect))"... Is this the sort of a race you're talking about or not? If this is impossible, are we indeed safe here? ppp does lots of locking inside ppp_unregister_channel(), but I have no idea if that helps us here. Thanks Guennadi --------------------------------- Guennadi Liakhovetski, Ph.D. DSA Daten- und Systemtechnik GmbH Pascalstr. 28 D-52076 Aachen Germany |
From: Jean T. <jt...@hp...> - 2007-02-28 18:35:27
|
On Wed, Feb 28, 2007 at 09:42:50AM +0100, Guennadi Liakhovetski wrote: > Hi > > On Wed, 28 Feb 2007, Samuel Ortiz wrote: > > >As far as I know, tasklets are running from the timer interrupt, and thus > >don't provide us with a process context. Please correct me if I'm wrong. > >I think we should use a work queue if we want to be in process context, see > >my patch below. > > Thanks to both of you for patches. And, yes, Samuel, your patch fixes the > problem! To be more precise, it would fix it for me if I were using a > modern kernel. For my 2.6.18 I had to add 1 more parameter to INIT_WORK. > So, I think, you can push it upstream with the following minor nitpick: > > >+ if(self->ppp_open && !self->ttp_open && !self->ttp_connect) { > > Please, add a space after if:-) > > And, as Jean already pointed out, please remove this #ifdefed code: > > >#ifdef MISSING_PPP_API > > /* ppp_unregister_channel() wants a user context, which we > > * are guaranteed to NOT have here. What are we supposed > [...] > > And fix the comment. > > >This compiles on my machine, but it's now 1:40 AM, so I'll let you test it > >on your machines ;-) > > Good work and thanks for a lightning-fast help! > > And I hope the "theretical" race you were talking about is indeed only > theoretical - also on -rt kernels... What about the case, for example, if > there is a connection between > > host "A" with ppp option "passive" and no discovery and > host "B" without "passive" and active discovery > > the connection is disturbed, then for some time host "A" sends discovery > packets, and host "B" wakes up its pppd, which starts sending "LCP > ConfReq" packets. So, you can get an SNRM from host "A" at the same time > as you run your workqueue, being preempted after the "if (self->ppp_open > && !self->ttp_open && !self->ttp_connect))"... Is this the sort of a race > you're talking about or not? If this is impossible, are we indeed safe > here? ppp does lots of locking inside ppp_unregister_channel(), but I have > no idea if that helps us here. > > Thanks > Guennadi Things on the IrDA medium are serial and fairly slow happening. There will be a few ms between the IrDA disconnect and another IrDA connect. This overall serialisation of IrDA save us from a lot of concurency problem in the IrDA stack. If you are using multiple dongles, then you might have both event comming together. I believe the worst case is that the IrDA connection will need to be retried. Regards, Jean |