From: Mark H. <ma...@os...> - 2004-06-10 15:36:42
On Thu, 2004-06-10 at 02:41, Ling, Xiaofeng wrote:
> Yes, that's the problem. The broadcast packet is not received by the
> sender; maybe we need to copy one to the sender itself.
> How about this patch? (The return value still needs to be resolved.)

That seemed to work OK. I am curious about how the TIPC bcast works. Does
it do a real ethernet multicast or broadcast? If so, why do we need to
manually send a copy to ourselves?

Thanks,
Mark.

--
Mark Haverkamp <ma...@os...>
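The answer implicit in Xiaofeng's patch further down the thread: an
Ethernet driver does not feed the sender's own broadcast frames back up
its receive path, so the protocol has to loop a copy back itself. A
minimal sketch of that pattern, reusing the buf_clone(), bcast_port_recv()
and tipc_bsend_buf() names from the patch; the extern declarations and the
NULL check are assumptions added here, not part of the original patch.

#include <linux/list.h>
#include <linux/skbuff.h>

/* Helpers named in Xiaofeng's patch, declared here so the sketch is
 * self-contained; mc_head handling is simplified to a parameter. */
extern struct sk_buff *buf_clone(struct sk_buff *buf);
extern void bcast_port_recv(struct sk_buff *buf);
extern int tipc_bsend_buf(struct sk_buff *buf, struct list_head *mclist);

/* Ethernet does not loop a sender's own broadcast frames back up the
 * stack, so deliver a clone locally before putting the original on the
 * wire. The NULL check is an addition; the patch in the thread omits it. */
static int bcast_send_with_loopback(struct sk_buff *buf,
				    struct list_head *mclist)
{
	struct sk_buff *copybuf = buf_clone(buf);

	if (copybuf)
		bcast_port_recv(copybuf);	/* local delivery */
	return tipc_bsend_buf(buf, mclist);	/* real wire broadcast */
}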
From: Jon M. <jon...@er...> - 2004-06-10 14:23:12
I perceive a wish here to be able to change node addresses _without_
having to reset links, connections, etc. Right? Even this is doable, but
it would be more complex to implement, and certainly not anything I would
prioritize now. It is also unlikely to go completely without consequences,
since applications may retain copies of port identities which all of a
sudden change, messages may already be on their way when the destination
port identity changes, causing connection abortions, etc. When we _really_
have a stable kernel module, where inter-zone links work, and TCP/IPSEC
support and all the other things on the TODO list are done, this is
something to consider, but there has to be a need for it.

/Jon

hek...@ya... wrote:
> Hi Jon,
>
> That is so cool!
>
> I've been waiting for this for a long time. So now we can simply
> "insmod tipc.o" and later on an application can invoke "tipc-config"
> (or some IOCTLs?) to dynamically change the node address? Just want to
> make sure that after the node address has been set by "tipc-config" it
> can still be changed later by "tipc-config" at any time, right?
>
> I'm thinking of a scenario where a cluster is formed dynamically and
> the node address of an old node can be changed to avoid conflict with a
> newly joined node.
>
> A side issue is that I found the latest CVS has only linux-2.6 support.
> Is linux-2.4 support planned to be dropped? Anyway, I've ported it to
> linux-2.4 since I have to use 2.4 for a while.
>
> Thanks
>
> Kevin
>
> --- Jon Maloy <jon...@er...> wrote:
>> [...]
From: Jon M. <jon...@er...> - 2004-06-10 13:46:14
Regards /jon

hek...@ya... wrote:
> Hi Jon,
>
> That is so cool!
>
> I've been waiting for this for a long time. So now we can simply
> "insmod tipc.o" and later on an application can invoke "tipc-config"
> (or some IOCTLs?) to dynamically change the node address? Just want to
> make sure that after the node address has been set by "tipc-config" it
> can still be changed later by "tipc-config" at any time, right?

Yes, you can change the address at any time. The only limitation now is
that when you do that, TIPC "forgets" about the activated interfaces, so
they will have to be enabled again. This is relatively easy to fix with a
patch: let tipc-config read the enabled bearers (a string), keep the
string on the stack, and re-apply it automatically to TIPC once the
address has been changed. But I had to draw the line somewhere... (You
are welcome to fix this, of course.)

> I'm thinking of a scenario where a cluster is formed dynamically and
> the node address of an old node can be changed to avoid conflict with a
> newly joined node.
>
> A side issue is that I found the latest CVS has only linux-2.6 support.
> Is linux-2.4 support planned to be dropped? Anyway, I've ported it to
> linux-2.4 since I have to use 2.4 for a while.

Yes, the 2.4 support has been dropped, since it involves too much extra
work to keep it alive. We still try to keep all environment dependencies
isolated to a few files, though, so it is a doable task if you really
need it.

> Thanks
>
> Kevin
>
> --- Jon Maloy <jon...@er...> wrote:
>> [...]
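A minimal user-land sketch of the fix Jon describes, with the bearer
snapshot kept on the stack. The three tipc_* helpers are hypothetical
stand-ins, since the thread does not show tipc-config's actual interface:

#include <string.h>

/* Hypothetical helpers standing in for tipc-config's real calls, which
 * are not shown in this thread. */
extern int tipc_read_bearers(char *buf, size_t len); /* e.g. "eth0,eth1" */
extern int tipc_set_own_address(const char *addr);
extern int tipc_enable_bearer(const char *name);

/* Snapshot the enabled bearers on the stack, change the node address
 * (after which TIPC forgets them), then re-apply each one. */
static int change_address_keep_bearers(const char *new_addr)
{
	char bearers[256];
	char *b;

	if (tipc_read_bearers(bearers, sizeof(bearers)) < 0)
		return -1;
	if (tipc_set_own_address(new_addr) < 0)
		return -1;
	for (b = strtok(bearers, ","); b; b = strtok(NULL, ","))
		if (tipc_enable_bearer(b) < 0)
			return -1;
	return 0;
}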
From: Ling, X. <xia...@in...> - 2004-06-10 09:41:24
Yes, that's the problem. The broadcast packet is not received by the
sender; maybe we need to copy one to the sender itself.
How about this patch? (The return value still needs to be resolved.)

Index: sendbcast.c
===================================================================
RCS file: /cvsroot/tipc/source/unstable/net/tipc/sendbcast.c,v
retrieving revision 1.18
diff -u -r1.18 sendbcast.c
--- sendbcast.c	2 Jun 2004 21:31:36 -0000	1.18
+++ sendbcast.c	10 Jun 2004 09:38:57 -0000
@@ -240,6 +240,9 @@
 	} else {
 		bcast_set_timer(16 * RTT);
 		check_bcast_outqueue();
+		struct sk_buff *copybuf;
+		copybuf = buf_clone(b);
+		bcast_port_recv(copybuf);
 		res = tipc_bsend_buf(b, &mc_head);
 	}
 	free_mclist(&mc_head);

> -----Original Message-----
> From: tip...@li...
> [mailto:tip...@li...] On Behalf
> Of Mark Haverkamp
> Sent: 10 June 2004 6:48
> To: tipc
> Subject: [Tipc-discussion] REPLICA_NODES set to 1
>
> Just for testing, I set REPLICA_NODES to 1 to test out bcast instead
> of the replicated mcast. I found that with the replicated sending,
> all the nodes receive the message. But with bcast, all nodes but the
> sending node receive it. This doesn't sound right to me. The sending
> node should also receive bcast messages, shouldn't it?
>
> Mark.
>
> --
> Mark Haverkamp <ma...@os...>
From: Nick Y. <ni...@ho...> - 2004-06-10 02:50:22
Great, I will check it out and use it for my test. hehe :-)

Best Regards,
Nick

>From: Jon Maloy <jon...@er...>
>To: tipc <tip...@li...>
>Subject: [Tipc-discussion] tipc-config is here
>Date: Wed, 09 Jun 2004 19:29:11 -0400
>
> [...]
From: <hek...@ya...> - 2004-06-09 23:43:40
Hi Jon,

That is so cool!

I've been waiting for this for a long time. So now we can simply
"insmod tipc.o" and later on an application can invoke "tipc-config"
(or some IOCTLs?) to dynamically change the node address? Just want to
make sure that after the node address has been set by "tipc-config" it
can still be changed later by "tipc-config" at any time, right?

I'm thinking of a scenario where a cluster is formed dynamically and
the node address of an old node can be changed to avoid conflict with a
newly joined node.

A side issue is that I found the latest CVS has only linux-2.6 support.
Is linux-2.4 support planned to be dropped? Anyway, I've ported it to
linux-2.4 since I have to use 2.4 for a while.

Thanks

Kevin

--- Jon Maloy <jon...@er...> wrote:
> [...]
From: Jon M. <jon...@er...> - 2004-06-09 23:29:23
Hi all,
I have now checked in my new code for dynamic configuration of TIPC.
There are more changes than I really appreciate in one single delivery,
but it seems to work fairly well so far, and I think it was necessary to
have it done.

Major changes:

1: No module parameters anymore; everything must be done via the new
   user-land tool "tipc-config" that comes with the package.
2: TIPC can be executed in single-node mode without any initial
   configuration at all. It will use the special node address <0.0.0>
   for single-node use, but this must be set to a real address before
   running in network mode.
3: A few bugs, primarily related to manager.c, but also to routing of
   named messages, were fixed.

tipc-config is far from perfect yet, and can certainly be improved both
regarding readability and robustness, but it works well for the basic
cases, such as setting one's own node address and enabling/disabling
interfaces. Even the other commands have been tested, and work under
normal circumstances.
Have a look at the command interface; I have tried to make it as
comprehensible as possible, but I am very open to improvement
suggestions.

Enjoy(?) /Jon
From: Mark H. <ma...@os...> - 2004-06-09 22:51:08
On Wed, 2004-06-09 at 15:47, Mark Haverkamp wrote:
> Just for testing, I set REPLICA_NODES to 1 to test out bcast instead
> of the replicated mcast. I found that with the replicated sending,
> all the nodes receive the message. But with bcast, all nodes but the
> sending node receive it. This doesn't sound right to me. The sending
> node should also receive bcast messages, shouldn't it?

One other thing. When I tried to throw lots of messages at 4 nodes, I
got a panic on two of my nodes:

Unable to handle kernel NULL pointer dereference at virtual address 00000050
printing eip:
f8e6959a
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: tipc
CPU:    0
EIP:    0060:[<f8e6959a>]    Not tainted
EFLAGS: 00010206   (2.6.7-rc2)
EIP is at link_send_bcast_proto_msg+0xba/0x480 [tipc]
eax: 00000044   ebx: 00000000   ecx: f5c929f0   edx: f3f6f934
esi: f6851ad4   edi: f3043c30   ebp: f3043c7c   esp: f3043c10
ds: 007b   es: 007b   ss: 0068
Process do_pub (pid: 1601, threadinfo=f3042000 task=f3c1ae50)
Stack: f5c929f0 f6851b6c 00000040 00000413 00000003 00000000 f7b69c94 f7fa27c4
       5800404f 00000000 02800200 13100001 03000000 0000044a 13100001 12100001
       00000000 00000000 31687465 00000000 00000000 00000000 00000000 00000000
Call Trace:
 [<c010614f>] show_stack+0x7f/0xa0
 [<c01062fe>] show_registers+0x15e/0x1c0
 [<c01064aa>] die+0x9a/0x160
 [<c0118946>] do_page_fault+0x2e6/0x5b9
 [<c0105dcd>] error_code+0x2d/0x38
 [<f8e67b08>] request_retransmit+0x38/0x40 [tipc]
 [<f8e673a7>] recv_bcast_data+0x1a7/0x2b0 [tipc]
 [<f8e49f39>] bcast_recv+0xa9/0x160 [tipc]
 [<f8e51495>] tipc_recv_msg+0x885/0x8a0 [tipc]
 [<f8e6f439>] recv_msg+0x39/0x50 [tipc]
 [<c037dca2>] netif_receive_skb+0x172/0x1b0
 [<c037dd64>] process_backlog+0x84/0x120
 [<c037de80>] net_rx_action+0x80/0x120
 [<c0125ff8>] __do_softirq+0xb8/0xc0
 [<c0126035>] do_softirq+0x35/0x40
 [<c0107d45>] do_IRQ+0x175/0x230
 [<c0105cd0>] common_interrupt+0x18/0x20
 [<c011f0a6>] copy_mm+0x396/0x530
 [<c011fdaf>] copy_process+0x4ff/0xc50
 [<c012054d>] do_fork+0x4d/0x1b5
 [<c0103b18>] sys_fork+0x38/0x40
 [<c0105363>] syscall_call+0x7/0xb
Code: 8b 40 0c 8b 40 08 0f c8 25 ff ff 00 00 89 45 a0 8b 4d a4 8b

Mark.

--
Mark Haverkamp <ma...@os...>
From: Mark H. <ma...@os...> - 2004-06-09 22:47:58
Just for testing, I set REPLICA_NODES to 1 to test out bcast instead of
the replicated mcast. I found that with the replicated sending, all the
nodes receive the message. But with bcast, all nodes but the sending node
receive it. This doesn't sound right to me. The sending node should also
receive bcast messages, shouldn't it?

Mark.

--
Mark Haverkamp <ma...@os...>
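The thread implies REPLICA_NODES is a threshold choosing between
replicated unicast and true link-level broadcast, which is why setting it
to 1 forces the bcast path. A hedged sketch of that selection; only the
REPLICA_NODES name comes from the thread, the rest is assumed:

/* Illustrative selection between replicated unicast and true broadcast;
 * only REPLICA_NODES itself is from the thread, the helpers and the
 * direction of the comparison are assumptions. */
#define REPLICA_NODES 1

extern int send_replicated_unicasts(void);	/* hypothetical helper */
extern int send_link_broadcast(void);		/* hypothetical helper */

static int send_multicast(int ndests)
{
	if (ndests < REPLICA_NODES)
		return send_replicated_unicasts(); /* few destinations */
	return send_link_broadcast();		   /* many: one bcast */
}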
From: Mark H. <ma...@os...> - 2004-06-09 18:05:04
On Wed, 2004-06-09 at 10:56, Jon Maloy wrote:
> You can check it in. I have still not merged with my code,
> so send me a note when it is ready.

OK, it's checked in.

Mark.

--
Mark Haverkamp <ma...@os...>
From: Jon M. <jon...@er...> - 2004-06-09 17:56:54
You can check it in. I have still not merged with my code, so send me a
note when it is ready.

Thanks /jon

Mark Haverkamp wrote:
> [...]
From: Mark H. <ma...@os...> - 2004-06-09 17:25:01
On Tue, 2004-06-08 at 13:21, Mark Haverkamp wrote:
> On Tue, 2004-06-08 at 11:45, Jon Maloy wrote:
> > [...]
> > An extra lock for the quarantine queue is needed, and this will
> > hopefully fix the problem, but buf_safe_discard() should anyway be
> > changed to buf_discard() if there is no particular reason for using it.
>
> The code that I was testing had a lock on the quarantine queue. One
> thing that may be the cause of problems in this case was that I did
> have page alloc debug turned on after all. It uses a whole page
> regardless of the allocation size as a debug tool. We may have just
> run out of pages. I am trying out the test once again without the page
> alloc debug compiled into the kernel.

I still had two of the machines panic about 4 AM today, even with page
alloc debug compiled out.

Here is the code I have been using with the lock for the quarantine
list. It should probably be checked in anyway, since it does seem to fix
the hang that I was seeing.

cvs diff -u buf.c
Index: buf.c
===================================================================
RCS file: /cvsroot/tipc/source/unstable/net/tipc/buf.c,v
retrieving revision 1.12
diff -u -r1.12 buf.c
--- buf.c	5 May 2004 19:09:03 -0000	1.12
+++ buf.c	9 Jun 2004 17:18:46 -0000
@@ -106,10 +106,13 @@
  * queue instead.
  */

+static spinlock_t qb_lock = SPIN_LOCK_UNLOCKED;
 void buf_safe_discard(struct sk_buff *buf)
 {
-	struct sk_buff *qbuf = quarantine_head;
+	struct sk_buff *qbuf;

+	spin_lock_bh(&qb_lock);
+	qbuf = quarantine_head;
 	while (qbuf) {
 		struct sk_buff *next = buf_next(qbuf);
 		if (buf_busy(qbuf))
@@ -118,6 +121,7 @@
 		qbuf = next;
 	}
 	quarantine_head = qbuf;
+	spin_unlock_bh(&qb_lock);

 	if (!buf)
 		return;
@@ -126,12 +130,14 @@
 		return;
 	}
 	buf_set_next(buf, 0);
+	spin_lock_bh(&qb_lock);
 	if (!quarantine_head) {
 		quarantine_head = quarantine_tail = buf;
 	} else {
 		buf_set_next(quarantine_tail, buf);
 		quarantine_tail = buf;
 	}
+	spin_unlock_bh(&qb_lock);
 }

 void buf_stop(void)

----------------------------------------------------------------------

Also, how do I interpret the output from tipc when the trouble happens?
e.g.:

net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001008):ORIG(1001008:1678278664)::DEST(1001011:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001008):ORIG(1001008:1678278664)::DEST(1001011:0):
[...]

--
Mark Haverkamp <ma...@os...>
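An aside on the design: the kernel already ships a locked skb queue,
struct sk_buff_head, whose skb_queue_tail()/skb_dequeue() helpers take
the queue's own spinlock, so the open-coded qb_lock could be avoided
entirely. A sketch of the quarantine queue rebuilt on that API;
buf_busy() and buf_discard() are the existing buf.c helpers, and the
stop-at-first-busy behavior mirrors what the original loop appears to do:

#include <linux/skbuff.h>

/* Existing buf.c helpers, declared for the sketch. */
extern int buf_busy(struct sk_buff *buf);
extern void buf_discard(struct sk_buff *buf);

/* Initialize once at startup with skb_queue_head_init(&quarantine);
 * the queue's internal lock then covers all operations below. */
static struct sk_buff_head quarantine;

static void quarantine_discard(struct sk_buff *buf)
{
	struct sk_buff *qbuf;

	/* Free quarantined buffers that are no longer busy; stop at the
	 * first busy one and put it back, as the original loop appears
	 * to do. */
	while ((qbuf = skb_dequeue(&quarantine)) != NULL) {
		if (buf_busy(qbuf)) {
			skb_queue_head(&quarantine, qbuf);
			break;
		}
		buf_discard(qbuf);
	}
	if (buf)
		skb_queue_tail(&quarantine, buf);
}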
From: Mark H. <ma...@os...> - 2004-06-08 20:23:30
On Tue, 2004-06-08 at 11:45, Jon Maloy wrote:
> I took a little closer look at the recvbcast code, and noticed a
> couple of things:
> [...]
> An extra lock for the quarantine queue is needed, and this will
> hopefully fix the problem, but buf_safe_discard() should anyway be
> changed to buf_discard() if there is no particular reason for using it.

The code that I was testing had a lock on the quarantine queue. One
thing that may be the cause of problems in this case was that I did have
page alloc debug turned on after all. It uses a whole page regardless of
the allocation size as a debug tool. We may have just run out of pages.
I am trying out the test once again without the page alloc debug
compiled into the kernel.

Mark.

> /jon
>
> Mark Haverkamp wrote:
> > [...]

--
Mark Haverkamp <ma...@os...>
From: Jon M. <jon...@er...> - 2004-06-08 18:45:48
I took a little closer look at the recvbcast code, and noticed a couple
of things:
First, the code consistently uses buf_safe_discard() where it seems
sufficient to use buf_discard(). This function is more expensive to use,
but should not cause any problems if it were correctly implemented.
Unfortunately, it is not. I have forgotten to protect the quarantine
queue with a lock, and this may quite well cause havoc both in this
buffer queue and elsewhere. My guess is that the very strange messages
we see in the dump are in reality invalid, maybe a mix of different
messages. Otherwise I cannot explain the destination port number zero in
the messages, which seems impossible if one follows the call chain
bcast_port_recv_msg() -> nameseq_deliver_msg() -> port_recv_msg() ->
net_route_msg() -> net_route_named_msg().

An extra lock for the quarantine queue is needed, and this will
hopefully fix the problem, but buf_safe_discard() should anyway be
changed to buf_discard() if there is no particular reason for using it.

/jon

Mark Haverkamp wrote:
> I ran my 4 node test yesterday with a lock around access to the
> quarantine_head in buf_safe_discard. It didn't hang this time, but
> after about 14 hours or so two of the machines got something like this:
>
> net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
> [...]
> net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
> [...]
> TIPC: Lost Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
> TIPC: Lost contact with <1.1.17>
> bad: scheduling while atomic!
> TIPC: Established Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
>  [<c010618e>] dump_stack+0x1e/0x30
>  [<c03f8d84>] schedule+0x6b4/0x6c0
>  [<c010538a>] work_resched+0x5/0x16
>
> Debug: sleeping function called from invalid context at mm/slab.c:1994
> in_atomic():1, irqs_disabled():0
>  [<c010618e>] dump_stack+0x1e/0x30
>  [<c011e0c9>] __might_sleep+0x99/0xb0
>  [<c014bcdf>] kmem_cache_alloc+0x21f/0x230
>  [<c03786a3>] alloc_skb+0x23/0xf0
>  [<c037795e>] sock_alloc_send_pskb+0xce/0x1f0
>  [<c0377aae>] sock_alloc_send_skb+0x2e/0x40
>  [<c03dfe69>] unix_stream_sendmsg+0x199/0x3f0
>  [<c0374a3d>] sock_aio_write+0xbd/0xe0
>  [<c0165cd7>] do_sync_write+0x87/0xc0
>  [<c0165df9>] vfs_write+0xe9/0x120
>  [<c0165ecf>] sys_write+0x3f/0x60
>  [<c0105363>] syscall_call+0x7/0xb
>
> [...]
>
> bad: scheduling while atomic!
>  [<c010618e>] dump_stack+0x1e/0x30
>  [<c03f8d84>] schedule+0x6b4/0x6c0
>  [<c03f95ce>] schedule_timeout+0x6e/0xc0
>  [<c01941c5>] ep_poll+0x135/0x1b0
>  [<c0192e8b>] sys_epoll_wait+0xab/0xb0
>  [<c0105363>] syscall_call+0x7/0xb
>
> bad: scheduling while atomic!
>  [<c010618e>] dump_stack+0x1e/0x30
>  [<c03f8d84>] schedule+0x6b4/0x6c0
>  [<c011d0cd>] sys_sched_yield+0x5d/0x90
>  [<c01741c3>] coredump_wait+0x43/0xb0
>  [<c0174398>] do_coredump+0x168/0x271
>  [<c012e1a7>] get_signal_to_deliver+0x287/0x510
>  [<c0105126>] do_signal+0xb6/0xf0
>  [<c01051bb>] do_notify_resume+0x5b/0x5d
>  [<c01053ae>] work_notifysig+0x13/0x15
>
> Kernel panic: Aiee, killing interrupt handler!
> In interrupt handler - not syncing
>
> I'm not sure what to make of this. I don't see TIPC on the stack, but
> who knows. I'll try page alloc debug to see if there is some re-use of
> freed memory going on.
>
> Mark
From: Jon M. <jon...@er...> - 2004-06-08 18:00:33
|
The dropped messages are rather confusing. They seem to have destination address <1.1.13>, but have somehow ended up on <1.1.19> according to the dump. Maybe this is ok, since they are multicast messages, but only if they were carried as broadcast messages over the network. I think the correct destination address should be <1.1.0> if that is the case, but I haven't studied the implementation well enough to know how it works here. It is also obvious that net_route_named_msg() in net.c should allow a second lookup even of multicast messages, not only of named messages as it does now, so this is a bug that must be corrected (I will fix it). But I can not see any relation to the crash here. Did anything happen to node <1.1.17>, or is the lost/re-established link a result of the dropped messages ? /Jon Mark Haverkamp wrote: >I ran my 4 node test yesterday with a lock around access to the >quarantine_head in buf_safe_discard. It didn't hang this time but after >about 14 hours or so two of the machines got something like this: > > >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): >net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0): 
Mark Haverkamp wrote:

>I ran my 4 node test yesterday with a lock around access to the
>quarantine_head in buf_safe_discard. It didn't hang this time, but after
>about 14 hours or so two of the machines got something like this:
>
>net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
>[the same message repeated dozens of times, later with
>PRND(1001012):ORIG(1001012:937762824)]
>TIPC: Lost Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
>TIPC: Lost contact with <1.1.17>
>bad: scheduling while atomic!
>TIPC: Established Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
>[scheduling-while-atomic stack traces and the mm/slab.c sleep warning, as
>in Mark's mail below]
>Kernel panic: Aiee, killing interrupt handler!
>In interrupt handler - not syncing
>
>I'm not sure what to make of this. I don't see TIPC on the stack, but
>who knows. I'll try page alloc debug to see if there is some re-using
>of free memory going on.
>
>Mark
|
From: Mark H. <ma...@os...> - 2004-06-08 15:37:05
|
I ran my 4 node test yesterday with a lock around access to the
quarantine_head in buf_safe_discard. It didn't hang this time, but after
about 14 hours or so two of the machines got something like this:

net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
[the line above repeated several dozen times]
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
[the line above repeated another dozen or so times]
TIPC: Lost Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
TIPC: Lost contact with <1.1.17>
bad: scheduling while atomic!
TIPC: Established Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16

Debug: sleeping function called from invalid context at mm/slab.c:1994
in_atomic():1, irqs_disabled():0
 [<c010618e>] dump_stack+0x1e/0x30
 [<c011e0c9>] __might_sleep+0x99/0xb0
 [<c014bcdf>] kmem_cache_alloc+0x21f/0x230
 [<c03786a3>] alloc_skb+0x23/0xf0
 [<c037795e>] sock_alloc_send_pskb+0xce/0x1f0
 [<c0377aae>] sock_alloc_send_skb+0x2e/0x40
 [<c03dfe69>] unix_stream_sendmsg+0x199/0x3f0
 [<c0374a3d>] sock_aio_write+0xbd/0xe0
 [<c0165cd7>] do_sync_write+0x87/0xc0
 [<c0165df9>] vfs_write+0xe9/0x120
 [<c0165ecf>] sys_write+0x3f/0x60
 [<c0105363>] syscall_call+0x7/0xb

bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
[the trace above repeated several more times]

bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c03f95ce>] schedule_timeout+0x6e/0xc0
 [<c01941c5>] ep_poll+0x135/0x1b0
 [<c0192e8b>] sys_epoll_wait+0xab/0xb0
 [<c0105363>] syscall_call+0x7/0xb

bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c011d0cd>] sys_sched_yield+0x5d/0x90
 [<c01741c3>] coredump_wait+0x43/0xb0
 [<c0174398>] do_coredump+0x168/0x271
 [<c012e1a7>] get_signal_to_deliver+0x287/0x510
 [<c0105126>] do_signal+0xb6/0xf0
 [<c01051bb>] do_notify_resume+0x5b/0x5d
 [<c01053ae>] work_notifysig+0x13/0x15

Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

I'm not sure what to make of this. I don't see TIPC on the stack, but
who knows. I'll try page alloc debug to see if there is some re-using
of free memory going on.

Mark

--
Mark Haverkamp <ma...@os...>
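The lock described above might look something like this sketch;
quarantine_head and buf_safe_discard come from this thread, but the list
layout and the lock name are assumptions, not the actual TIPC source:

#include <linux/skbuff.h>
#include <linux/spinlock.h>

static spinlock_t quarantine_lock = SPIN_LOCK_UNLOCKED;  /* assumed name */
static struct sk_buff *quarantine_head;

void buf_safe_discard(struct sk_buff *buf)
{
        /*
         * The _bh variant also blocks the softirq receive path on this
         * CPU, so tipc_recv_msg cannot interrupt us and spin on the lock.
         */
        spin_lock_bh(&quarantine_lock);
        buf->next = quarantine_head;   /* chain onto the quarantine list */
        quarantine_head = buf;
        spin_unlock_bh(&quarantine_lock);
}
|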
From: Nick Y. <ni...@ho...> - 2004-06-08 15:02:20
|
Jon and Mark,

Thank you very much! I will try it again according to your advice tomorrow!

Best Regards,

Nick

>From: Jon Maloy <jon...@er...>
>To: "Yin, Hu" <hu...@in...>
>CC: Mark Haverkamp <ma...@os...>, tipc
><tip...@li...>
>Subject: [Tipc-discussion] Re: TIPC protocol number!
>Date: Tue, 08 Jun 2004 10:47:06 -0400
>
>Hi,
>There is only one number now, 0x88ca, which has been
>registered with IEEE as the official TIPC protocol number.
>The two numbers we were using earlier were not properly
>registered, and hence cannot be used in open environments.
>And, as Mark said, the separation between the two is now
>done in tipc_recv_msg.
>
>I don't know about tipcdump, but you should perhaps have
>a look at the ethereal module we used in TIPCv1 (attached).
>If we refurbish this one, we could make it available at SF.
>
>Regards /jon
>
>Yin, Hu wrote:
>
>>Hello Jon and Mark,
>>
>>Now I'm trying to enable tipcdump to work on TIPCv2, which was developed
>>by Ling Xiaofeng and can work on TIPCv1. Through investigating I find you
>>may have changed the number of the TIPC configuration protocol from
>>0x0807 to 0x88ca and the number of the TIPC message protocol from 0x0807
>>to 0x0800, so tipcdump cannot catch the packets with different protocol
>>numbers at the same time. And we cannot distinguish the TIPC packets from
>>general IP packets if we use protocol number 0x0800. Is that correct? If
>>so, how can we solve this problem? Thank you in advance!
>>
>>Best Regards,
>>
>>Nick
>>
>
>
>/*
>*****************************************************************************
>**
>** @(#) Id:
>** @(#) File: packet-tipc.c
>** @(#) Subsystem:
>** @(#) Date: 2002/08/30
>** @(#) Time: 13:00:00
>** @(#) Revision: 1.0
>** @(#) Author: Martin Peterzon
>**
>** Copyright (C) 2002 by Ericsson Telecom. All rights reserved.
>**
>*****************************************************************************
>
>*****************************************************************************
>** 2 HISTORY OF DEVELOPMENT.
>*****************************************************************************
>** date     responsible          notes
>** -------- -------------------- ------------------------------------------
>** 02/08/30 Martin Peterzon      Initial release.
>** 03/04/28 Jon Maloy            Some terminology changes
>*/
>
>
>/* INSTALL
> * Instructions on how to install are at the end of this file. */
>
>
>/* packet-tipc.c
> * Routines for TIPC dissection
> * Martin Peterzon <ua...@ua...>
> *
> * $Id: README.developer,v 1.51 2002/03/18 00:20:18 guy Exp $
> *
> * Ethereal - Network traffic analyzer
> * By Gerald Combs <ge...@et...>
> * Copyright 1998 Gerald Combs
> *
> * This program is free software; you can redistribute it and/or
> * modify it under the terms of the GNU General Public License
> * as published by the Free Software Foundation; either version 2
> * of the License, or (at your option) any later version.
> *
> * This program is distributed in the hope that it will be useful,
> * but WITHOUT ANY WARRANTY; without even the implied warranty of
> * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> * GNU General Public License for more details.
> *
> * You should have received a copy of the GNU General Public License
> * along with this program; if not, write to the Free Software
> * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
> * USA.
> */ > > > >#ifdef HAVE_CONFIG_H ># include "config.h" >#endif > >#include <stdio.h> >#include <stdlib.h> >#include <string.h> > >#ifdef HAVE_SYS_TYPES_H ># include <sys/types.h> >#endif > >#ifdef HAVE_NETINET_IN_H ># include <netinet/in.h> >#endif > >#include <glib.h> > >#ifdef NEED_SNPRINTF_H ># include "snprintf.h" >#endif > >#include <epan/packet.h> >#include "etypes.h" > > >/* Initialize the protocol and registered fields */ >static int proto_tipc = -1; >static int hf_tipc_ver = -1; >static int hf_tipc_rr = -1; >static int hf_tipc_imp = -1; >static int hf_tipc_user = -1; >static int hf_tipc_err = -1; >static int hf_tipc_mtype = -1; >static int hf_tipc_octetcount = -1; >static int hf_tipc_ack = -1; >static int hf_tipc_seqno = -1; >static int hf_tipc_originproc = -1; >static int hf_tipc_originport = -1; >static int hf_tipc_destport = -1; >static int hf_tipc_sourceproc = -1; >static int hf_tipc_nametype = -1; >static int hf_tipc_nameid = -1; >static int hf_tipc_destlinks = -1; >static int hf_tipc_nameidupp = -1; >static int hf_tipc_nameidlow = -1; >static int hf_tipc_key = -1; >static int hf_tipc_portref = -1; >static int hf_tipc_pub = -1; >static int hf_tipc_nextsent = -1; >static int hf_tipc_gap = -1; >static int hf_tipc_bearerid = -1; >static int hf_tipc_msgcount = -1; >static int hf_tipc_ols = -1; >static int hf_tipc_msgcountasm = -1; >static int hf_tipc_cfgtype = -1; >static int hf_tipc_cfgzone = -1; >static int hf_tipc_cfgsubnet = -1; >static int hf_tipc_cfgproc = -1; >static int hf_tipc_cfgsysgenid = -1; >static int hf_tipc_cfgbearnn = -1; >static int hf_tipc_hdrsize = -1; >static int hf_tipc_msgsize = -1; >static int hf_tipc_ackllseqno = -1; >static int hf_tipc_llseqno = -1; >static int hf_tipc_prevproc = -1; >static int hf_tipc_actid = -1; >static int hf_tipc_destproc = -1; >static int hf_tipc_portnametype = -1; >static int hf_tipc_portnameinst = -1; >static int hf_tipc_linksel = -1; >static int hf_tipc_probe = -1; >static int hf_tipc_remoteadr = -1; >static int hf_tipc_seqgap = -1; >static int hf_tipc_nextsentpack = -1; >static int hf_tipc_netwid = -1; >static int hf_tipc_linkprio = -1; >static int hf_tipc_linktolerance = -1; >static int hf_tipc_ifname = -1; > >/* Initialize the subtree pointer */ >static gint ett_tipc = -1; > > >/* Table of importances */ >gchar *imp_list[] = { > "Low", > "Normal", > "High", > "Non Rejectable", >}; > >/* Table of users for version 0*/ >gchar *user_list_v0[] = { > "Data", > "", > "", > "", > "Name Manager", > "Connection Manager", > "Link Protocol Handler", > "Changeover Protocol Handler", > "Transport Protocol Handler", > "Segmentation Manager", > "Message Bundler", >}; > >/* Table of users for version 1*/ >gchar *user_list_v1[] = { > "Data", > "Data", > "Data", > "Data", > "", > "", > "", > "", > "Routing Manager", > "Name Distributor", > "Connection Manager", > "Link Protocol", > "", > "Changeover Protocol", > "Segmentation Manager", > "Message Bundler", >}; > >/* Table of errors for version 0 */ >gchar *err_list_v0[] = { > "Ok", > "No Port Name", > "No Remote Port", > "No Remote Processor", > "Destination Overload", > "No Connection", > "Not defined", > "", > "", > "", > "", > "", > "", > "", > "", > "", >}; > >/* Table of errors for version 1 */ >gchar *err_list_v1[] = { > "Ok", > "No Port Name", > "No Remote Port", > "No Remote Processor", > "Destination Overload", > "", > "No Connection", > "Communication Error", > "", > "", > "", > "", > "", > "", > "", > "", >}; > >/* Table of message types, depends on user */ >gchar 
*mtype_list_v0[18][11] = { > {"Warning >Connected","","","","Publication","","","Duplicate","","","Open"}, > {"Connected","","","","Withdrawal","","","Redirect","","","Closed"}, > {"Named","","","","Bulk publication","","","","Info","",""}, > {"Direct","","","","","","","","","",""}, > {"Overload Warning","","","","","","","","","",""}, > {"","","","","","","","","","",""}, > {"","","","","","","","","","",""}, > {"","","","","","","","","","",""}, > {"","","","","","","","","","",""}, > {"","","","","","","","","","",""}, > {"","","","","","","Reset","","","",""}, > {"","","","","","","Activate","","","",""}, > {"","","","","","","Ack","","","",""}, > {"","","","","","","Reply","","","",""}, > {"","","","","","","Nack","","","",""}, > {"","","","","","","Probe","","","",""}, > {"","","","","","Port Probe","ProbeReply","","","",""}, > {"","","","","","Port Probe Reply","","","","",""}, >}; > >gchar *mtype_list_v1[13][16] = { > {"Connected","Connected","Connected","Connected","","","","","Ext Routing >Table","Publication","Connection Probe","","","Duplicate Message","",""}, > {"","","","","","","","","Local Routing Table","Withdrawal","Connection >Probe Reply","","","Original Message","First Segment",""}, > {"Named","Named","Named","Named","","","","","DP Routing >Table","","","","","Info Message","Segment"}, > {"Direct","Direct","Direct","Named","","","","","Route >Addition","","","","","","",""}, > {"Overload","Overload","Overload","Overload","","","","","Route >Removal","","","","","","",""}, > {"","","","","","","","","","","","","","","",""}, > {"","","","","","","","","","","","","","","",""}, > {"","","","","","","","","","","","","","","",""}, > {"","","","","","","","","","","","","","","",""}, > {"","","","","","","","","","","","","","","",""}, > {"","","","","","","","","","","","Reset Message","","","",""}, > {"","","","","","","","","","","","Activate Message","","","",""}, > {"","","","","","","","","","","","State Message","","","",""}, >}; > > > > > >/* Code that dissects packets of ethernet type (a type definition > of the payload of an ethernet packet, that is which protocol that > is on the next level) 0x0807. This number indicates that it is > an usual TIPC message. */ >static void >dissect_tipc(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree) >{ > > proto_item *ti; > proto_tree *tipc_tree; > gint first_word[4]; > gint version, ipc_user, mtype, pub, hdr_size, msg_size; > char temp[30]; > int i; > int data_start, data_length; > > /* Load the first four bytes in an array. */ > for(i=0; i<4; i++) > first_word[i] = tvb_get_guint8(tvb, i); > > /* Version is set by the first three bits */ > version = (first_word[0] & 0xe0) >> 5; > > if(version == 0) { > ipc_user = first_word[2] & 0x0f; > mtype = first_word[3] & 0x1f; > /* Make sure that there is no access outside name arrays. New values >point > to elements with empty strings */ > if(ipc_user > 10) > ipc_user = 1; > if(mtype > 17) > mtype = 5; > } > > if(version == 1) { > ipc_user = (first_word[0] & 0x1e) >> 1; > hdr_size = (first_word[0] & 0x01) * 32 + ((first_word[1] & 0xe0) >> 3); > msg_size = 65536*(first_word[1] & 0x01) + first_word[2] * 256 + >first_word[3]; > mtype = (tvb_get_guint8(tvb,20) & 0xf0) >> 4; > /* Make sure that there is no access outside name arrays. 
New values >point > to elements with empty strings */ > if(ipc_user > 15) > ipc_user = 3; > if(mtype > 12) > mtype = 5; > if (hdr_size == 20) > mtype = 0; > } > > /* Make entry in Protocol column on summary display */ > if (check_col(pinfo->cinfo, COL_PROTOCOL)) > col_set_str(pinfo->cinfo, COL_PROTOCOL, "TIPC"); > > /* User and mtype is diplayed in summary window */ > if (check_col(pinfo->cinfo, COL_INFO) && version == 0) { > col_clear(pinfo->cinfo, COL_INFO); > col_add_str(pinfo->cinfo, COL_INFO, user_list_v0[ipc_user]); > col_append_str(pinfo->cinfo, COL_INFO, ", "); > col_append_str(pinfo->cinfo, COL_INFO, mtype_list_v0 [mtype][ipc_user]); > } > if (check_col(pinfo->cinfo, COL_INFO) && version == 1) { > sprintf(temp,"%s %u", mtype_list_v1 >[mtype][ipc_user],tvb_get_ntohs(tvb, 6)); > col_clear(pinfo->cinfo, COL_INFO); > col_add_str(pinfo->cinfo, COL_INFO, user_list_v1[ipc_user]); > col_append_str(pinfo->cinfo, COL_INFO, ", "); > //col_append_str(pinfo->cinfo, COL_INFO, mtype_list_v1 >[mtype][ipc_user]); > col_append_str(pinfo->cinfo, COL_INFO, temp); > } > > /* If "tree" is NULL, not necessary to generate protocol tree items. */ > if (tree) { > > /* create display subtree for the protocol */ > ti = proto_tree_add_item(tree, proto_tipc, tvb, 0, -1, FALSE); > tipc_tree = proto_item_add_subtree(ti, ett_tipc); > > /* This is where the creation of the protocol tree starts. The > specification of the TIPC protocol version 0 can be found in > the document UAB/F-00:101, PA5. To be able to understand the > code, this document is probably necessary. */ > > if( version == 0 ) { > > /* Adding header for version 0 of the protocol. This is equal > for all TIPC messages. Header includes version, reroute > counter, importance, user, error, message type, octet count, > ack, sequence number and origin processor. */ > proto_tree_add_uint(tipc_tree, hf_tipc_ver, tvb, 0, 1, > version); > proto_tree_add_uint(tipc_tree, hf_tipc_rr, tvb, 0, 1, > (first_word[0] & 0x0f) >> 1); > proto_tree_add_string(tipc_tree, hf_tipc_imp, tvb, 2, 1, > imp_list[(first_word[2] & 0x30) >> 4]); > proto_tree_add_string(tipc_tree, hf_tipc_user, tvb, 2, 1, > user_list_v0[ipc_user]); > proto_tree_add_string(tipc_tree, hf_tipc_err, tvb, 3, 1, > err_list_v0[(first_word[3] & 0xe0) >> 5]); > proto_tree_add_string(tipc_tree, hf_tipc_mtype, tvb, 3, 1, > mtype_list_v0 [mtype][ipc_user]); > proto_tree_add_uint(tipc_tree, hf_tipc_octetcount, tvb, 6, 2, > tvb_get_ntohs(tvb, 6)); > proto_tree_add_uint(tipc_tree, hf_tipc_ack, tvb, 8, 2, > tvb_get_ntohs(tvb, 8)); > proto_tree_add_uint(tipc_tree, hf_tipc_seqno, tvb, 10, 2, > tvb_get_ntohs(tvb, 10)); > sprintf(temp, "%d.%d.%d", tvb_get_guint8(tvb, 12), > tvb_get_ntohs(tvb, 13) >> 4, > tvb_get_ntohs(tvb, 14) & 0x0fff); > proto_tree_add_string(tipc_tree, hf_tipc_originproc, tvb, 12, 4, > temp); > > > /* Depending on which TIPC user the packet has the interpretation > will look different. This case statement deals with all > possible packets types. At the end of every case the data_start > integer is set to the position where the data part of the > message starts. */ > > switch(ipc_user) { > case 0: > > /* Create subtree > tp = proto_tree_add_text(tipc_tree, tvb, 16, -1, "Ports", NULL); > tipc_tree_p = proto_item_add_subtree(tp, ett_tipc_ports); */ > > /* All data message with user 0 includes origin port and dest. > port. Depending on message type some more fields might > be included. 
*/ > proto_tree_add_uint(tipc_tree, hf_tipc_originport, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_uint(tipc_tree, hf_tipc_destport, tvb, 20, 4, > tvb_get_ntohl(tvb,20)); > > /* Data message with user 0 and mtype 2 contains source > processor, name type and name identity. */ > if (mtype == 2) { > temp[0] = '\0'; > sprintf(temp, "%d.%d.%d", tvb_get_guint8(tvb, 28), > tvb_get_ntohs(tvb, 29) >> 4, > tvb_get_ntohs(tvb, 30) & 0x0fff); > proto_tree_add_string(tipc_tree, hf_tipc_sourceproc, tvb, 28, 4, > temp); > proto_tree_add_uint(tipc_tree, hf_tipc_nametype, tvb, 32, 4, > tvb_get_ntohl(tvb,32)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameid, tvb, 36, 4, > tvb_get_ntohl(tvb,36)); > data_start = 40; > } > > > /* Data message with user 0 and mtype 3 contains source > processor */ > else if (mtype == 3) { > temp[0] = '\0'; > sprintf(temp, "%d.%d.%d", tvb_get_guint8(tvb, 28), > tvb_get_ntohs(tvb, 29) >> 4, > tvb_get_ntohs(tvb, 30) & 0x0fff); > proto_tree_add_string(tipc_tree, hf_tipc_sourceproc, tvb, 28, 4, > temp); > data_start = 32; > } > > /* Data message with user 0 and mtype 0,1 or 4 contains > no further fields. */ > else { > data_start = 28; > } > > break; > > case 4: > > /* Name manager message with user 4 all contains destination > link selector and the message type (that is not displayed > in the tree. */ > proto_tree_add_uint(tipc_tree, hf_tipc_destlinks, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > > > /* Name manager messages with user 4 and mtype 0 contains > name type, name identity lower, name identity upper, > origin port and key. */ > if (mtype == 0) { > proto_tree_add_uint(tipc_tree, hf_tipc_nametype, tvb, 24, 4, > tvb_get_ntohl(tvb,24)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameidlow, tvb, 28, 4, > tvb_get_ntohl(tvb,28)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameidupp, tvb, 32, 4, > tvb_get_ntohl(tvb,32)); > proto_tree_add_uint(tipc_tree, hf_tipc_originport, tvb, 36, 4, > tvb_get_ntohl(tvb,36)); > proto_tree_add_uint(tipc_tree, hf_tipc_key, tvb, 40, 4, > tvb_get_ntohl(tvb,40)); > data_start = 44; > } > > /* Name manager messages with user 4 and mtype 1 contains > name type, name identity lower and key. */ > else if (mtype == 1) { > proto_tree_add_uint(tipc_tree, hf_tipc_nametype, tvb, 24, 4, > tvb_get_ntohl(tvb,24)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameidlow, tvb, 28, 4, > tvb_get_ntohl(tvb,28)); > proto_tree_add_uint(tipc_tree, hf_tipc_key, tvb, 32, 4, > tvb_get_ntohl(tvb,32)); > data_start = 36; > } > > /* Name manager messages with user 4 and mtype 2 contains > publication (that tells the number of publication to > follow). After this field comes the publications that > contains name type, name identity lower, name identity > upper, port reference and key. */ > else { > pub = tvb_get_ntohl(tvb,24); > proto_tree_add_uint(tipc_tree, hf_tipc_pub, tvb, 24, 4, > pub); > for(i=0; i<pub; i++) { > proto_tree_add_uint(tipc_tree, hf_tipc_nametype, tvb, 28+20*i, 4, > tvb_get_ntohl(tvb,28+20*i)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameidlow, tvb, 32+20*i, 4, > tvb_get_ntohl(tvb,32+20*i)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameidupp, tvb, 36+20*i, 4, > tvb_get_ntohl(tvb,36+20*i)); > proto_tree_add_uint(tipc_tree, hf_tipc_portref, tvb, 40+20*i, 4, > tvb_get_ntohl(tvb,40+20*i)); > proto_tree_add_uint(tipc_tree, hf_tipc_key, tvb, 44+20*i, 4, > tvb_get_ntohl(tvb,44+20*i)); > } > data_start = 28 + i*20; > } > break; > case 5: > > /* Connection manager message with user 5 contains origin > port and destination port. 
*/ > > proto_tree_add_uint(tipc_tree, hf_tipc_originport, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_uint(tipc_tree, hf_tipc_destport, tvb, 20, 4, > tvb_get_ntohl(tvb,20)); > data_start = 28; > break; > > case 6: > > /* Link protocol message with user 6 contains next message > to be sent, number of messages need to be retransmitted > and bearer identification. */ > > proto_tree_add_uint(tipc_tree, hf_tipc_nextsent, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_uint(tipc_tree, hf_tipc_gap, tvb, 20, 4, > tvb_get_ntohl(tvb,20)); > proto_tree_add_uint(tipc_tree, hf_tipc_bearerid, tvb, 27, 1, > tvb_get_guint8(tvb, 23) & 0x3f); > data_start=28; > break; > > case 7: > > /* Changeover protocol message with user 7 contains dest. > link selector, message count and berared id. */ > > proto_tree_add_uint(tipc_tree, hf_tipc_destlinks, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_uint(tipc_tree, hf_tipc_msgcount, tvb, 21, 2, > tvb_get_ntohs(tvb,21)); > proto_tree_add_uint(tipc_tree, hf_tipc_bearerid, tvb, 23, 1, > tvb_get_guint8(tvb, 23) & 0x3f); > data_start=24; > break; > > case 9: > > /* Segmentation manager message with user 9 includes dest. > link selector and original link selector. */ > > proto_tree_add_uint(tipc_tree, hf_tipc_destlinks, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_uint(tipc_tree, hf_tipc_ols, tvb, 23, 1, > tvb_get_guint8(tvb,23) & 0x07); > data_start = 24; > break; > > case 10: > > /* Message assembler message with user 10 includes dest. > link selector and message count. */ > > proto_tree_add_uint(tipc_tree, hf_tipc_destlinks, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_uint(tipc_tree, hf_tipc_msgcountasm, tvb, 20, 4, > tvb_get_ntohl(tvb,20)); > data_start = 24; > break; > > default: > > /* If no valid user can be found, the rest of the packet is > interpreted as data, since it's impossible to interpret > any more. */ > > data_start = 16; > break; > } > > /* The rest of the packet is data */ > data_length = tvb_get_ntohs(tvb, 6) - data_start; > temp[0] = '\0'; > sprintf(temp, "Data (%d bytes)", data_length); > proto_tree_add_text(tree, tvb, data_start, data_length, temp, NULL); > > } > > > /* Documentation for version 1 of the protocol: LMC/JO-01:006 */ > > if(version == 1) { > > /* All packets have the same fields in the first 3 words. > These contains version, user, header size, message size, > acknowledged link level sequence number, link level > sequence number and previous processor. */ > > proto_tree_add_uint(tipc_tree, hf_tipc_ver, tvb, 0, 1, > version); > proto_tree_add_string(tipc_tree, hf_tipc_user, tvb, 0, 1, > user_list_v1[ipc_user]); > proto_tree_add_uint(tipc_tree, hf_tipc_hdrsize, tvb, 0, 2, > hdr_size); > proto_tree_add_uint(tipc_tree, hf_tipc_msgsize, tvb, 1, 3, > msg_size); > proto_tree_add_uint(tipc_tree, hf_tipc_ackllseqno, tvb, 4, 2, > tvb_get_ntohs(tvb, 4)); > proto_tree_add_uint(tipc_tree, hf_tipc_llseqno, tvb, 6, 2, > tvb_get_ntohs(tvb, 6)); > temp[0] = '\0'; > sprintf(temp, "%d.%d.%d", tvb_get_guint8(tvb, 8), > tvb_get_ntohs(tvb, 9) >> 4, > tvb_get_ntohs(tvb, 10) & 0x0fff); > proto_tree_add_string(tipc_tree, hf_tipc_prevproc, tvb, 8, 4, > temp); > data_start = 12; > > /* If there is a data packet the user is 0,1,2 or 3. This also tells > the importance of the data packet. There are three different sizes > of headers for data packets: 20 bytes, 32 bytes and 40 bytes. All > of them contains originating port and destination port. 
*/ > > if(ipc_user <= 3) { > > proto_tree_add_string(tipc_tree, hf_tipc_imp, tvb, 0, 1, > imp_list[ipc_user]); > proto_tree_add_uint(tipc_tree, hf_tipc_originport, tvb, 12, 4, > tvb_get_ntohl(tvb,12)); > proto_tree_add_uint(tipc_tree, hf_tipc_destport, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_string(tipc_tree, >hf_tipc_mtype, tvb, 20, 1, > >mtype_list_v1 [mtype][ipc_user]); > data_start = 20; > > if(hdr_size > 20) { > > /* Data packet of header size 32 and 40 contains message type, > error code, reroute counter, activity identity, originating > processor and destination processor. */ > > proto_tree_add_string(tipc_tree, hf_tipc_err, tvb, 20, 1, > err_list_v1 [tvb_get_guint8(tvb, 20) & 0x0f]); > proto_tree_add_uint(tipc_tree, hf_tipc_rr, tvb, 21, 1, > (tvb_get_guint8(tvb, 21) & 0xf0) >> 4); > proto_tree_add_uint(tipc_tree, hf_tipc_actid, tvb, 21, 3, > (tvb_get_guint8(tvb, 21) & 0x0f) * 65536 + > tvb_get_ntohs(tvb, 22)); > temp[0] = '\0'; > sprintf(temp, "%d.%d.%d", tvb_get_guint8(tvb, 24), > tvb_get_ntohs(tvb, 25) >> 4, > tvb_get_ntohs(tvb, 26) & 0x0fff); > proto_tree_add_string(tipc_tree, hf_tipc_originproc, tvb, 24, 4, > temp); > temp[0] = '\0'; > sprintf(temp, "%d.%d.%d", tvb_get_guint8(tvb, 28), > tvb_get_ntohs(tvb, 29) >> 4, > tvb_get_ntohs(tvb, 30) & 0x0fff); > proto_tree_add_string(tipc_tree, hf_tipc_destproc, tvb, 28, 4, > temp); > data_start = 32; > > if(hdr_size > 32) { > > /* Data packet of header size 40 contains port name type/ > connection level sequence number and port name instance. */ > > proto_tree_add_uint(tipc_tree, hf_tipc_portnametype, tvb, 32, 4, > tvb_get_ntohl(tvb,32)); > proto_tree_add_uint(tipc_tree, hf_tipc_portnameinst, tvb, 36, 4, > tvb_get_ntohl(tvb,36)); > data_start = 40; > > } > > } > > } > > /* For internal protocols in TIPC word 3-6 have the same fields, >althought > which fields that are used depends on which user, and sometimes which > message type, that is in the packet. These fields contain importance, > link selector, message count, probe, bearer id, remote address, >message > type, sequence gap and next sent packet. 
*/ > > else if(ipc_user > 7 && ipc_user < 16) { > > if(ipc_user == 14) > proto_tree_add_string(tipc_tree, hf_tipc_imp, tvb, 12, 1, > imp_list[(tvb_get_guint8(tvb,12) & 0x18) >> 3]); > if(ipc_user == 9 || ipc_user == 13 || ipc_user == 14) > proto_tree_add_uint(tipc_tree, hf_tipc_linksel, tvb, 12, 1, > tvb_get_guint8(tvb,12) & 0x07); > if(ipc_user == 13 || ipc_user == 15) > proto_tree_add_uint(tipc_tree, hf_tipc_msgcount, tvb, 13, 2, > tvb_get_ntohs(tvb,13)); > if(ipc_user == 11 && mtype == 12) > proto_tree_add_uint(tipc_tree, hf_tipc_probe, tvb, 15, 1, > (tvb_get_guint8(tvb,15) & 0x40) >> 6); > if(ipc_user == 11 || ipc_user == 13) > proto_tree_add_uint(tipc_tree, hf_tipc_bearerid, tvb, 15, 1, > (tvb_get_guint8(tvb,15) & 0x38) >> 3); > if(ipc_user == 8) > proto_tree_add_uint(tipc_tree, hf_tipc_remoteadr, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > if(ipc_user == 9){ > unsigned int i = 40; > proto_tree_add_uint(tipc_tree, hf_tipc_pub, tvb, 44, >4,(msg_size-28)/25); > for (;i < msg_size;i+=20) > { > proto_tree_add_uint(tipc_tree, hf_tipc_nametype, tvb, i, 4, > tvb_get_ntohl(tvb,i)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameidlow, tvb, i+4, 4, > tvb_get_ntohl(tvb,i+4)); > proto_tree_add_uint(tipc_tree, hf_tipc_nameidupp, tvb, i+8, 4, > tvb_get_ntohl(tvb,i+8)); > proto_tree_add_uint(tipc_tree, hf_tipc_portref, tvb, i+12, 4, > tvb_get_ntohl(tvb,i+12)); > proto_tree_add_uint(tipc_tree, hf_tipc_key, tvb, i+16, 4, > tvb_get_ntohl(tvb,i+16)); > } > } > if(ipc_user != 15 && ipc_user != 12) > proto_tree_add_string(tipc_tree, hf_tipc_mtype, tvb, 20, 1, > mtype_list_v1 [mtype][ipc_user]); > if(ipc_user == 11 && mtype == 12) > proto_tree_add_uint(tipc_tree, hf_tipc_seqgap, tvb, 20, 2, > tvb_get_ntohs(tvb,20) & 0x0fff); > if(ipc_user == 11){ > proto_tree_add_uint(tipc_tree, hf_tipc_nextsentpack, tvb, 22, 2, > tvb_get_ntohs(tvb,22)); > if(msg_size == 48) // Compatibility. > { > unsigned int i = 0; > char netwid[2] = >{((tvb_get_ntohs(tvb,12)&0xe000)>>13)+'A',0}; > proto_tree_add_string(tipc_tree, hf_tipc_netwid, tvb, 12, 1,netwid); > if (mtype != 12) // RESET or >ACTIVATE msg > { > proto_tree_add_uint(tipc_tree, >hf_tipc_linkprio, tvb, 24, 4, > (tvb_get_ntohl(tvb,24)&0x000f8000)>>15); > proto_tree_add_uint(tipc_tree, >hf_tipc_linktolerance, tvb, 24, 4, > (tvb_get_ntohl(tvb,24)&0x00007fff)); > memset(temp,0,sizeof(temp)); > for (;i<=20;i++) { > temp[i] = >tvb_get_guint8(tvb,28+i); > } > proto_tree_add_string(tipc_tree, hf_tipc_ifname, tvb, 28, 1,temp); > } > } > } > data_start = 28; > } > > /* The rest of the packet is data */ > data_length = msg_size - data_start; > temp[0] = '\0'; > sprintf(temp, "Data (%d bytes)", data_length); > proto_tree_add_text(tree, tvb, data_start, data_length, temp, NULL); > > } > } > >} > > > >/* Code that dissects packets of ethernet type 0x0808, which are used > as Ethernet alive messages. 
*/ > >static void >dissect_tipc_cfg(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree) >{ > proto_item *ti; > proto_tree *tipc_tree; > char msg_type[30]; > > /* Maps message type numbers to explaining strings */ > switch(tvb_get_ntohl(tvb,0)) { > case 226398408: strcpy(msg_type, "Configuration, Broadcast Link >Request"); break; > case 382530916: strcpy(msg_type, "Configuration, Unicast Link Request"); >break; > default: strcpy(msg_type, "Not defined"); break; > } > > /* Make entry in Protocol column on summary display */ > if (check_col(pinfo->cinfo, COL_PROTOCOL)) > col_set_str(pinfo->cinfo, COL_PROTOCOL, "TIPC"); > > /* Make entry in Info column on summary display */ > if (check_col(pinfo->cinfo, COL_INFO)) { > col_clear(pinfo->cinfo, COL_INFO); > col_add_str(pinfo->cinfo, COL_INFO, msg_type); > } > > /* If "tree" is NULL, not necessary to generate protocol tree items. */ > if (tree) { > > /* create display subtree for the protocol */ > ti = proto_tree_add_item(tree, proto_tipc, tvb, 0, 24, FALSE); > tipc_tree = proto_item_add_subtree(ti, ett_tipc); > > /* This is where the creation of the protocol tree starts. The > specification of the IPC protocol version 0 can be found in > the document UAB/F-00:101, PA5. To be able to understand the > code, this document is probably necessary. > > This protocol look similar in version 1. > > An ethernet alive message contains message type, zone, subnet, > processor, system generation id and bearer name no. */ > > proto_tree_add_string(tipc_tree, hf_tipc_cfgtype, tvb, 0, 4, > msg_type); > proto_tree_add_uint(tipc_tree, hf_tipc_cfgzone, tvb, 4, 4, > tvb_get_ntohl(tvb,4)); > proto_tree_add_uint(tipc_tree, hf_tipc_cfgsubnet, tvb, 8, 4, > tvb_get_ntohl(tvb,8)); > proto_tree_add_uint(tipc_tree, hf_tipc_cfgproc, tvb, 12, 4, > tvb_get_ntohl(tvb,12)); > /* Obsolete: > proto_tree_add_uint(tipc_tree, hf_tipc_cfgsysgenid, tvb, 16, 4, > tvb_get_ntohl(tvb,16)); > proto_tree_add_uint(tipc_tree, hf_tipc_cfgbearnn, tvb, 20, 4, > tvb_get_ntohl(tvb,20)); > */ > > } > >} > > >/* Register TIPC with Ethereal */ >void >proto_register_tipc(void) >{ > > static hf_register_info hf[] = { > > { &hf_tipc_ver, > { "Version", "tipc.ver", > FT_UINT8, BASE_DEC, NULL, 0x0, > "TIPC protocol version", HFILL } > }, > > { &hf_tipc_rr, > { "Reroute counter", "tipc.rr", > FT_UINT8, BASE_DEC, NULL, 0x0, > "Reroute counter for NAS", HFILL } > }, > > { &hf_tipc_imp, > { "Importance", "tipc.ump", > FT_STRING, BASE_DEC, NULL, 0x0, > "Importance of data message", HFILL } > }, > > { &hf_tipc_user, > { "TIPC user", "tipc.user", > FT_STRING, BASE_DEC, NULL, 0x0, > "TIPC user", HFILL } > }, > > { &hf_tipc_err, > { "Error code", "tipc.err", > FT_STRING, BASE_DEC, NULL, 0x0, > "Error code for data", HFILL } > }, > > { &hf_tipc_mtype, > { "Message type", "tipc.mtype", > FT_STRING, BASE_DEC, NULL, 0x0, > "Message type", HFILL } > }, > > { &hf_tipc_octetcount, > { "Octet count", "tipc.octetcount", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Number of bytes in message, including the header", HFILL } > }, > > { &hf_tipc_ack, > { "Acknowledged", "tipc.ack", > FT_UINT16, BASE_DEC, NULL, 0x0, > "The value of seq. no. 
for most recently accepted message", HFILL } > }, > > { &hf_tipc_seqno, > { "Sequence number", "tipc.seqno", > FT_UINT16, BASE_DEC, NULL, 0x0, > "The sequence number of the message", HFILL } > }, > > { &hf_tipc_originproc, > { "Sending processor", "tipc.originproc", > FT_STRING, BASE_DEC, NULL, 0x0, > "Processor ID of the processor on sending link", HFILL } > }, > > { &hf_tipc_originport, > { "Originating port", "tipc.originport", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Port number of the port that is the origin of the message", HFILL } > }, > > { &hf_tipc_destport, > { "Destination port", "tipc.destport", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Port number for the destination port at the receiving link end", HFILL >} > }, > > { &hf_tipc_sourceproc, > { "Source processor", "tipc.sourceproc", > FT_STRING, BASE_DEC, NULL, 0x0, > "Processor ID of the first orgin processor", HFILL } > }, > > { &hf_tipc_nametype, > { "Name Type", "tipc.nametype", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Destination port name type", HFILL } > }, > > { &hf_tipc_nameid, > { "Name instance", "tipc.nameid", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Destination port name instance", HFILL } > }, > > { &hf_tipc_destlinks, > { "Destination link selector","tipc.destlinks", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Destination link selector", HFILL } > }, > > { &hf_tipc_nameidlow, > { "Name instance lower","tipc.nameidlow", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Destination port name sequence, lower limit", HFILL } > }, > > { &hf_tipc_nameidupp, > { "Name instance upper","tipc.nameidupp", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Destination port name sequence, upper limit", HFILL } > }, > > { &hf_tipc_key, > { "Key","tipc.key", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Publication identifier", HFILL } > }, > > { &hf_tipc_portref, > { "Port number", "tipc.portref", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Ephemeral port representing this publication", HFILL } > }, > > { &hf_tipc_pub, > { "Publications", "tipc.pub", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Publications in this message", HFILL } > }, > > { &hf_tipc_nextsent, > { "Next message", "tipc.nextsent", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Next message to be sent that is not a link protocol message", HFILL } > }, > > { &hf_tipc_gap, > { "Need transmission", "tipc.gap", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Number of non-link protocol messages that needs retransmission", HFILL >} > }, > > { &hf_tipc_bearerid, > { "Bearer", "tipc.bearerid", > FT_UINT8, BASE_DEC, NULL, 0x0, > "Uniquely identifies bearer", HFILL } > }, > > { &hf_tipc_msgcount, > { "Message count", "tipc.msgcount", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Message count", HFILL } > }, > > { &hf_tipc_ols, > { "Origin link selector","tipc.ols", > FT_UINT8, BASE_DEC, NULL, 0x0, > "Original link selector", HFILL } > }, > > { &hf_tipc_msgcountasm, > { "Message count", "tipc.msgcountasm", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Number of piggybacked messages", HFILL } > }, > > { &hf_tipc_cfgtype, > { "Message type", "tipc.alitype", > FT_STRING, BASE_DEC, NULL, 0x0, > "Message type", HFILL } > }, > > { &hf_tipc_cfgzone, > { "Zone", "tipc.alizone", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Zone", HFILL } > }, > > { &hf_tipc_cfgsubnet, > { "Subnet", "tipc.alisubnet", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Subnet", HFILL } > }, > > { &hf_tipc_cfgproc, > { "Processor", "tipc.aliproc", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Processor", HFILL } > }, > > { &hf_tipc_cfgsysgenid, > { "System generation","tipc.alisysgenid", > FT_UINT32, BASE_DEC, NULL, 0x0, > "System generation 
identification", HFILL } > }, > > { &hf_tipc_cfgbearnn, > { "Bearer name","tipc.alibearnn", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Bearers coordinate name", HFILL } > }, > > { &hf_tipc_hdrsize, > { "Header size","tipc.hdrsize", > FT_UINT8, BASE_DEC, NULL, 0x0, > "Size of the header", HFILL } > }, > > { &hf_tipc_msgsize, > { "Message size","tipc.msgsize", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Size of the message", HFILL } > }, > > { &hf_tipc_ackllseqno, > { "Ack link level seqno","tipc.ackllseqno", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Acknowledgement sent by piggy back", HFILL } > }, > > { &hf_tipc_llseqno, > { "Link level seqno","tipc.llseqno", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Sequence number to keep track of message flow", HFILL } > }, > > { &hf_tipc_prevproc, > { "Previous processor", "tipc.prevproc", > FT_STRING, BASE_DEC, NULL, 0x0, > "Processor ID of the last processor visited", HFILL } > }, > > { &hf_tipc_actid, > { "Activity identity", "tipc.actid", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Activity identity", HFILL } > }, > > { &hf_tipc_destproc, > { "Destination processor", "tipc.destproc", > FT_STRING, BASE_DEC, NULL, 0x0, > "Final destination processor for a message", HFILL } > }, > > { &hf_tipc_portnametype, > { "Port name type", "tipc.portnametype", > FT_UINT32, BASE_DEC, NULL, 0x0, > "If no port name message, this is the connection sequence number", HFILL >} > }, > > { &hf_tipc_portnameinst, > { "Port name instance", "tipc.portnameinst", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Port name instance", HFILL } > }, > > { &hf_tipc_linksel, > { "Link selector", "tipc.linksel", > FT_UINT8, BASE_DEC, NULL, 0x0, > "Link selector", HFILL } > }, > > { &hf_tipc_probe, > { "Probe", "tipc.probe", > FT_UINT8, BASE_DEC, NULL, 0x0, > "Instructs the receiving link end to immediatly respond", HFILL } > }, > > { &hf_tipc_remoteadr, > { "Next message", "tipc.remoteadr", > FT_UINT32, BASE_DEC, NULL, 0x0, > "Remote address", HFILL } > }, > > { &hf_tipc_seqgap, > { "Sequence gap", "tipc.seqgap", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Size of gap detected in the senders received packet sequence", HFILL } > }, > > { &hf_tipc_nextsentpack, > { "Next sent packet", "tipc.nextsentpack", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Current state of send queue", HFILL } > }, > { &hf_tipc_netwid, > { "Network", "tipc.netwid", > FT_STRING, BASE_DEC, NULL, 0x0, > "Name of logical network comprising this link", HFILL } > }, > { &hf_tipc_linkprio, > { "Link priority", "tipc.linkprio", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Priority of this link", HFILL } > }, > { &hf_tipc_linktolerance, > { "Link tolerance", "tipc.linktolerance", > FT_UINT16, BASE_DEC, NULL, 0x0, > "Milliseconds of non-contact before link is declared faulty", HFILL } > }, > > { &hf_tipc_ifname, > { "Interface name", "tipc.ifname", > FT_STRING, BASE_DEC, NULL, 0x0, > "Sender's ethernet interface this associated with this network", HFILL } > }, > > }; > >/* Setup protocol subtree array */ > static gint *ett[] = { > &ett_tipc, > }; > >/* Register the protocol name and description */ > proto_tipc = proto_register_protocol("TelORB Inter Process Communication", > "TIPC", "tipc"); > >/* Required function calls to register the header fields and subtrees used >*/ > proto_register_field_array(proto_tipc, hf, array_length(hf)); > proto_register_subtree_array(ett, array_length(ett)); > >} > > >/* Register the connection between ethertype and TIPC. The identification > number for TIPC packets in Ethernet is 0x0807 and 0x0808. 
These are > defined in etypes.h */ >void >proto_reg_handoff_tipc(void) >{ > dissector_handle_t tipc_handle, tipc_cfg_handle; > > tipc_handle = create_dissector_handle(dissect_tipc, proto_tipc); > tipc_cfg_handle = create_dissector_handle(dissect_tipc_cfg, proto_tipc); > dissector_add("ethertype", ETHERTYPE_TIPC, tipc_handle); > dissector_add("ethertype", ETHERTYPE_TIPC_CFG, tipc_cfg_handle); >} > > >/* To make this dissector work, some modifications have to be done to > other source files. > > In etypes.h, add > > #ifndef ETHERTYPE_TIPC > #define ETHERTYPE_TIPC 0x0807 > #endif > > #ifndef ETHERTYPE_TIPC_CFG > #define ETHERTYPE_TIPC_CFG 0x0808 > #endif > > that tells Ethereal which ethertypes are associated with TIPC. > > In packet-ethertype.c, add > > {ETHERTYPE_TIPC, "TIPC"}, > > to the array named value_string etype_vals[]. This tells Ethereal the > name of the protocol. > > Also the filename packet-tipc.c should be added to the DISSECTOR_SOURCE > macro in 'Makefile.am' and 'Makefile.nmake'. > >*/
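If the module is refurbished for TIPCv2 as Jon suggests, the obvious first
step is to re-point the dissector at the registered ethertype from this
thread -- a sketch only, since the new single-ethertype layout may need
more rework than this:

/* In etypes.h, the old unregistered values would be replaced by the
 * IEEE-registered TIPC ethertype: */
#ifndef ETHERTYPE_TIPC
#define ETHERTYPE_TIPC 0x88ca
#endif

/* And in packet-tipc.c one handoff is then enough, since configuration
 * and data messages now share a single ethertype: */
void
proto_reg_handoff_tipc(void)
{
        dissector_handle_t tipc_handle;

        tipc_handle = create_dissector_handle(dissect_tipc, proto_tipc);
        dissector_add("ethertype", ETHERTYPE_TIPC, tipc_handle);
}
|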
From: Jon M. <jon...@er...> - 2004-06-08 14:47:33
|
Hi,
There is only one number now, 0x88ca, which has been registered with IEEE
as the official TIPC protocol number. The two numbers we were using earlier
were not properly registered, and hence cannot be used in open
environments. And, as Mark said, the separation between the two is now done
in tipc_recv_msg (a rough sketch of that dispatch follows below the quoted
mail).

I don't know about tipcdump, but you should perhaps have a look at the
ethereal module we used in TIPCv1 (attached). If we refurbish this one, we
could make it available at SF.

Regards /jon

Yin, Hu wrote:

>Hello Jon and Mark,
>
>Now I'm trying to enable tipcdump to work on TIPCv2, which was developed
>by Ling Xiaofeng and can work on TIPCv1. Through investigating I find you
>may have changed the number of the TIPC configuration protocol from
>0x0807 to 0x88ca and the number of the TIPC message protocol from 0x0807
>to 0x0800, so tipcdump cannot catch the packets with different protocol
>numbers at the same time. And we cannot distinguish the TIPC packets from
>general IP packets if we use protocol number 0x0800. Is that correct? If
>so, how can we solve this problem? Thank you in advance!
>
>Best Regards,
>
>Nick
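To make the tipc_recv_msg point concrete, the dispatch presumably looks
something like the sketch below; the user value and helper names are
assumptions based on this thread, not the actual code:

/* One ethertype on the wire; incoming frames are told apart by the
 * user field in the TIPC header instead of by protocol number. */
void tipc_recv_msg(struct sk_buff *buf, struct bearer *b_ptr)
{
        struct tipc_msg *msg = buf_msg(buf);

        if (msg_user(msg) == LINK_CONFIG)      /* neighbor discovery */
                disc_recv_msg(buf, b_ptr);
        else                                   /* link and data traffic */
                link_recv_msg(buf, b_ptr);
}
|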
From: Mark H. <ma...@os...> - 2004-06-08 14:24:23
|
On Tue, 2004-06-08 at 01:19, Yin, Hu wrote:
> Hello Jon and Mark,
>
> Now I'm trying to enable tipcdump to work on TIPCv2, which was developed
> by Ling Xiaofeng and can work on TIPCv1. Through investigating I find you
> may have changed the number of the TIPC configuration protocol from
> 0x0807 to 0x88ca and the number of the TIPC message protocol from 0x0807
> to 0x0800, so tipcdump cannot catch the packets with different protocol
> numbers at the same time. And we cannot distinguish the TIPC packets from
> general IP packets if we use protocol number 0x0800. Is that correct? If
> so, how can we solve this problem? Thank you in advance!
>
I believe that the TIPC protocol number is 0x88ca for both protocol and
data messages. The message function is distinguished in tipc_recv_msg.

Mark.

--
Mark Haverkamp <ma...@os...>
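A capture tool therefore only needs one filter on 0x88ca. A minimal
libpcap sketch (standard pcap calls; the interface name is just an
example):

#include <pcap.h>
#include <stdio.h>

int main(void)
{
        char errbuf[PCAP_ERRBUF_SIZE];
        struct bpf_program prog;
        pcap_t *p = pcap_open_live("eth1", 65535, 1, 100, errbuf);

        if (p == NULL) {
                fprintf(stderr, "pcap_open_live: %s\n", errbuf);
                return 1;
        }
        /* one filter now matches all TIPC traffic */
        if (pcap_compile(p, &prog, "ether proto 0x88ca", 1, 0) == 0)
                pcap_setfilter(p, &prog);
        /* ... pcap_loop(p, -1, handle_packet, NULL); ... */
        pcap_close(p);
        return 0;
}
|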
From: Yin, H. <hu...@in...> - 2004-06-08 08:20:11
|
Hello Jon and Mark,

Now I'm trying to enable tipcdump to work on TIPCv2; tipcdump was developed by Ling Xiaofeng and works on TIPCv1.
Investigating, I found that you may have changed the number of the TIPC configuration protocol from 0x0807 to 0x88ca and the number of the TIPC message protocol from 0x0807 to 0x0800, so tipcdump cannot capture packets with different protocol numbers at the same time. And we cannot distinguish TIPC packets from general IP packets if we use protocol number 0x0800. Is that correct? If so, how can we solve this problem? Thank you in advance!

Best Regards,

Nick
|
From: Yin, H. <hu...@in...> - 2004-06-08 01:08:11
|
Jon,

Yes. As you mentioned, a Network Subscription can be treated as a sub-case of a Name Subscription, that is, one with type equal to 0.

Thanks,
Nick

-----Original Message-----
From: Jon Maloy [mailto:jon...@er...]
Sent: Monday, June 07, 2004 9:03 PM
To: Yin, Hu
Cc: tipc
Subject: Re: [Tipc-discussion] Difference between Name subscription and Network subscription

The spec was written for the TIPC-1.2 line of code, where there was a separate code for Network Subscriptions. I later realized that this could be done as a sub-case of name subscriptions, as described under 1), so the network subscriptions are gone in the 1.3 code.
Regards
/jon

Yin, Hu wrote:

>Hi, All
>
>I cannot distinguish Name subscription from Network subscription. Who can
>tell me the difference between them? Thank you in advance!
>
>Here is the description abstracted from TIPC's test spec:
>
>--------------------------------------------------------------------------
>Name subscription Test Program
>------------------------------
>1) Open several subscriptions for <0,0,0xffffffff> from one socket.
>   This will in practice work like a network subscription,
>   and one should receive an event per subscription each time a
>   processor goes down or comes back. Disable/enable the bearers
>   on a processor to verify.
>2) Call the synchronous NAME_AVAILABILITY for one name that does
>   exist and for one that does not.
>
>Network Subscription Test Program
>---------------------------------
>Open a subscription for <0> from a socket. Disable/enable a processor.
>--------------------------------------------------------------------------
>
>Thanks,
>
>Nick
|
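For concreteness, a type-0, full-range subscription of the kind discussed here can be expressed against the TIPC topology server. The sketch below uses the socket API and struct names from the mainline linux/tipc.h of later kernels, not the 2004 code base, so treat it as an illustration of the concept rather than code for this TIPC version.

    /* Sketch: a "network subscription" expressed as a name subscription
       for <0,0,0xffffffff>, using the later mainline linux/tipc.h API. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <linux/tipc.h>

    int main(void)
    {
        struct sockaddr_tipc topsrv;
        struct tipc_subscr subscr;
        struct tipc_event event;
        int sd;

        sd = socket(AF_TIPC, SOCK_SEQPACKET, 0);
        if (sd < 0) {
            perror("socket");
            return 1;
        }

        /* The topology server answers at name {TIPC_TOP_SRV, TIPC_TOP_SRV}. */
        memset(&topsrv, 0, sizeof(topsrv));
        topsrv.family = AF_TIPC;
        topsrv.addrtype = TIPC_ADDR_NAME;
        topsrv.addr.name.name.type = TIPC_TOP_SRV;
        topsrv.addr.name.name.instance = TIPC_TOP_SRV;
        if (connect(sd, (struct sockaddr *)&topsrv, sizeof(topsrv)) < 0) {
            perror("connect");
            return 1;
        }

        /* Type 0, full instance range: behaves like a network subscription. */
        memset(&subscr, 0, sizeof(subscr));
        subscr.seq.type = 0;
        subscr.seq.lower = 0;
        subscr.seq.upper = 0xffffffff;
        subscr.timeout = TIPC_WAIT_FOREVER;
        subscr.filter = TIPC_SUB_PORTS;
        if (send(sd, &subscr, sizeof(subscr), 0) != sizeof(subscr)) {
            perror("send");
            return 1;
        }

        /* Each publish/withdraw within the range arrives as one event. */
        while (recv(sd, &event, sizeof(event), 0) == sizeof(event))
            printf("%s: <%u,%u,%u>\n",
                   event.event == TIPC_PUBLISHED ? "up" : "down",
                   event.s.seq.type, event.found_lower, event.found_upper);

        return 0;
    }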
From: Jon M. <jon...@er...> - 2004-06-07 19:10:05
|
That would be great. I am still spending my time on the command interface, and don't have any time for debugging right now.
/jon

Mark Haverkamp wrote:

>On Mon, 2004-06-07 at 09:58, Mark Haverkamp wrote:
>
>>I have code running on 4 nodes using multicast to distribute messages
>>between the nodes. After some hours of sending and receiving, one or
>>more of my nodes will hang. The last time, 3 of 4 machines were hung and
>>I was able to get a dump from one of them. This one seems to indicate
>>that there may be a spin lock deadlock in buf_safe_discard. It shows up
>>twice in this stack dump. It looks like the first buf_safe_discard gets
>>interrupted while holding the lock. The second buf_safe_discard seems
>>to be called from link_recv_proto_msg (the address pointed to in
>>tipc_recv_msg is just after the call to link_recv_proto_msg).
>
>After talking with Daniel, this is probably not a deadlock, since
>spin_lock_bh stops the softirqs. Anyway, I'll try to put in some debug
>code in the buf_safe_discard/buf_discard functions to try to narrow it
>down.
>
>Mark.
|
From: Mark H. <ma...@os...> - 2004-06-07 17:40:57
|
On Mon, 2004-06-07 at 09:58, Mark Haverkamp wrote:
> I have code running on 4 nodes using multicast to distribute messages
> between the nodes. After some hours of sending and receiving, one or
> more of my nodes will hang. The last time, 3 of 4 machines were hung and
> I was able to get a dump from one of them. This one seems to indicate
> that there may be a spin lock deadlock in buf_safe_discard. It shows up
> twice in this stack dump. It looks like the first buf_safe_discard gets
> interrupted while holding the lock. The second buf_safe_discard seems
> to be called from link_recv_proto_msg (the address pointed to in
> tipc_recv_msg is just after the call to link_recv_proto_msg).

After talking with Daniel, this is probably not a deadlock, since spin_lock_bh stops the softirqs. Anyway, I'll try to put in some debug code in the buf_safe_discard/buf_discard functions to try to narrow it down.

Mark.
--
Mark Haverkamp <ma...@os...>
|
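The rule Daniel pointed out can be illustrated with a generic pattern (not TIPC source): a lock shared between process context and softirq context is safe as long as the process-context side takes the _bh variant, because that disables bottom halves on the local CPU while the lock is held. All names below are illustrative; 2.6.7-era kernel API is assumed.

    /* Generic sketch of process-context vs. softirq locking (2.6-era API).
       Assumes 'queue' has been initialized with skb_queue_head_init(). */
    #include <linux/spinlock.h>
    #include <linux/skbuff.h>

    static spinlock_t queue_lock = SPIN_LOCK_UNLOCKED;
    static struct sk_buff_head queue;

    /* Process context (e.g. a send path): _bh keeps net_rx_action and
       friends from running on this CPU until the lock is dropped, so the
       softirq path below can never interrupt us and re-take the lock. */
    static void enqueue_from_process(struct sk_buff *skb)
    {
        spin_lock_bh(&queue_lock);
        __skb_queue_tail(&queue, skb);
        spin_unlock_bh(&queue_lock);
    }

    /* Softirq context (e.g. packet receive): a plain spin_lock suffices,
       since softirqs do not preempt each other on the same CPU. */
    static void drain_from_softirq(void)
    {
        struct sk_buff *skb;

        spin_lock(&queue_lock);
        while ((skb = __skb_dequeue(&queue)) != NULL)
            kfree_skb(skb);
        spin_unlock(&queue_lock);
    }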
From: Mark H. <ma...@os...> - 2004-06-07 16:58:34
|
I have code running on 4 nodes using multicast to distribute messages between the nodes. After some hours of sending and receiving, one or more of my nodes will hang. The last time, 3 of 4 machines were hung and I was able to get a dump from one of them. This one seems to indicate that there may be a spin lock deadlock in buf_safe_discard. It shows up twice in this stack dump. It looks like the first buf_safe_discard gets interrupted while holding the lock. The second buf_safe_discard seems to be called from link_recv_proto_msg (the address pointed to in tipc_recv_msg is just after the call to link_recv_proto_msg).

SysRq : Show Regs

Pid: 1599, comm: event_server
EIP: 0060:[<f8e69cdf>] CPU: 0
EIP is at buf_safe_discard+0x6f/0x270 [tipc]
EFLAGS: 00000246 Not tainted (2.6.7-rc2)
EAX: ef329bf8 EBX: ef328f50 ECX: 0b6b03c9 EDX: 00000000
ESI: ef328f94 EDI: ef326f50 EBP: efb9db48 DS: 007b ES: 007b
CR0: 8005003b CR2: 4206f5e0 CR3: 35326000 CR4: 000006c0
 [<c01032d5>] show_regs+0x145/0x170
 [<c026b541>] __handle_sysrq+0x71/0x100
 [<c02824bc>] receive_chars+0x12c/0x280
 [<c02829c6>] serial8250_interrupt+0x176/0x1d0
 [<c010785b>] handle_IRQ_event+0x3b/0x70
 [<c0107cc1>] do_IRQ+0xe1/0x230
 [<c0105cd0>] common_interrupt+0x18/0x20
 [<f8e50398>] tipc_recv_msg+0x788/0x8a0 [tipc]
 [<f8e6e2f9>] recv_msg+0x39/0x50 [tipc]
 [<c037e052>] netif_receive_skb+0x172/0x1b0
 [<c037e114>] process_backlog+0x84/0x120
 [<c037e230>] net_rx_action+0x80/0x120
 [<c0126068>] __do_softirq+0xb8/0xc0
 [<c01260a5>] do_softirq+0x35/0x40
 [<f8e69d23>] buf_safe_discard+0xb3/0x270 [tipc]
 [<f8e66723>] nameseq_deliver+0x83/0x420 [tipc]
 [<f8e66ced>] bcast_port_recv+0x4d/0x80 [tipc]
 [<f8e67e65>] tipc_forward_buf2nameseq+0x1c5/0x270 [tipc]
 [<f8e681eb>] tipc_multicast+0x2db/0x4e0 [tipc]
 [<f8e6b20a>] send_msg+0x18a/0x210 [tipc]
 [<c03747ce>] sock_sendmsg+0x8e/0xb0
 [<c0375dc1>] sys_sendto+0xe1/0x100
 [<c03766ba>] sys_socketcall+0x17a/0x240
 [<c0105363>] syscall_call+0x7/0xb

--
Mark Haverkamp <ma...@os...>
|
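One hypothetical form the debug code mentioned above could take: record which CPU is inside the critical section and complain on re-entry before the lock hangs. The function name follows the stack dump; the lock name and the instrumentation itself are invented for illustration, since buf_safe_discard's real body is not shown in this thread.

    /* Hypothetical instrumentation sketch (2.6-era kernel C); everything
       here except the buf_safe_discard name is invented for illustration. */
    #include <linux/spinlock.h>
    #include <linux/smp.h>
    #include <linux/kernel.h>

    static spinlock_t discard_lock = SPIN_LOCK_UNLOCKED;
    static volatile int discard_owner = -1;   /* CPU currently in the region */

    void buf_safe_discard_enter_dbg(void)
    {
        /* If this CPU already owns the lock, spin_lock_bh below would spin
           forever: report the recursion before we deadlock, not after. */
        if (discard_owner == smp_processor_id())
            printk(KERN_ERR "buf_safe_discard: recursive entry on CPU %d\n",
                   smp_processor_id());
        spin_lock_bh(&discard_lock);
        discard_owner = smp_processor_id();
    }

    void buf_safe_discard_exit_dbg(void)
    {
        discard_owner = -1;
        spin_unlock_bh(&discard_lock);
    }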
From: Jon M. <jon...@er...> - 2004-06-07 13:04:01
|
The spec was written for the TIPC-1.2 line of code, where there was a separate code for Network Subscriptions. I later realized that this could be done as a sub-case of name subscriptions, as described under 1), so the network subscriptions are gone in the 1.3 code.
Regards
/jon

Yin, Hu wrote:

>Hi, All
>
>I cannot distinguish Name subscription from Network subscription. Who can
>tell me the difference between them? Thank you in advance!
>
>Here is the description abstracted from TIPC's test spec:
>
>--------------------------------------------------------------------------
>Name subscription Test Program
>------------------------------
>1) Open several subscriptions for <0,0,0xffffffff> from one socket.
>   This will in practice work like a network subscription,
>   and one should receive an event per subscription each time a
>   processor goes down or comes back. Disable/enable the bearers
>   on a processor to verify.
>2) Call the synchronous NAME_AVAILABILITY for one name that does
>   exist and for one that does not.
>
>Network Subscription Test Program
>---------------------------------
>Open a subscription for <0> from a socket. Disable/enable a processor.
>--------------------------------------------------------------------------
>
>Thanks,
>
>Nick
|
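When such a subscription fires, the event identifies the affected processor by its TIPC network address, which packs zone.cluster.node into 8/12/12 bits. A small standalone sketch of that decoding follows; the bit split is standard TIPC, while the sample address is made up.

    /* Decode a TIPC network address into its <zone.cluster.node> parts. */
    #include <stdio.h>
    #include <stdint.h>

    /* Standard TIPC address packing: zone (8 bits), cluster (12), node (12). */
    static unsigned tipc_zone(uint32_t a)    { return a >> 24; }
    static unsigned tipc_cluster(uint32_t a) { return (a >> 12) & 0xfff; }
    static unsigned tipc_node(uint32_t a)    { return a & 0xfff; }

    int main(void)
    {
        uint32_t addr = (1u << 24) | (1u << 12) | 3;   /* sample: <1.1.3> */

        printf("<%u.%u.%u>\n",
               tipc_zone(addr), tipc_cluster(addr), tipc_node(addr));
        return 0;
    }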