From: jason <huz...@gm...> - 2011-03-24 14:54:01
|
Hi All, When I studying tipc-1.7.6 code. I see a potaintial issue that when a user thread which calls send_msg() and is scheduled out because port congestion, it may not be wake up any more by tipc_recv_msg(). Let's see send_msg related pseudo code: 01 add self to link.sleep_queue 02 if port.congest == 1 { 03 schedule(); 04 } and see tipc_recv_msg related code: 01 if link.sleep_queue empty { 02 port.congest = 0; 03 wakeup; 04 } Say we are SMP. Right after CPU0 runs at tipc_recv_msg related code line 01 and see link.sleep_queue IS empty, CPU1 execute send_msg related code from line 01 to 04. In this conditon, send_msg will be schedule out by CPU1 and CPU0 loss the opptunity to wake it up. If there is no incoming message any more, then send_msg thread will not be waken any more. If my analysis is right, I think one way to solve whis issue is to swap tipc_recv_msg related code line 01 and 02, and smp_wmb should also be placed between them. But real code change is not as easy as a code line swapping. Thanks! -- Yours, jason |
From: Stephens, A. <all...@wi...> - 2011-03-24 19:17:11
|
Hi Jason: I don't think the problem you have identified can actually occur because TIPC's link "aliveness" checking ensures that tipc_recv_msg() gets invoked at regular intervals. (That is, a link endpoint must periodically receive either payload messages or LINK_PROTOCOL messages from its peer endpoint, or else the link is declared dead and gets reset.) So even if a sending thread adds its port to a link's list of waiting ports after tipc_recv_msg() has marked the link as non-congested and then sleeps, the next invocation of tipc_recv_msg() will wake the sleeping thread up and allow execution to continue. Regards, Al > -----Original Message----- > From: jason [mailto:huz...@gm...] > Sent: Thursday, March 24, 2011 10:54 AM > To: tip...@li... > Subject: [tipc-discussion] send_msg() may not be waken up by > tipc_recv_msg() when port congestion happen > > Hi All, > > When I studying tipc-1.7.6 code. I see a potaintial issue that when a user > thread which calls send_msg() and is scheduled out because port congestion, > it may not be wake up any more by tipc_recv_msg(). > > Let's see send_msg related pseudo code: > 01 add self to link.sleep_queue > 02 if port.congest == 1 { > 03 schedule(); > 04 } > > and see tipc_recv_msg related code: > > 01 if link.sleep_queue empty { > 02 port.congest = 0; > 03 wakeup; > 04 } > Say we are SMP. Right after CPU0 runs at tipc_recv_msg related code line 01 > and see link.sleep_queue IS empty, CPU1 execute send_msg related code > from line 01 to 04. In this conditon, send_msg will be schedule out by CPU1 > and > CPU0 loss the opptunity to wake it up. If there is no incoming message any > more, then send_msg thread will not be waken any more. > > If my analysis is right, I think one way to solve whis issue is to swap > tipc_recv_msg related code line 01 and 02, and smp_wmb should also be > placed between them. But real code change is not as easy as a code line > swapping. > > Thanks! > > -- > Yours, > jason > ------------------------------------------------------------------------------ > Enable your software for Intel(R) Active Management Technology to meet > the growing manageability and security demands of your customers. > Businesses are taking advantage of Intel(R) vPro (TM) technology - will your > software be a part of the solution? Download the Intel(R) Manageability > Checker today! http://p.sf.net/sfu/intel-dev2devmar > _______________________________________________ > tipc-discussion mailing list > tip...@li... > https://lists.sourceforge.net/lists/listinfo/tipc-discussion |
From: jason <huz...@gm...> - 2011-03-25 15:35:02
|
Sorry, My mistake. I mistook the link->wating_ports for the thread wait queue. So there are two queues in question. The correct version of execute follow is like this: send_msg():[when meets port congestion] 1) add port into link->waiting_ports 2) prepare_to_wait(add self thread into sk->sk_sleep) 3) if port->congested then call schedule() tipc_recv_msg(): 1) if !list_empty(&l_ptr->waiting_ports) { 1.1) list_del_init(&p_ptr->wait_list); 1.2) port->congested = 0; 1.3) wakeup waiting thread. } Basically, there is no problem about those sequense above. But I still suggest add a smp_wmb between step 1.2 and 1.3 for tipc_recv_msg(). Otherwise, on some SMP & memory reordering architecture, send_msg() dead lock would still occur. 2011/3/25 Stephens, Allan <all...@wi...> > Hi Jason: > > I don't think the problem you have identified can actually occur because > TIPC's link "aliveness" checking ensures that tipc_recv_msg() gets invoked > at regular intervals. (That is, a link endpoint must periodically receive > either payload messages or LINK_PROTOCOL messages from its peer endpoint, or > else the link is declared dead and gets reset.) So even if a sending thread > adds its port to a link's list of waiting ports after tipc_recv_msg() has > marked the link as non-congested and then sleeps, the next invocation of > tipc_recv_msg() will wake the sleeping thread up and allow execution to > continue. > > Regards, > Al > > > -----Original Message----- > > From: jason [mailto:huz...@gm...] > > Sent: Thursday, March 24, 2011 10:54 AM > > To: tip...@li... > > Subject: [tipc-discussion] send_msg() may not be waken up by > > tipc_recv_msg() when port congestion happen > > > > Hi All, > > > > When I studying tipc-1.7.6 code. I see a potaintial issue that when a > user > > thread which calls send_msg() and is scheduled out because port > congestion, > > it may not be wake up any more by tipc_recv_msg(). > > > > Let's see send_msg related pseudo code: > > 01 add self to link.sleep_queue > > 02 if port.congest == 1 { > > 03 schedule(); > > 04 } > > > > and see tipc_recv_msg related code: > > > > 01 if link.sleep_queue empty { > > 02 port.congest = 0; > > 03 wakeup; > > 04 } > > Say we are SMP. Right after CPU0 runs at tipc_recv_msg related code line > 01 > > and see link.sleep_queue IS empty, CPU1 execute send_msg related code > > from line 01 to 04. In this conditon, send_msg will be schedule out by > CPU1 > > and > > CPU0 loss the opptunity to wake it up. If there is no incoming message > any > > more, then send_msg thread will not be waken any more. > > > > If my analysis is right, I think one way to solve whis issue is to swap > > tipc_recv_msg related code line 01 and 02, and smp_wmb should also be > > placed between them. But real code change is not as easy as a code line > > swapping. > > > > Thanks! > > > > -- > > Yours, > > jason > > > ------------------------------------------------------------------------------ > > Enable your software for Intel(R) Active Management Technology to meet > > the growing manageability and security demands of your customers. > > Businesses are taking advantage of Intel(R) vPro (TM) technology - will > your > > software be a part of the solution? Download the Intel(R) Manageability > > Checker today! http://p.sf.net/sfu/intel-dev2devmar > > _______________________________________________ > > tipc-discussion mailing list > > tip...@li... > > https://lists.sourceforge.net/lists/listinfo/tipc-discussion > -- Yours, jason |
From: Jon M. <jon...@er...> - 2011-03-24 19:40:47
|
This reminds me that we should probably rename tipc_recv_msg() to tipc_recv_pkt(), and likewise for some other functions. Maybe something for the tipc-2.0 catchup drive. ///jon Jon Maloy M.Sc. EE Researcher Ericsson Canada Broadband and Systems Research 8400 Decarie H4P 2N2, Montreal, Quebec, Canada Phone + 1 514 345-7900 x42056 Mobile + 1 514 591-5578 jon...@er... www.ericsson.com > -----Original Message----- > From: Stephens, Allan [mailto:all...@wi...] > Sent: March-24-11 15:17 > To: jason; tip...@li... > Subject: Re: [tipc-discussion] send_msg() may not be waken up > by tipc_recv_msg() when port congestion happen > > Hi Jason: > > I don't think the problem you have identified can actually > occur because TIPC's link "aliveness" checking ensures that > tipc_recv_msg() gets invoked at regular intervals. (That is, > a link endpoint must periodically receive either payload > messages or LINK_PROTOCOL messages from its peer endpoint, or > else the link is declared dead and gets reset.) So even if a > sending thread adds its port to a link's list of waiting > ports after tipc_recv_msg() has marked the link as > non-congested and then sleeps, the next invocation of > tipc_recv_msg() will wake the sleeping thread up and allow > execution to continue. > > Regards, > Al > > > -----Original Message----- > > From: jason [mailto:huz...@gm...] > > Sent: Thursday, March 24, 2011 10:54 AM > > To: tip...@li... > > Subject: [tipc-discussion] send_msg() may not be waken up by > > tipc_recv_msg() when port congestion happen > > > > Hi All, > > > > When I studying tipc-1.7.6 code. I see a potaintial issue > that when a > > user thread which calls send_msg() and is scheduled out > because port > > congestion, it may not be wake up any more by tipc_recv_msg(). > > > > Let's see send_msg related pseudo code: > > 01 add self to link.sleep_queue > > 02 if port.congest == 1 { > > 03 schedule(); > > 04 } > > > > and see tipc_recv_msg related code: > > > > 01 if link.sleep_queue empty { > > 02 port.congest = 0; > > 03 wakeup; > > 04 } > > Say we are SMP. Right after CPU0 runs at tipc_recv_msg related code > > line 01 and see link.sleep_queue IS empty, CPU1 execute send_msg > > related code from line 01 to 04. In this conditon, send_msg will be > > schedule out by CPU1 and CPU0 loss the opptunity to wake it up. If > > there is no incoming message any more, then send_msg thread > will not > > be waken any more. > > > > If my analysis is right, I think one way to solve whis issue is to > > swap tipc_recv_msg related code line 01 and 02, and smp_wmb should > > also be placed between them. But real code change is not > as easy as a > > code line swapping. > > > > Thanks! > > > > -- > > Yours, > > jason > > > ---------------------------------------------------------------------- > > -------- Enable your software for Intel(R) Active Management > > Technology to meet the growing manageability and security > demands of > > your customers. > > Businesses are taking advantage of Intel(R) vPro (TM) technology - > > will your software be a part of the solution? Download the Intel(R) > > Manageability Checker today! http://p.sf.net/sfu/intel-dev2devmar > > _______________________________________________ > > tipc-discussion mailing list > > tip...@li... > > https://lists.sourceforge.net/lists/listinfo/tipc-discussion > > -------------------------------------------------------------- > ---------------- > Enable your software for Intel(R) Active Management > Technology to meet the growing manageability and security > demands of your customers. Businesses are taking advantage of > Intel(R) vPro (TM) technology - will your software be a part > of the solution? Download the Intel(R) Manageability Checker > today! http://p.sf.net/sfu/intel-dev2devmar > _______________________________________________ > tipc-discussion mailing list > tip...@li... > https://lists.sourceforge.net/lists/listinfo/tipc-discussion > |
From: Stephens, A. <all...@wi...> - 2011-03-24 19:52:19
|
Jon wrote: > This reminds me that we should probably rename tipc_recv_msg() to > tipc_recv_pkt(), and likewise for some other functions. > Maybe something for the tipc-2.0 catchup drive. > > ///jon Agreed. The only difficulty is that cosmetic changes like this need to be integrated in patches that are specifically created for this purpose, rather than being included as part of patches that make functional changes to TIPC. What I suspect will happen is that once the bulk of the functional changes in TIPC 2.0 are done I'll end up submitting one cosmetic patch (or perhaps a small number of them) that clean up anything of this sort that is still hanging around. -- Al |