From: Jon M. <jon...@er...> - 2004-05-19 14:45:51
Hi Hu,

It does fragment/defragment messages, but only up to a limit, depending on the socket type you are using. If you use message-oriented sockets, i.e. SOCK_SEQPACKET, SOCK_RDM or SOCK_DGRAM, it will only fragment up to MAX_MSG_SIZE, which is indeed 66000 bytes. If you use SOCK_STREAM, it will fragment up to a size of 100 Mbyte per sent "chunk". I must admit that I have still not tested this feature beyond 66k (this is on my TODO list), so you may well run into bugs here. Any feedback on this is welcome.

Regards /Jon

Yin, Hu wrote:
> Hi Jon,
>
> I'm an intern at the Intel China Software Lab, currently doing TIPC validation work here.
>
> Today I ran into a problem. When I wrote a small program to send/receive messages of different sizes between two programs, I got an error (errno 22, "Invalid argument") whenever the message size exceeded 66000 bytes.
>
> Doesn't TIPC fragment and defragment messages?
>
> Best regards,
> Nick Yin
> Intel China Software Lab
>
> The content of this email message solely contains my own personal views, and not those of my employer.
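A minimal user-space sketch of the size probe discussed in this thread, assuming the AF_TIPC socket API as it appears in later mainline kernels (<linux/tipc.h>); the service type/instance values are made up for illustration, and a receiver bound to that name is assumed to exist. Per the thread, the send above 66000 bytes should fail with EINVAL on a message-oriented socket.

```c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/socket.h>
#include <linux/tipc.h>

int main(void)
{
	int sd = socket(AF_TIPC, SOCK_RDM, 0);   /* connectionless, reliable */
	struct sockaddr_tipc dst;
	size_t sizes[] = { 1000, 66000, 66001 }; /* last one should fail */
	char *buf = calloc(1, 70000);
	size_t i;

	if (sd < 0) { perror("socket"); return 1; }

	memset(&dst, 0, sizeof(dst));
	dst.family = AF_TIPC;
	dst.addrtype = TIPC_ADDR_NAME;
	dst.addr.name.name.type = 18888;         /* example service type */
	dst.addr.name.name.instance = 17;        /* example instance */

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		ssize_t rc = sendto(sd, buf, sizes[i], 0,
				    (struct sockaddr *)&dst, sizeof(dst));
		printf("send %zu bytes: rc=%zd errno=%d\n",
		       sizes[i], rc, rc < 0 ? errno : 0);
	}
	free(buf);
	return 0;
}
```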
From: Mark H. <ma...@os...> - 2004-05-19 14:29:23
On Wed, 2004-05-19 at 04:04, Yin, Hu wrote:
> Hi Jon,
>
> I'm an intern at the Intel China Software Lab, currently doing TIPC validation work here.
>
> Today I ran into a problem. When I wrote a small program to send/receive messages of different sizes between two programs, I got an error (errno 22, "Invalid argument") whenever the message size exceeded 66000 bytes.
>
> Doesn't TIPC fragment and defragment messages?

tipc.h has:

#define TIPC_MAX_MSG_SIZE 66000

Messages are fragmented, but at a smaller size (the max_packet size in the bearer structure), up to a maximum message size. I believe 66000 is the largest size of a single message handled by TIPC.

Mark.
--
Mark Haverkamp <ma...@os...>
From: Jon M. <jon...@er...> - 2004-05-19 13:50:52
I think I have succeeded in making the port/transport-level lock handling somewhat easier to understand, without changing the basic locking policy. New:

1: Callbacks to socket.c (dispatcher/wakeup dispatcher) are now called with, and return with, the port lock held.
2: Symmetry: a port or subscription is now always locked/unlocked in the same function.
3: No more dual-purpose functions in the reference manager; ref_lock_acquire()/ref_unlock_discard() etc. are gone. The port lock is still located in the corresponding reference entry, though, and is accessed via easily comprehensible inline functions: port_lock()/port_unlock() etc.
4: Better comments about what the locks protect and what the lock functions do.

I will continue with similar work on the node locks in a few days, but I would welcome some feedback first.

/Jon
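A hedged sketch of what port_lock()/port_unlock() helpers of the kind described above might look like: the spinlock lives in the reference-table entry rather than in the port itself, and dereferencing only succeeds while the reference is still valid. All structure and field names below are illustrative, not the actual TIPC source.

```c
#include <linux/spinlock.h>
#include <linux/types.h>

struct port {
	u32 ref;                        /* this port's reference number */
	/* ... */
};

struct ref_entry {
	spinlock_t lock;                /* protects the referenced object */
	void *object;                   /* the port, or NULL once discarded */
	u32 reference;                  /* index bits + generation bits */
};

struct ref_entry *ref_entry_of(u32 ref);   /* hypothetical table lookup */

/* Lock the entry and return the port only if the reference is still valid. */
static inline struct port *port_lock(u32 ref)
{
	struct ref_entry *e = ref_entry_of(ref);

	spin_lock_bh(&e->lock);
	if (e->object && e->reference == ref)
		return e->object;
	spin_unlock_bh(&e->lock);       /* stale reference: nothing to return */
	return NULL;
}

static inline void port_unlock(struct port *p)
{
	spin_unlock_bh(&ref_entry_of(p->ref)->lock);
}
```

The point of pairing lock and dereference in one helper is the symmetry mentioned above: whoever obtained the port pointer is also the one who releases the lock, in the same function.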
From: Yin, H. <hu...@in...> - 2004-05-19 11:04:23
Hi Jon,

I'm an intern at the Intel China Software Lab, currently doing TIPC validation work here.

Today I ran into a problem. When I wrote a small program to send/receive messages of different sizes between two programs, I got an error (errno 22, "Invalid argument") whenever the message size exceeded 66000 bytes.

Doesn't TIPC fragment and defragment messages?

Best regards,
Nick Yin
Intel China Software Lab

The content of this email message solely contains my own personal views, and not those of my employer.
From: Jon M. <jon...@er...> - 2004-05-18 17:39:56
Hi Mark, I haven't tried with 64 processes yet, but I will try to reproduce and trouble-shoot this problem when I have time. Right now I am spending some time on making the lock handling at port/socket level more symmetric and easier to follow. It will have some performance implications, but it will not have any major impact, I think. I must admit that I am not familiar with Bugzilla. Is this the base for the bug reporting/tracking system we already have for each project, or is it something else ? The bug report system has only been used sporadically, as you may have noticed, but I have no problems with starting to use it more systematically; - I think indeed we will have to if TIPC makes it into the kernel. /Jon Mark Haverkamp wrote: >I was running a modified tipc benchmark program that used 64 processes >instead of 32. I also had my kernel compiled with page alloc debug >turned on (memory allocations are unmapped when free to catch bad access >as soon as possible). It was part way through the 16K size when the >panic happened. It was on the server side. > >Any thoughts on using the sourceforge bugzilla to keep track of current >bugs? > >Mark. > > >[root@cl019 root]# >Unable to handle kernel paging request at virtual address f1067fe4 > printing eip: >f8a8617e >*pde = 00585067 >*pte = 31067163 >Oops: 0000 [#1] >SMP DEBUG_PAGEALLOC >CPU: 1 >EIP: 0060:[<f8a8617e>] Not tainted >EFLAGS: 00010206 (2.6.6-rc3) >EIP is at tipc_recv_msg+0x15e/0x880 [tipc] >eax: 00000000 ebx: f1067f50 ecx: f104a7f8 edx: f105df50 >esi: efb89e18 edi: f4d8fbf8 ebp: f01cda70 esp: f01cda18 >ds: 007b es: 007b ss: 0068 >Process server_tipc (pid: 1439, threadinfo=f01cc000 task=f01f3a60) >Stack: f632fbf8 00000000 00000086 c1620ce0 00000000 00000086 f01cda3c c01338cb > 00000000 00002ae8 0000abe6 efb89e18 effcbf50 f104a7f8 00000001 00000000 > f01cda90 00000286 c1620ce0 f8aaf400 effcbf50 c054f970 f01cda80 f8aa4319 >Call Trace: > [<c01338cb>] kernel_text_address+0x3b/0x50 > [<f8aa4319>] recv_msg+0x39/0x50 [tipc] > [<c0375e92>] netif_receive_skb+0x172/0x1b0 > [<c0375f54>] process_backlog+0x84/0x120 > [<c0376070>] net_rx_action+0x80/0x120 > [<c0124c38>] __do_softirq+0xb8/0xc0 > [<c0124c75>] do_softirq+0x35/0x40 > [<c0107cf5>] do_IRQ+0x175/0x230 > [<c0375e92>] netif_receive_skb+0x172/0x1b0 > [<c0105ce0>] common_interrupt+0x18/0x20 > [<c0221e39>] __copy_user_zeroing_intel+0x19/0xb0 > [<c0221fc2>] __copy_from_user_ll+0x72/0x80 > [<c0222088>] copy_from_user+0x58/0x80 > [<f8a8530f>] link_send_sections_long+0x30f/0xad0 [tipc] > [<f8a824de>] link_schedule_port+0xfe/0x1b0 [tipc] > [<f8a84d39>] link_send_sections_fast+0x559/0x820 [tipc] > [<c0124c38>] __do_softirq+0xb8/0xc0 > [<f8a96312>] tipc_send+0x92/0x9d0 [tipc] > [<c03f186f>] schedule+0x37f/0x7a0 > [<c0370bd5>] kfree_skbmem+0x25/0x30 > [<f8aa1946>] recv_msg+0x2b6/0x560 [tipc] > [<f8aa12f0>] send_packet+0x90/0x180 [tipc] > [<c011b140>] default_wake_function+0x0/0x20 > [<c036cbde>] sock_sendmsg+0x8e/0xb0 > [<c01193f8>] kernel_map_pages+0x28/0x64 > [<c036c9ba>] sockfd_lookup+0x1a/0x80 > [<c036e101>] sys_sendto+0xe1/0x100 > [<c0128fd2>] del_timer_sync+0x42/0x140 > [<c036d4a9>] sock_poll+0x29/0x30 > [<c01193f8>] kernel_map_pages+0x28/0x64 > [<c036e156>] sys_send+0x36/0x40 > [<c036e9ae>] sys_socketcall+0x12e/0x240 > [<c0105373>] syscall_call+0x7/0xb > >Code: 8b 83 94 00 00 00 48 0f 85 f9 00 00 00 8d 73 44 8b 46 0c 8b > <0>Kernel panic: Fatal exception in interrupt >In interrupt handler - not syncing > > > |
From: Mark H. <ma...@os...> - 2004-05-18 14:23:45
I was running a modified tipc benchmark program that used 64 processes instead of 32. I also had my kernel compiled with page alloc debug turned on (memory allocations are unmapped when free to catch bad access as soon as possible). It was part way through the 16K size when the panic happened. It was on the server side. Any thoughts on using the sourceforge bugzilla to keep track of current bugs? Mark. [root@cl019 root]# Unable to handle kernel paging request at virtual address f1067fe4 printing eip: f8a8617e *pde = 00585067 *pte = 31067163 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 1 EIP: 0060:[<f8a8617e>] Not tainted EFLAGS: 00010206 (2.6.6-rc3) EIP is at tipc_recv_msg+0x15e/0x880 [tipc] eax: 00000000 ebx: f1067f50 ecx: f104a7f8 edx: f105df50 esi: efb89e18 edi: f4d8fbf8 ebp: f01cda70 esp: f01cda18 ds: 007b es: 007b ss: 0068 Process server_tipc (pid: 1439, threadinfo=f01cc000 task=f01f3a60) Stack: f632fbf8 00000000 00000086 c1620ce0 00000000 00000086 f01cda3c c01338cb 00000000 00002ae8 0000abe6 efb89e18 effcbf50 f104a7f8 00000001 00000000 f01cda90 00000286 c1620ce0 f8aaf400 effcbf50 c054f970 f01cda80 f8aa4319 Call Trace: [<c01338cb>] kernel_text_address+0x3b/0x50 [<f8aa4319>] recv_msg+0x39/0x50 [tipc] [<c0375e92>] netif_receive_skb+0x172/0x1b0 [<c0375f54>] process_backlog+0x84/0x120 [<c0376070>] net_rx_action+0x80/0x120 [<c0124c38>] __do_softirq+0xb8/0xc0 [<c0124c75>] do_softirq+0x35/0x40 [<c0107cf5>] do_IRQ+0x175/0x230 [<c0375e92>] netif_receive_skb+0x172/0x1b0 [<c0105ce0>] common_interrupt+0x18/0x20 [<c0221e39>] __copy_user_zeroing_intel+0x19/0xb0 [<c0221fc2>] __copy_from_user_ll+0x72/0x80 [<c0222088>] copy_from_user+0x58/0x80 [<f8a8530f>] link_send_sections_long+0x30f/0xad0 [tipc] [<f8a824de>] link_schedule_port+0xfe/0x1b0 [tipc] [<f8a84d39>] link_send_sections_fast+0x559/0x820 [tipc] [<c0124c38>] __do_softirq+0xb8/0xc0 [<f8a96312>] tipc_send+0x92/0x9d0 [tipc] [<c03f186f>] schedule+0x37f/0x7a0 [<c0370bd5>] kfree_skbmem+0x25/0x30 [<f8aa1946>] recv_msg+0x2b6/0x560 [tipc] [<f8aa12f0>] send_packet+0x90/0x180 [tipc] [<c011b140>] default_wake_function+0x0/0x20 [<c036cbde>] sock_sendmsg+0x8e/0xb0 [<c01193f8>] kernel_map_pages+0x28/0x64 [<c036c9ba>] sockfd_lookup+0x1a/0x80 [<c036e101>] sys_sendto+0xe1/0x100 [<c0128fd2>] del_timer_sync+0x42/0x140 [<c036d4a9>] sock_poll+0x29/0x30 [<c01193f8>] kernel_map_pages+0x28/0x64 [<c036e156>] sys_send+0x36/0x40 [<c036e9ae>] sys_socketcall+0x12e/0x240 [<c0105373>] syscall_call+0x7/0xb Code: 8b 83 94 00 00 00 48 0f 85 f9 00 00 00 8d 73 44 8b 46 0c 8b <0>Kernel panic: Fatal exception in interrupt In interrupt handler - not syncing -- Mark Haverkamp <ma...@os...> |
From: Jon M. <jon...@er...> - 2004-05-17 19:53:17
With the following:

1: Notification handler towards net_device. Links will now go down immediately when the carrier on an interface is lost, and they act correctly on ifconfig commands.
2: Changed the lock handling in port.c according to the description in the previous mail. The only remaining overlap between node locks and port locks is in the link_schedule/wakeup functions. I will consider what more I can do with this and with other parts of the lock handling.
3: Added a verbal description (in net.c) of the whole locking policy.

/jon
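A hedged sketch of a netdevice notification handler of the kind mentioned in point 1 above, so bearers/links can be disabled as soon as an interface goes away. It uses the classic 2.6-era notifier signature where the callback receives the struct net_device pointer directly; the handler name and the tipc_* helpers it would call are illustrative only. In mainline kernels NETDEV_DOWN/UP are administrative (ifconfig) events, while carrier transitions are normally reported via NETDEV_CHANGE together with netif_carrier_ok(), which is relevant to the question raised later in this thread.

```c
#include <linux/netdevice.h>
#include <linux/notifier.h>

static int tipc_netdev_event(struct notifier_block *nb,
			     unsigned long evt, void *ptr)
{
	struct net_device *dev = ptr;

	switch (evt) {
	case NETDEV_GOING_DOWN:
	case NETDEV_DOWN:
		/* interface administratively down: hypothetical helper
		 * that blocks/resets all links using this bearer */
		/* tipc_disable_bearer_for(dev); */
		break;
	case NETDEV_UP:
		/* tipc_enable_bearer_for(dev); */
		break;
	case NETDEV_CHANGE:
		/* carrier state may have changed */
		/* if (!netif_carrier_ok(dev)) tipc_reset_links_for(dev); */
		break;
	}
	return NOTIFY_DONE;
}

static struct notifier_block tipc_netdev_nb = {
	.notifier_call = tipc_netdev_event,
};

/* from module init / exit:
 *   register_netdevice_notifier(&tipc_netdev_nb);
 *   unregister_netdevice_notifier(&tipc_netdev_nb);
 */
```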
From: Jon M. <jon...@er...> - 2004-05-13 19:46:45
Just to avoid misunderstandings: I am not working with the changes I suggested. I am struggling with trying to understand the usage of netdevice notifications, one of the issues Stephen H.'s called a showstopper in his review. Btw, does anybody know if NETDEV_DOWN/UP means that the carrier disappeared/reappeared (I hope so), or does it simply mean that somebody did "ifconfig ethX down" ? (I think so). Is there any description of how this is supposed to be used ? /Jon Jon Maloy wrote: Hi Mark, The general principle I had in mind when designing this was to not keep a node-lock and the port-lock at the same time, but in some cases this looks unavoidable, such as in the wakeup-function you corrected yesterday. In other cases I have just not been aware of the potential problem, even when it is easy to fix. The current problem is in fact on of the latter type; as it is completely unecessary to call port_send_protocol_msg() with the port lock on. What we need here is a small redesign to resolve the whole problem: Everywhere port_send_protocol_msg() is called (in port_timeout(), port_recv_protocol_msg() and a few other places) we should 1) build the message (we rename port_send_protocol_msg() to e.g. port_build_protocol_msg() and let it return an sk_buff* ) 2) Update the port's sequence number, when applicable. 3) Release the port lock. 4) Send the buffer through net_route_msg() There is one call where this is difficult, in port_abort_self((), since this itself is called with the lock on, but on the other hand this only sends a message to the own port, not to any link, so no node lock will be taken. This should resolve the deadlock problem between port and node locks once and for all, I hope. The rest is about proper documenation. Regards /Jon Mark Haverkamp wrote: On Wed, 2004-05-12 at 13:25, Mark Haverkamp wrote: On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: Yeah, that's a classical one. I also think your solution is ok; the pending ports will be awakened at next message reception, so no harm done. Thanks /jon I found another one. I got another hang yesterday after the current deadlock fix. I re-added my spinlock debug code and found out that we're getting a deadlock between the node lock and the tipc_port lock. It looks like the port timeout handler is running on one CPU and a recv_msg is running on the other. I suppose as a workaround, we could make all the spin lock access in wakeup be conditional, but that will probably just make the problem just show up somewhere else. There should probably be an analysis of code paths and determine how the locks interact with each other. I have noticed that there is at least one place where three locks are required. This can cause problems like we've seen when different code paths need multiple locks unless there is some sort of back off method to insure no deadlocks. For instance if a function needs both the node lock and tipc_port lock, something like this could be done: - - - - - - again: spinlock(node_lock); spintrylock(tipc_port_lock) if failed to acquire then release node_lock and goto again. - - - - - - - - Unfortunately, if one of the locks was acquired before the function was entered, releasing that lock may allow some state to change that the caller was assuming wouldn't change and that could cause problems too. So, I'm not sure that this method could work in the link_wakeup_ports function. 
The node lock is held on entry and if the tipc_port lock can't be acquired, you probably can't just release the node lock and wait for it again since that could invalidate the tipc_recv_msg functions assumptions about what the node lock was protecting. Anyway, here is the raw trace from the debug: &node->lock lock timeout Call Trace: [<f8a8543a>] link_send+0xda/0x2a0 [tipc] [<f8a92c8e>] net_route_msg+0x41e/0x43d [tipc] [<f8a94962>] port_send_proto_msg+0x1a2/0x2a0 [tipc] [<f8aa1b10>] k_signal+0x60/0x170 [tipc] [<f8a956be>] port_timeout+0xbe/0x200 [tipc] [<f8aa1ccc>] timeout_receiver+0xac/0x110 [tipc] [<f8aa1a9f>] receive_signal+0x14f/0x160 [tipc] [<c011ab51>] scheduler_tick+0x111/0x690 [<c0124ec3>] tasklet_action+0x73/0xc0 [<c0124bc8>] __do_softirq+0xb8/0xc0 [<c0124c05>] do_softirq+0x35/0x40 [<c011526d>] smp_apic_timer_interrupt+0xdd/0x150 [<c0105d62>] apic_timer_interrupt+0x1a/0x20 [<c01119c0>] delay_pit+0x20/0x30 [<c0221694>] __delay+0x14/0x20 [<f8a8685f>] link_send_sections_long+0x50f/0xb30 [tipc] [<c01152d1>] smp_apic_timer_interrupt+0x141/0x150 [<f8a8366f>] link_schedule_port+0x13f/0x1e0 [tipc] [<f8a86095>] link_send_sections_fast+0x5b5/0x870 [tipc] [<f8a97b92>] tipc_send+0x92/0x9d0 [tipc] [<c03f14cf>] schedule+0x37f/0x7a0 [<f8aa3386>] recv_msg+0x2b6/0x560 [tipc] [<f8aa2d30>] send_packet+0x90/0x180 [tipc] [<c011b0d0>] default_wake_function+0x0/0x20 [<c036c83e>] sock_sendmsg+0x8e/0xb0 [<f8a8784e>] tipc_recv_msg+0x47e/0x8c0 [tipc] [<c01435ba>] buffered_rmqueue+0xfa/0x220 [<c036c61a>] sockfd_lookup+0x1a/0x80 [<c036dd61>] sys_sendto+0xe1/0x100 [<c0128f62>] del_timer_sync+0x42/0x140 [<c036d109>] sock_poll+0x29/0x30 [<c017884b>] do_pollfd+0x5b/0xa0 [<c036ddb6>] sys_send+0x36/0x40 [<c036e60e>] sys_socketcall+0x12e/0x240 [<c0105373>] syscall_call+0x7/0xb port->publ lock timeout Call Trace: [<f8a8383a>] link_wakeup_ports+0x12a/0x1d0 [tipc] [<f8a87bce>] tipc_recv_msg+0x7fe/0x8c0 [tipc] [<c014949d>] __kmalloc+0x19d/0x250 [<f8aa5d59>] recv_msg+0x39/0x50 [tipc] [<c0375af2>] netif_receive_skb+0x172/0x1b0 [<c0375bb4>] process_backlog+0x84/0x120 [<c0375cd0>] net_rx_action+0x80/0x120 [<c0124bc8>] __do_softirq+0xb8/0xc0 [<c0124c05>] do_softirq+0x35/0x40 [<c0107ce5>] do_IRQ+0x175/0x230 [<c0105ea5>] nmi_stack_correct+0x1e/0x2e [<c0105ce0>] common_interrupt+0x18/0x20 |
From: Jon M. <jon...@er...> - 2004-05-13 18:10:12
See below. /jon Daniel McNeil wrote: On Thu, 2004-05-13 at 08:10, Mark Haverkamp wrote: On Wed, 2004-05-12 at 13:25, Mark Haverkamp wrote: On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: Yeah, that's a classical one. I also think your solution is ok; the pending ports will be awakened at next message reception, so no harm done. Thanks /jon I found another one. I got another hang yesterday after the current deadlock fix. I re-added my spinlock debug code and found out that we're getting a deadlock between the node lock and the tipc_port lock. It looks like the port timeout handler is running on one CPU and a recv_msg is running on the other. I suppose as a workaround, we could make all the spin lock access in wakeup be conditional, but that will probably just make the problem just show up somewhere else. There should probably be an analysis of code paths and determine how the locks interact with each other. I agree. The locking hierarchy should be documented. Even if this is just comments a source file, it needs to be done. It should also include exactly what the lock is protecting. Agree completely. I wil try to do my part. I have noticed that there is at least one place where three locks are required. This can cause problems like we've seen when different code paths need multiple locks unless there is some sort of back off method to insure no deadlocks. We also need to analyze if the multiple locks are actually giving us any performance and/or parallelism benefits. If we have to take multiple locks in a common path, that might be causing worse performance and more deadlock potential. We could experiment on that. The one-big-lock approach that we had eariler was heavily critisized, and caused a lot of problems, but so far measurements suggest it was mor efficient when running the benchmark between two nodes. When we start to run real systems I am pretty convinced that the finer granularity will pay off, as we will have thousands of ports and dozens of nodes being accessed in parallel. In the previous version we had seven locks/unlocks per message transmitted: Send side: 1x big_lock/unlock when message was created and sent, 2x buffer locks when buffer was allocated and released. Receive side: Ditto + 1x grab/release of big_lock when message is read to user space. I don't think it is possible to make it much better. In the current version we have ten: Send side: 1x net_lock (read) + 1x node lock + 2 x buffer lock. (No tipc_port lock!) Receive side: 1 net_read_lock + 1x node lock + 1x tipc_port_lock + 1x tipc_port_lock when message is read (in "advance_queue()" + 2x buffer lock). I don't think an increase from seven to ten lock accesses adds any significant overhead to the data-phase path, while the advantages are obvious. The real problem is not performance, but potential deadlock situations in other parts of the code. Those can only be solved by as far as possible avoiding to keep the different locks simultaneously, as I suggested in my previous mail. There may be one two cases more I have missed, but if we find any, this is the solution we should look for. Regards /Jon Daniel |
From: Jon M. <jon...@er...> - 2004-05-13 17:27:47
Hi Mark, The general principle I had in mind when designing this was to not keep a node-lock and the port-lock at the same time, but in some cases this looks unavoidable, such as in the wakeup-function you corrected yesterday. In other cases I have just not been aware of the potential problem, even when it is easy to fix. The current problem is in fact on of the latter type; as it is completely unecessary to call port_send_protocol_msg() with the port lock on. What we need here is a small redesign to resolve the whole problem: Everywhere port_send_protocol_msg() is called (in port_timeout(), port_recv_protocol_msg() and a few other places) we should 1) build the message (we rename port_send_protocol_msg() to e.g. port_build_protocol_msg() and let it return an sk_buff* ) 2) Update the port's sequence number, when applicable. 3) Release the port lock. 4) Send the buffer through net_route_msg() There is one call where this is difficult, in port_abort_self((), since this itself is called with the lock on, but on the other hand this only sends a message to the own port, not to any link, so no node lock will be taken. This should resolve the deadlock problem between port and node locks once and for all, I hope. The rest is about proper documenation. Regards /Jon Mark Haverkamp wrote: On Wed, 2004-05-12 at 13:25, Mark Haverkamp wrote: On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: Yeah, that's a classical one. I also think your solution is ok; the pending ports will be awakened at next message reception, so no harm done. Thanks /jon I found another one. I got another hang yesterday after the current deadlock fix. I re-added my spinlock debug code and found out that we're getting a deadlock between the node lock and the tipc_port lock. It looks like the port timeout handler is running on one CPU and a recv_msg is running on the other. I suppose as a workaround, we could make all the spin lock access in wakeup be conditional, but that will probably just make the problem just show up somewhere else. There should probably be an analysis of code paths and determine how the locks interact with each other. I have noticed that there is at least one place where three locks are required. This can cause problems like we've seen when different code paths need multiple locks unless there is some sort of back off method to insure no deadlocks. For instance if a function needs both the node lock and tipc_port lock, something like this could be done: - - - - - - again: spinlock(node_lock); spintrylock(tipc_port_lock) if failed to acquire then release node_lock and goto again. - - - - - - - - Unfortunately, if one of the locks was acquired before the function was entered, releasing that lock may allow some state to change that the caller was assuming wouldn't change and that could cause problems too. So, I'm not sure that this method could work in the link_wakeup_ports function. The node lock is held on entry and if the tipc_port lock can't be acquired, you probably can't just release the node lock and wait for it again since that could invalidate the tipc_recv_msg functions assumptions about what the node lock was protecting. 
Anyway, here is the raw trace from the debug: &node->lock lock timeout Call Trace: [<f8a8543a>] link_send+0xda/0x2a0 [tipc] [<f8a92c8e>] net_route_msg+0x41e/0x43d [tipc] [<f8a94962>] port_send_proto_msg+0x1a2/0x2a0 [tipc] [<f8aa1b10>] k_signal+0x60/0x170 [tipc] [<f8a956be>] port_timeout+0xbe/0x200 [tipc] [<f8aa1ccc>] timeout_receiver+0xac/0x110 [tipc] [<f8aa1a9f>] receive_signal+0x14f/0x160 [tipc] [<c011ab51>] scheduler_tick+0x111/0x690 [<c0124ec3>] tasklet_action+0x73/0xc0 [<c0124bc8>] __do_softirq+0xb8/0xc0 [<c0124c05>] do_softirq+0x35/0x40 [<c011526d>] smp_apic_timer_interrupt+0xdd/0x150 [<c0105d62>] apic_timer_interrupt+0x1a/0x20 [<c01119c0>] delay_pit+0x20/0x30 [<c0221694>] __delay+0x14/0x20 [<f8a8685f>] link_send_sections_long+0x50f/0xb30 [tipc] [<c01152d1>] smp_apic_timer_interrupt+0x141/0x150 [<f8a8366f>] link_schedule_port+0x13f/0x1e0 [tipc] [<f8a86095>] link_send_sections_fast+0x5b5/0x870 [tipc] [<f8a97b92>] tipc_send+0x92/0x9d0 [tipc] [<c03f14cf>] schedule+0x37f/0x7a0 [<f8aa3386>] recv_msg+0x2b6/0x560 [tipc] [<f8aa2d30>] send_packet+0x90/0x180 [tipc] [<c011b0d0>] default_wake_function+0x0/0x20 [<c036c83e>] sock_sendmsg+0x8e/0xb0 [<f8a8784e>] tipc_recv_msg+0x47e/0x8c0 [tipc] [<c01435ba>] buffered_rmqueue+0xfa/0x220 [<c036c61a>] sockfd_lookup+0x1a/0x80 [<c036dd61>] sys_sendto+0xe1/0x100 [<c0128f62>] del_timer_sync+0x42/0x140 [<c036d109>] sock_poll+0x29/0x30 [<c017884b>] do_pollfd+0x5b/0xa0 [<c036ddb6>] sys_send+0x36/0x40 [<c036e60e>] sys_socketcall+0x12e/0x240 [<c0105373>] syscall_call+0x7/0xb port->publ lock timeout Call Trace: [<f8a8383a>] link_wakeup_ports+0x12a/0x1d0 [tipc] [<f8a87bce>] tipc_recv_msg+0x7fe/0x8c0 [tipc] [<c014949d>] __kmalloc+0x19d/0x250 [<f8aa5d59>] recv_msg+0x39/0x50 [tipc] [<c0375af2>] netif_receive_skb+0x172/0x1b0 [<c0375bb4>] process_backlog+0x84/0x120 [<c0375cd0>] net_rx_action+0x80/0x120 [<c0124bc8>] __do_softirq+0xb8/0xc0 [<c0124c05>] do_softirq+0x35/0x40 [<c0107ce5>] do_IRQ+0x175/0x230 [<c0105ea5>] nmi_stack_correct+0x1e/0x2e [<c0105ce0>] common_interrupt+0x18/0x20 |
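A hedged sketch of the restructuring Jon proposes in the message above: build the protocol message while the port lock is held, update the port state, drop the lock, and only then route the buffer, so a node lock is never taken with a port lock still held. The function names follow the thread; the bodies, the declarations and the message-type argument are illustrative, not the actual TIPC code.

```c
#include <linux/skbuff.h>
#include <linux/types.h>

struct port;                                        /* as in the thread */
struct port *port_lock(u32 ref);                    /* lock + deref by ref */
void port_unlock(struct port *p);
struct sk_buff *port_build_proto_msg(struct port *p, u32 type); /* renamed */
void net_route_msg(struct sk_buff *buf);

static void port_timeout(unsigned long ref)
{
	struct port *p = port_lock(ref);
	struct sk_buff *buf;

	if (!p)
		return;

	/* 1) build the message while the port lock is held */
	buf = port_build_proto_msg(p, 0 /* e.g. a connection probe */);

	/* 2) update per-port state (sequence number etc.) under the lock */

	/* 3) release the port lock ... */
	port_unlock(p);

	/* 4) ... and only then route the buffer, which may take node locks */
	if (buf)
		net_route_msg(buf);
}
```

The design choice is simply to keep the port lock's critical section limited to port state, so the port-lock/node-lock ordering question never arises on this path.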
From: Daniel M. <da...@os...> - 2004-05-13 16:12:44
On Thu, 2004-05-13 at 08:10, Mark Haverkamp wrote: > On Wed, 2004-05-12 at 13:25, Mark Haverkamp wrote: > > On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: > > > Yeah, that's a classical one. I also think your solution is ok; the > > > pending ports will be awakened at next message reception, so > > > no harm done. > > > > > > Thanks /jon > > > > > I found another one. I got another hang yesterday after the current > deadlock fix. I re-added my spinlock debug code and found out that > we're getting a deadlock between the node lock and the tipc_port lock. > It looks like the port timeout handler is running on one CPU and a > recv_msg is running on the other. I suppose as a workaround, we could > make all the spin lock access in wakeup be conditional, but that will > probably just make the problem just show up somewhere else. There > should probably be an analysis of code paths and determine how the locks > interact with each other. I agree. The locking hierarchy should be documented. Even if this is just comments a source file, it needs to be done. It should also include exactly what the lock is protecting. > I have noticed that there is at least one > place where three locks are required. This can cause problems like > we've seen when different code paths need multiple locks unless there is > some sort of back off method to insure no deadlocks. > We also need to analyze if the multiple locks are actually giving us any performance and/or parallelism benefits. If we have to take multiple locks in a common path, that might be causing worse performance and more deadlock potential. Daniel |
From: Mark H. <ma...@os...> - 2004-05-13 15:11:09
On Wed, 2004-05-12 at 13:25, Mark Haverkamp wrote: > On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: > > Yeah, that's a classical one. I also think your solution is ok; the > > pending ports will be awakened at next message reception, so > > no harm done. > > > > Thanks /jon > > I found another one. I got another hang yesterday after the current deadlock fix. I re-added my spinlock debug code and found out that we're getting a deadlock between the node lock and the tipc_port lock. It looks like the port timeout handler is running on one CPU and a recv_msg is running on the other. I suppose as a workaround, we could make all the spin lock access in wakeup be conditional, but that will probably just make the problem just show up somewhere else. There should probably be an analysis of code paths and determine how the locks interact with each other. I have noticed that there is at least one place where three locks are required. This can cause problems like we've seen when different code paths need multiple locks unless there is some sort of back off method to insure no deadlocks. For instance if a function needs both the node lock and tipc_port lock, something like this could be done: - - - - - - again: spinlock(node_lock); spintrylock(tipc_port_lock) if failed to acquire then release node_lock and goto again. - - - - - - - - Unfortunately, if one of the locks was acquired before the function was entered, releasing that lock may allow some state to change that the caller was assuming wouldn't change and that could cause problems too. So, I'm not sure that this method could work in the link_wakeup_ports function. The node lock is held on entry and if the tipc_port lock can't be acquired, you probably can't just release the node lock and wait for it again since that could invalidate the tipc_recv_msg functions assumptions about what the node lock was protecting. 
Anyway, here is the raw trace from the debug: &node->lock lock timeout Call Trace: [<f8a8543a>] link_send+0xda/0x2a0 [tipc] [<f8a92c8e>] net_route_msg+0x41e/0x43d [tipc] [<f8a94962>] port_send_proto_msg+0x1a2/0x2a0 [tipc] [<f8aa1b10>] k_signal+0x60/0x170 [tipc] [<f8a956be>] port_timeout+0xbe/0x200 [tipc] [<f8aa1ccc>] timeout_receiver+0xac/0x110 [tipc] [<f8aa1a9f>] receive_signal+0x14f/0x160 [tipc] [<c011ab51>] scheduler_tick+0x111/0x690 [<c0124ec3>] tasklet_action+0x73/0xc0 [<c0124bc8>] __do_softirq+0xb8/0xc0 [<c0124c05>] do_softirq+0x35/0x40 [<c011526d>] smp_apic_timer_interrupt+0xdd/0x150 [<c0105d62>] apic_timer_interrupt+0x1a/0x20 [<c01119c0>] delay_pit+0x20/0x30 [<c0221694>] __delay+0x14/0x20 [<f8a8685f>] link_send_sections_long+0x50f/0xb30 [tipc] [<c01152d1>] smp_apic_timer_interrupt+0x141/0x150 [<f8a8366f>] link_schedule_port+0x13f/0x1e0 [tipc] [<f8a86095>] link_send_sections_fast+0x5b5/0x870 [tipc] [<f8a97b92>] tipc_send+0x92/0x9d0 [tipc] [<c03f14cf>] schedule+0x37f/0x7a0 [<f8aa3386>] recv_msg+0x2b6/0x560 [tipc] [<f8aa2d30>] send_packet+0x90/0x180 [tipc] [<c011b0d0>] default_wake_function+0x0/0x20 [<c036c83e>] sock_sendmsg+0x8e/0xb0 [<f8a8784e>] tipc_recv_msg+0x47e/0x8c0 [tipc] [<c01435ba>] buffered_rmqueue+0xfa/0x220 [<c036c61a>] sockfd_lookup+0x1a/0x80 [<c036dd61>] sys_sendto+0xe1/0x100 [<c0128f62>] del_timer_sync+0x42/0x140 [<c036d109>] sock_poll+0x29/0x30 [<c017884b>] do_pollfd+0x5b/0xa0 [<c036ddb6>] sys_send+0x36/0x40 [<c036e60e>] sys_socketcall+0x12e/0x240 [<c0105373>] syscall_call+0x7/0xb port->publ lock timeout Call Trace: [<f8a8383a>] link_wakeup_ports+0x12a/0x1d0 [tipc] [<f8a87bce>] tipc_recv_msg+0x7fe/0x8c0 [tipc] [<c014949d>] __kmalloc+0x19d/0x250 [<f8aa5d59>] recv_msg+0x39/0x50 [tipc] [<c0375af2>] netif_receive_skb+0x172/0x1b0 [<c0375bb4>] process_backlog+0x84/0x120 [<c0375cd0>] net_rx_action+0x80/0x120 [<c0124bc8>] __do_softirq+0xb8/0xc0 [<c0124c05>] do_softirq+0x35/0x40 [<c0107ce5>] do_IRQ+0x175/0x230 [<c0105ea5>] nmi_stack_correct+0x1e/0x2e [<c0105ce0>] common_interrupt+0x18/0x20 -- Mark Haverkamp <ma...@os...> |
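A minimal sketch of the back-off pattern Mark outlines above: take the node lock, try the port lock, and if that fails drop the node lock and start over, so two CPUs acquiring the locks in opposite orders cannot deadlock. The lock names mirror the thread; the surrounding function is illustrative.

```c
#include <linux/spinlock.h>

static void with_node_and_port_locks(spinlock_t *node_lock,
				     spinlock_t *port_lock)
{
again:
	spin_lock_bh(node_lock);
	if (!spin_trylock_bh(port_lock)) {
		/* someone holds the port lock and may be waiting for the
		 * node lock; back off completely and retry */
		spin_unlock_bh(node_lock);
		cpu_relax();
		goto again;
	}

	/* ... critical section needing both locks ... */

	spin_unlock_bh(port_lock);
	spin_unlock_bh(node_lock);
}
```

As Mark notes, this only works when the caller does not already hold one of the locks on entry; releasing a lock that protects the caller's assumptions just moves the problem, which is why it does not fit link_wakeup_ports() as-is.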
From: Jon M. <jon...@er...> - 2004-05-12 21:11:44
PS: There were more comments further down in my response, just in case you maye have missed them. /jon Mark Haverkamp wrote: On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: Yeah, that's a classical one. I also think your solution is ok; the pending ports will be awakened at next message reception, so no harm done. Thanks /jon OK, here is what I plan to check in. Take a look. I fixed a few other things that seemed wrong. See comments by each diff. Mark. p.s. here is another problem I've been seeing: If I do management port access during congestion, the I/O never completes even when the congestion is over. ---------------------------------------- cvs diff -u media.c reg.c link.c -------------------------------------------------- I think that this should be an unlock since it is locked a few lines above. Index: media.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/media.c,v retrieving revision 1.13 diff -u -r1.13 media.c --- media.c 6 May 2004 15:35:31 -0000 1.13 +++ media.c 12 May 2004 20:10:30 -0000 @@ -221,7 +221,7 @@ bearer_schedule_unlocked(this, link); res = 0; } - spin_lock_bh(&this->publ.lock); + spin_unlock_bh(&this->publ.lock); return res; } --------------------------------------------------- Daniel found this one. ref_lock_deref can grab an invalid object pointer if the lock is released too soon. This waits until the ref number is changed so it won't match in ref_lock_deref after it gets the lock. Index: reg.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/reg.c,v retrieving revision 1.8 diff -u -r1.8 reg.c --- reg.c 5 May 2004 18:39:37 -0000 1.8 +++ reg.c 12 May 2004 20:10:31 -0000 @@ -167,7 +167,6 @@ assert(entry->object != 0); assert(entry->data.reference == ref_nb); entry->object = 0; - spin_unlock_bh(&entry->lock); if (ref_table.first_free == 0) ref_table.first_free = index; else @@ -175,6 +174,7 @@ ref_table.entries[ref_table.last_free].data.next_plus_upper |= index; ref_table.last_free = index; entry->data.next_plus_upper = (ref_nb & ~index_mask) + index_mask + 1; + spin_unlock_bh(&entry->lock); write_unlock_bh(&ref_lock); } ------------------------------------------------ link_schedule_port took the global port_lock/port lock in a different order than everywhere else. Added the spin_trylock_bh in link_wakeup_ports. Added an unlock in link_recv_fragment on an error exit. Fixed a couple places where cheking the this pointer looked wrong. 
Index: link.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/link.c,v retrieving revision 1.24 diff -u -r1.24 link.c --- link.c 7 May 2004 23:16:03 -0000 1.24 +++ link.c 12 May 2004 20:10:32 -0000 @@ -440,10 +440,14 @@ int link_schedule_port(struct link *this, uint origport,uint sz) { - struct port *port = port_lock_deref(origport); - if (!port) - return TIPC_CONGESTION; + struct port *port; + spin_lock_bh(&port_lock); + port = port_lock_deref(origport); + if (!port) { + spin_unlock_bh(&port_lock); + return TIPC_CONGESTION; + } if (!port->wakeup) goto exit; if (!list_empty(&port->wait_list)) @@ -453,8 +457,8 @@ port->waiting_pkts = 1 + sz/link_max_pkt(this); list_add_tail(&port->wait_list, &this->waiting_ports); exit: - spin_unlock_bh(&port_lock); spin_unlock_bh(port->publ.lock); + spin_unlock_bh(&port_lock); return TIPC_CONGESTION; } @@ -467,7 +471,8 @@ win = 100000; if (win <= 0) return; - spin_lock_bh(&port_lock); + if (!spin_trylock_bh(&port_lock)) + return; if (link_congested(this)) goto exit; list_for_each_entry_safe(port, tp, &this->waiting_ports, wait_list) { @@ -2365,6 +2370,7 @@ if (msg_size(imsg) > TIPC_MAX_MSG_SIZE + LONG_H_SIZE){ msg_print(CONS,fragm,"<REC<Oversized: "); buf_discard(fbuf); + spin_unlock_bh(&this->owner->lock); return; } buf = buf_acquire(msg_size(imsg)); @@ -2623,7 +2629,7 @@ struct link *this; read_lock_bh(&net_lock); this = link_find_by_name(name); - if (!this){ + if (this){ this->exp_msg_count = LINK_BLOCKED; spin_unlock_bh(&this->owner->lock); res = TIPC_OK; @@ -2640,7 +2646,7 @@ struct link *this; read_lock_bh(&net_lock); this = link_find_by_name(name); - if (!this){ + if (this){ this->exp_msg_count = 0; spin_unlock_bh(&this->owner->lock); res = TIPC_OK; |
From: Jon M. <jon...@er...> - 2004-05-12 21:09:49
Hi, All your corrections look ok. The (!this) bug was a little embarassing... /jon Mark Haverkamp wrote: On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: Yeah, that's a classical one. I also think your solution is ok; the pending ports will be awakened at next message reception, so no harm done. Thanks /jon OK, here is what I plan to check in. Take a look. I fixed a few other things that seemed wrong. See comments by each diff. Mark. p.s. here is another problem I've been seeing: If I do management port access during congestion, the I/O never completes even when the congestion is over. This may be related to message priorities. When the internal manager tries to respond to a message during congestion he can not be blocked, as we can do with sockets/processes, so I set it's priority to TIPC_NON_REJECTABLE. Is it possible that tipc_send2name() etc still returns TIPC_CONGESTION in such cases ? This should not happen, but if the importance priority by some reason does not make it into the message, or is not handled properly along the data path, who knows? Hint: Check the return values of tipc_send2XX() in mng_respond(), that may give a clue. ---------------------------------------- cvs diff -u media.c reg.c link.c -------------------------------------------------- I think that this should be an unlock since it is locked a few lines above. Index: media.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/media.c,v retrieving revision 1.13 diff -u -r1.13 media.c --- media.c 6 May 2004 15:35:31 -0000 1.13 +++ media.c 12 May 2004 20:10:30 -0000 @@ -221,7 +221,7 @@ bearer_schedule_unlocked(this, link); res = 0; } - spin_lock_bh(&this->publ.lock); + spin_unlock_bh(&this->publ.lock); return res; } Definitely a bug. --------------------------------------------------- Daniel found this one. ref_lock_deref can grab an invalid object pointer if the lock is released too soon. This waits until the ref number is changed so it won't match in ref_lock_deref after it gets the lock. Index: reg.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/reg.c,v retrieving revision 1.8 diff -u -r1.8 reg.c --- reg.c 5 May 2004 18:39:37 -0000 1.8 +++ reg.c 12 May 2004 20:10:31 -0000 @@ -167,7 +167,6 @@ assert(entry->object != 0); assert(entry->data.reference == ref_nb); entry->object = 0; - spin_unlock_bh(&entry->lock); if (ref_table.first_free == 0) ref_table.first_free = index; else @@ -175,6 +174,7 @@ ref_table.entries[ref_table.last_free].data.next_plus_upper |= index; ref_table.last_free = index; entry->data.next_plus_upper = (ref_nb & ~index_mask) + index_mask + 1; + spin_unlock_bh(&entry->lock); write_unlock_bh(&ref_lock); } Also a bug. ------------------------------------------------ link_schedule_port took the global port_lock/port lock in a different order than everywhere else. Added the spin_trylock_bh in link_wakeup_ports. Added an unlock in link_recv_fragment on an error exit. Fixed a couple places where cheking the this pointer looked wrong. 
Index: link.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/link.c,v retrieving revision 1.24 diff -u -r1.24 link.c --- link.c 7 May 2004 23:16:03 -0000 1.24 +++ link.c 12 May 2004 20:10:32 -0000 @@ -440,10 +440,14 @@ int link_schedule_port(struct link *this, uint origport,uint sz) { - struct port *port = port_lock_deref(origport); - if (!port) - return TIPC_CONGESTION; + struct port *port; + spin_lock_bh(&port_lock); + port = port_lock_deref(origport); + if (!port) { + spin_unlock_bh(&port_lock); + return TIPC_CONGESTION; + } if (!port->wakeup) goto exit; if (!list_empty(&port->wait_list)) @@ -453,8 +457,8 @@ port->waiting_pkts = 1 + sz/link_max_pkt(this); list_add_tail(&port->wait_list, &this->waiting_ports); exit: - spin_unlock_bh(&port_lock); spin_unlock_bh(port->publ.lock); + spin_unlock_bh(&port_lock); return TIPC_CONGESTION; } @@ -467,7 +471,8 @@ win = 100000; if (win <= 0) return; - spin_lock_bh(&port_lock); + if (!spin_trylock_bh(&port_lock)) + return; if (link_congested(this)) goto exit; list_for_each_entry_safe(port, tp, &this->waiting_ports, wait_list) { @@ -2365,6 +2370,7 @@ if (msg_size(imsg) > TIPC_MAX_MSG_SIZE + LONG_H_SIZE){ msg_print(CONS,fragm,"<REC<Oversized: "); buf_discard(fbuf); + spin_unlock_bh(&this->owner->lock); return; } buf = buf_acquire(msg_size(imsg)); @@ -2623,7 +2629,7 @@ struct link *this; read_lock_bh(&net_lock); this = link_find_by_name(name); - if (!this){ + if (this){ this->exp_msg_count = LINK_BLOCKED; spin_unlock_bh(&this->owner->lock); res = TIPC_OK; @@ -2640,7 +2646,7 @@ struct link *this; read_lock_bh(&net_lock); this = link_find_by_name(name); - if (!this){ + if (this){ this->exp_msg_count = 0; spin_unlock_bh(&this->owner->lock); res = TIPC_OK; |
From: Mark H. <ma...@os...> - 2004-05-12 20:26:01
On Wed, 2004-05-12 at 12:09, Jon Maloy wrote: > Yeah, that's a classical one. I also think your solution is ok; the > pending ports will be awakened at next message reception, so > no harm done. > > Thanks /jon > > OK, here is what I plan to check in. Take a look. I fixed a few other things that seemed wrong. See comments by each diff. Mark. p.s. here is another problem I've been seeing: If I do management port access during congestion, the I/O never completes even when the congestion is over. ---------------------------------------- cvs diff -u media.c reg.c link.c -------------------------------------------------- I think that this should be an unlock since it is locked a few lines above. Index: media.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/media.c,v retrieving revision 1.13 diff -u -r1.13 media.c --- media.c 6 May 2004 15:35:31 -0000 1.13 +++ media.c 12 May 2004 20:10:30 -0000 @@ -221,7 +221,7 @@ bearer_schedule_unlocked(this, link); res = 0; } - spin_lock_bh(&this->publ.lock); + spin_unlock_bh(&this->publ.lock); return res; } --------------------------------------------------- Daniel found this one. ref_lock_deref can grab an invalid object pointer if the lock is released too soon. This waits until the ref number is changed so it won't match in ref_lock_deref after it gets the lock. Index: reg.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/reg.c,v retrieving revision 1.8 diff -u -r1.8 reg.c --- reg.c 5 May 2004 18:39:37 -0000 1.8 +++ reg.c 12 May 2004 20:10:31 -0000 @@ -167,7 +167,6 @@ assert(entry->object != 0); assert(entry->data.reference == ref_nb); entry->object = 0; - spin_unlock_bh(&entry->lock); if (ref_table.first_free == 0) ref_table.first_free = index; else @@ -175,6 +174,7 @@ ref_table.entries[ref_table.last_free].data.next_plus_upper |= index; ref_table.last_free = index; entry->data.next_plus_upper = (ref_nb & ~index_mask) + index_mask + 1; + spin_unlock_bh(&entry->lock); write_unlock_bh(&ref_lock); } ------------------------------------------------ link_schedule_port took the global port_lock/port lock in a different order than everywhere else. Added the spin_trylock_bh in link_wakeup_ports. Added an unlock in link_recv_fragment on an error exit. Fixed a couple places where cheking the this pointer looked wrong. 
Index: link.c =================================================================== RCS file: /cvsroot/tipc/source/unstable/net/tipc/link.c,v retrieving revision 1.24 diff -u -r1.24 link.c --- link.c 7 May 2004 23:16:03 -0000 1.24 +++ link.c 12 May 2004 20:10:32 -0000 @@ -440,10 +440,14 @@ int link_schedule_port(struct link *this, uint origport,uint sz) { - struct port *port = port_lock_deref(origport); - if (!port) - return TIPC_CONGESTION; + struct port *port; + spin_lock_bh(&port_lock); + port = port_lock_deref(origport); + if (!port) { + spin_unlock_bh(&port_lock); + return TIPC_CONGESTION; + } if (!port->wakeup) goto exit; if (!list_empty(&port->wait_list)) @@ -453,8 +457,8 @@ port->waiting_pkts = 1 + sz/link_max_pkt(this); list_add_tail(&port->wait_list, &this->waiting_ports); exit: - spin_unlock_bh(&port_lock); spin_unlock_bh(port->publ.lock); + spin_unlock_bh(&port_lock); return TIPC_CONGESTION; } @@ -467,7 +471,8 @@ win = 100000; if (win <= 0) return; - spin_lock_bh(&port_lock); + if (!spin_trylock_bh(&port_lock)) + return; if (link_congested(this)) goto exit; list_for_each_entry_safe(port, tp, &this->waiting_ports, wait_list) { @@ -2365,6 +2370,7 @@ if (msg_size(imsg) > TIPC_MAX_MSG_SIZE + LONG_H_SIZE){ msg_print(CONS,fragm,"<REC<Oversized: "); buf_discard(fbuf); + spin_unlock_bh(&this->owner->lock); return; } buf = buf_acquire(msg_size(imsg)); @@ -2623,7 +2629,7 @@ struct link *this; read_lock_bh(&net_lock); this = link_find_by_name(name); - if (!this){ + if (this){ this->exp_msg_count = LINK_BLOCKED; spin_unlock_bh(&this->owner->lock); res = TIPC_OK; @@ -2640,7 +2646,7 @@ struct link *this; read_lock_bh(&net_lock); this = link_find_by_name(name); - if (!this){ + if (this){ this->exp_msg_count = 0; spin_unlock_bh(&this->owner->lock); res = TIPC_OK; -- Mark Haverkamp <ma...@os...> |
From: Jon M. <jon...@er...> - 2004-05-12 19:10:00
Yeah, that's a classical one. I also think your solution is ok; the pending ports will be awakened at next message reception, so no harm done. Thanks /jon Mark Haverkamp wrote: >Here is what looks to be happening with the spin lock deadlock. I >replaced all the spin_lock_bh calls with a wrapper that tries to get the >lock for a while then prints out a debug message if it can't get the >lock. > >As an experiment, I changed the spin_lock_bh in link_wakeup_ports to a >trylock and exited if it couldn't get the lock. I am now not able to >get the deadlock. > > >CPU 0: >release -- > tipc_delete_port (get port lock) -- > port_abort_peer -- > port_send_proto_msg -- > net_route_msg -- > link_send (get node lock) -- (hung spinning) > >CPU 1: >common_interrupt -- > do_softirq -- > net_rx_action -- > netif_receive_skb -- > recv_msg (tipc eth) -- > tipc_recv_msg (get node lock) -- > link_wakeup_ports (get port lock) -- (hung spinning) > >Stack dumps: > >port lock timeout >Call Trace: > [<f8a837ab>] link_wakeup_ports+0x9b/0x230 [tipc] > [<f8a87c2e>] tipc_recv_msg+0x7fe/0x8c0 [tipc] > [<c014949d>] __kmalloc+0x19d/0x250 > [<f8aa5db9>] recv_msg+0x39/0x50 [tipc] > [<c0375af2>] netif_receive_skb+0x172/0x1b0 > [<c0375bb4>] process_backlog+0x84/0x120 > [<c0375cd0>] net_rx_action+0x80/0x120 > [<c0124bc8>] __do_softirq+0xb8/0xc0 > [<c0124c05>] do_softirq+0x35/0x40 > [<c0107ce5>] do_IRQ+0x175/0x230 > [<c0105ce0>] common_interrupt+0x18/0x20 > [<c0221c91>] copy_from_user+0x1/0x80 > [<f8a866bf>] link_send_sections_long+0x30f/0xb30 [tipc] > [<c0221694>] __delay+0x14/0x20 > [<f8a8366f>] link_schedule_port+0x13f/0x1e0 [tipc] > [<f8a860f5>] link_send_sections_fast+0x5b5/0x870 [tipc] > [<c011b12a>] __wake_up_common+0x3a/0x60 > [<f8a97bf2>] tipc_send+0x92/0x9d0 [tipc] > [<c011d736>] __mmdrop+0x36/0x50 > [<c03f15b7>] schedule+0x467/0x7a0 > [<f8aa33e6>] recv_msg+0x2b6/0x560 [tipc] > [<f8aa2d90>] send_packet+0x90/0x180 [tipc] > [<c011b0d0>] default_wake_function+0x0/0x20 > [<c036c83e>] sock_sendmsg+0x8e/0xb0 > [<f8aa5db9>] recv_msg+0x39/0x50 [tipc] > [<c01435ba>] buffered_rmqueue+0xfa/0x220 > [<c036c61a>] sockfd_lookup+0x1a/0x80 > [<c036dd61>] sys_sendto+0xe1/0x100 > [<c0128f62>] del_timer_sync+0x42/0x140 > [<c036d109>] sock_poll+0x29/0x30 > [<c017884b>] do_pollfd+0x5b/0xa0 > [<c036ddb6>] sys_send+0x36/0x40 > [<c036e60e>] sys_socketcall+0x12e/0x240 > [<c0105373>] syscall_call+0x7/0xb > >&node->lock lock timeout >Call Trace: > [<f8a8549a>] link_send+0xda/0x2a0 [tipc] > [<f8a92cee>] net_route_msg+0x41e/0x43d [tipc] > [<f8a949c2>] port_send_proto_msg+0x1a2/0x2a0 [tipc] > [<f8a95983>] port_abort_peer+0x83/0x90 [tipc] > [<f8a9458f>] tipc_deleteport+0x19f/0x280 [tipc] > [<f8aa25b2>] release+0x72/0x130 [tipc] > [<c036c76b>] sock_release+0x7b/0xc0 > [<c036d176>] sock_close+0x36/0x50 > [<c016315a>] __fput+0x10a/0x120 > [<c0161597>] filp_close+0x57/0x90 > [<c0121dbc>] put_files_struct+0x7c/0xf0 > [<c0122d5a>] do_exit+0x23a/0x5a0 > [<c012aa35>] __dequeue_signal+0xf5/0x1b0 > [<c0123240>] do_group_exit+0xe0/0x150 > [<c012ab1d>] dequeue_signal+0x2d/0x90 > [<c012cbef>] get_signal_to_deliver+0x26f/0x510 > [<c0105136>] do_signal+0xb6/0xf0 > [<c036ddb6>] sys_send+0x36/0x40 > [<c036e60e>] sys_socketcall+0x12e/0x240 > [<c01051cb>] do_notify_resume+0x5b/0x5d > [<c01053be>] work_notifysig+0x13/0x15 > > > |
From: Mark H. <ma...@os...> - 2004-05-12 18:24:38
Here is what looks to be happening with the spin lock deadlock. I replaced all the spin_lock_bh calls with a wrapper that tries to get the lock for a while then prints out a debug message if it can't get the lock. As an experiment, I changed the spin_lock_bh in link_wakeup_ports to a trylock and exited if it couldn't get the lock. I am now not able to get the deadlock. CPU 0: release -- tipc_delete_port (get port lock) -- port_abort_peer -- port_send_proto_msg -- net_route_msg -- link_send (get node lock) -- (hung spinning) CPU 1: common_interrupt -- do_softirq -- net_rx_action -- netif_receive_skb -- recv_msg (tipc eth) -- tipc_recv_msg (get node lock) -- link_wakeup_ports (get port lock) -- (hung spinning) Stack dumps: port lock timeout Call Trace: [<f8a837ab>] link_wakeup_ports+0x9b/0x230 [tipc] [<f8a87c2e>] tipc_recv_msg+0x7fe/0x8c0 [tipc] [<c014949d>] __kmalloc+0x19d/0x250 [<f8aa5db9>] recv_msg+0x39/0x50 [tipc] [<c0375af2>] netif_receive_skb+0x172/0x1b0 [<c0375bb4>] process_backlog+0x84/0x120 [<c0375cd0>] net_rx_action+0x80/0x120 [<c0124bc8>] __do_softirq+0xb8/0xc0 [<c0124c05>] do_softirq+0x35/0x40 [<c0107ce5>] do_IRQ+0x175/0x230 [<c0105ce0>] common_interrupt+0x18/0x20 [<c0221c91>] copy_from_user+0x1/0x80 [<f8a866bf>] link_send_sections_long+0x30f/0xb30 [tipc] [<c0221694>] __delay+0x14/0x20 [<f8a8366f>] link_schedule_port+0x13f/0x1e0 [tipc] [<f8a860f5>] link_send_sections_fast+0x5b5/0x870 [tipc] [<c011b12a>] __wake_up_common+0x3a/0x60 [<f8a97bf2>] tipc_send+0x92/0x9d0 [tipc] [<c011d736>] __mmdrop+0x36/0x50 [<c03f15b7>] schedule+0x467/0x7a0 [<f8aa33e6>] recv_msg+0x2b6/0x560 [tipc] [<f8aa2d90>] send_packet+0x90/0x180 [tipc] [<c011b0d0>] default_wake_function+0x0/0x20 [<c036c83e>] sock_sendmsg+0x8e/0xb0 [<f8aa5db9>] recv_msg+0x39/0x50 [tipc] [<c01435ba>] buffered_rmqueue+0xfa/0x220 [<c036c61a>] sockfd_lookup+0x1a/0x80 [<c036dd61>] sys_sendto+0xe1/0x100 [<c0128f62>] del_timer_sync+0x42/0x140 [<c036d109>] sock_poll+0x29/0x30 [<c017884b>] do_pollfd+0x5b/0xa0 [<c036ddb6>] sys_send+0x36/0x40 [<c036e60e>] sys_socketcall+0x12e/0x240 [<c0105373>] syscall_call+0x7/0xb &node->lock lock timeout Call Trace: [<f8a8549a>] link_send+0xda/0x2a0 [tipc] [<f8a92cee>] net_route_msg+0x41e/0x43d [tipc] [<f8a949c2>] port_send_proto_msg+0x1a2/0x2a0 [tipc] [<f8a95983>] port_abort_peer+0x83/0x90 [tipc] [<f8a9458f>] tipc_deleteport+0x19f/0x280 [tipc] [<f8aa25b2>] release+0x72/0x130 [tipc] [<c036c76b>] sock_release+0x7b/0xc0 [<c036d176>] sock_close+0x36/0x50 [<c016315a>] __fput+0x10a/0x120 [<c0161597>] filp_close+0x57/0x90 [<c0121dbc>] put_files_struct+0x7c/0xf0 [<c0122d5a>] do_exit+0x23a/0x5a0 [<c012aa35>] __dequeue_signal+0xf5/0x1b0 [<c0123240>] do_group_exit+0xe0/0x150 [<c012ab1d>] dequeue_signal+0x2d/0x90 [<c012cbef>] get_signal_to_deliver+0x26f/0x510 [<c0105136>] do_signal+0xb6/0xf0 [<c036ddb6>] sys_send+0x36/0x40 [<c036e60e>] sys_socketcall+0x12e/0x240 [<c01051cb>] do_notify_resume+0x5b/0x5d [<c01053be>] work_notifysig+0x13/0x15 -- Mark Haverkamp <ma...@os...> |
From: Jon M. <jon...@er...> - 2004-05-12 14:55:10
Well, we do have bottom halves (timer and signal tasklets) that may interfere with message sending/reception if executed, even on UPs. I am sure there are cases where the _bh() version is not really needed, but I wanted to keep it safe when I developed it. Something we can look into later, maybe.

/jon

Mark Haverkamp wrote:
> On Wed, 2004-05-12 at 07:15, Jon Maloy wrote:
> > Mark,
> > I think you can easily verify your hypothesis by replacing spin_lock_bh() with a spin_trylock_bh() in tipc_recv_msg(), and then just discard the packet and continue the loop if it fails. (If this happens the packet will be retransmitted anyway.) It is worth a try.
>
> I'll give it a try. Out of curiosity, why are you using the *_bh versions of spinlock?
From: Mark H. <ma...@os...> - 2004-05-12 14:21:12
On Wed, 2004-05-12 at 07:15, Jon Maloy wrote: > Mark, > I think you can easily verify your hypothesis by replacing > spin_lock_bh() with a spin_trylock_bh() in tipc_recv_msg(), and > then just discard the packet and continue the loop if it fails. > (If this happens the packet will be retransmitted anyway.) > It is worth a try. I'll give it a try. Out of curiosity, why are you using the *_bh versions of spinlock? -- Mark Haverkamp <ma...@os...> |
From: Mark H. <ma...@os...> - 2004-05-12 14:18:49
On Tue, 2004-05-11 at 16:22, Jon Maloy wrote: > I am not sure. In your first analysis of the problem you > found that it was a dying process that was causing the > processor to hang while trying to grab the node lock, which > was then presumably held by the tipc_rev_msg interrupt, > or a tasklet. > That is the opposite scenario of what you describe here. I see the hang both ways. > > Also, this was no different in the old implementation: > we shared the same lock between user level/tasklet code > and interrupts without any problems. If you are right, > why don't we have deadlocks all the time then ? That is a good point, but it does seem a possibility. > > I think I will re-read Rusty Lynch "unreliable guide" > once more... That is Rusty Russell. -- Mark Haverkamp <ma...@os...> |
From: Jon M. <jon...@er...> - 2004-05-12 14:15:58
Mark, I think you can easily verify your hypothesis by replacing spin_lock_bh() with a spin_trylock_bh() in tipc_recv_msg(), and then just discard the packet and continue the loop if it fails. (If this happens the packet will be retransmitted anyway.) It is worth a try. /Jon Mark Haverkamp wrote: On Thu, 2004-05-06 at 16:37, Jon Maloy wrote: I think your analysis is correct, but I don't know where this omission happens. The problem I am trying to solve may be related, - when I run parallel links between two nodes comunication on one link sometimes seem to stop under heavy load. With a little luck we are looking for the same bug. /Jon Jon, I was thinking about this today and it occurred to me that spin_lock_bh doesn't prevent interrupts from happening. If true, we can get in a deadlock situation when a CPU has the node lock, an ethernet interrupt happens causing tipc_recv_msg to get called. One of the first things that tipc_recv_msg does is try to get the node lock. This seems to be a possible explanation for the spin hang on the node lock. Does this make sense to you? Mark. Mark Haverkamp wrote: Jon, Daniel and I have been seeing a tipc hang for 3 or 4 weeks when we kill a running application in a certain order. While running the tipc benchmark program we can get tipc to hang the computer by killing the client while it has the 32 processes running. Although, to get the hang, I have to have tried to run some management port accesses which are stalled due to congestion. After doing some tracing, I have narrowed it down to an exiting process spinning while trying to get the node lock. Our assumption is that some other process hasn't released the lock by accident, although its not obvious where. I have included the stack dump from the sysrq P console command. SysRq : Show Regs Pid: 2001, comm: client_tipc_tp EIP: 0060:[<f8a913d9>] CPU: 0 EIP is at .text.lock.link+0xd7/0x3ce [tipc] EFLAGS: 00000286 Not tainted (2.6.6-rc3) EAX: f7c8ef6c EBX: 00000000 ECX: 01001011 EDX: 00000013 ESI: f7c8eee0 EDI: f359a000 EBP: f359bcf8 DS: 007b ES: 007b CR0: 8005003b CR2: 080e2ce8 CR3: 0053d000 CR4: 000006d0 Call Trace: [<c0126a38>] __do_softirq+0xb8/0xc0 [<f8a9818b>] net_route_msg+0x48b/0x4ad [tipc] [<c015b3a1>] __pte_chain_free+0x81/0x90 [<f8a99e6e>] port_send_proto_msg+0x1ae/0x2d0 [tipc] [<f8a9af73>] port_abort_peer+0x83/0x90 [tipc] [<f8a999a1>] tipc_deleteport+0x181/0x2a0 [tipc] [<f8aa7ae2>] release+0x72/0x130 [tipc] [<c0378ff9>] sock_release+0x99/0xf0 [<c0379a16>] sock_close+0x36/0x50 [<c016740d>] __fput+0x12d/0x140 [<c0165857>] filp_close+0x57/0x90 [<c0123adc>] put_files_struct+0x7c/0xf0 [<c0124b1c>] do_exit+0x26c/0x600 [<c012cc05>] __dequeue_signal+0xf5/0x1b0 [<c0125057>] do_group_exit+0x107/0x190 [<c012cced>] dequeue_signal+0x2d/0x90 [<c012f14c>] get_signal_to_deliver+0x28c/0x590 [<c0105286>] do_signal+0xb6/0xf0 [<c037a736>] sys_send+0x36/0x40 [<c037af8e>] sys_socketcall+0x12e/0x240 [<c010531b>] do_notify_resume+0x5b/0x5d [<c010554a>] work_notifysig+0x13/0x15 You can see that the process is trying to exit. I have traced the EIP to the spin_lock_bh(&node->lock) in link_lock_select from a disassembly of link.o. Any ideas on this? Mark. ------------------------------------------------------- This SF.Net email is sponsored by Sleepycat Software Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to deliver higher performing products faster, at low TCO. 
|
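A minimal sketch of the trylock approach suggested above, using invented names (demo_node, demo_recv_msg); only spin_trylock_bh(), spin_unlock_bh() and kfree_skb() are real kernel interfaces, and this is not the actual tipc_recv_msg():

    #include <linux/spinlock.h>
    #include <linux/skbuff.h>

    struct demo_node {
            spinlock_t lock;    /* per-node lock, also taken from process context */
    };

    static void demo_recv_msg(struct demo_node *node, struct sk_buff *buf)
    {
            /* Try the node lock instead of spinning on it in the receive
             * path.  If another context on this CPU already holds it,
             * drop the packet and return; the link layer is assumed to
             * retransmit it later.
             */
            if (!spin_trylock_bh(&node->lock)) {
                    kfree_skb(buf);
                    return;
            }

            /* ... deliver the message while holding node->lock ... */

            spin_unlock_bh(&node->lock);
    }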
From: Jon M. <jon...@er...> - 2004-05-11 23:22:49
|
I am not sure. In your first analysis of the problem you found that it
was a dying process that was causing the processor to hang while trying
to grab the node lock, which was then presumably held by the
tipc_recv_msg interrupt, or a tasklet.
That is the opposite scenario of what you describe here.

Also, this was no different in the old implementation: we shared the
same lock between user level/tasklet code and interrupts without any
problems. If you are right, why don't we have deadlocks all the time
then ?

I think I will re-read Rusty Lynch "unreliable guide" once more...

Regards /Jon

Mark Haverkamp wrote:

On Thu, 2004-05-06 at 16:37, Jon Maloy wrote:

I think your analysis is correct, but I don't know where this omission
happens. The problem I am trying to solve may be related, - when I run
parallel links between two nodes communication on one link sometimes
seems to stop under heavy load.

With a little luck we are looking for the same bug.

/Jon

Jon,

I was thinking about this today and it occurred to me that spin_lock_bh
doesn't prevent interrupts from happening. If true, we can get in a
deadlock situation when a CPU has the node lock, an ethernet interrupt
happens causing tipc_recv_msg to get called. One of the first things
that tipc_recv_msg does is try to get the node lock. This seems to be a
possible explanation for the spin hang on the node lock.

Does this make sense to you?

Mark.

Mark Haverkamp wrote:

Jon,

Daniel and I have been seeing a tipc hang for 3 or 4 weeks when we kill
a running application in a certain order.

While running the tipc benchmark program we can get tipc to hang the
computer by killing the client while it has the 32 processes running.
Although, to get the hang, I have to have tried to run some management
port accesses which are stalled due to congestion. After doing some
tracing, I have narrowed it down to an exiting process spinning while
trying to get the node lock. Our assumption is that some other process
hasn't released the lock by accident, although it's not obvious where.
I have included the stack dump from the sysrq P console command.

SysRq : Show Regs

Pid: 2001, comm: client_tipc_tp
EIP: 0060:[<f8a913d9>] CPU: 0
EIP is at .text.lock.link+0xd7/0x3ce [tipc]
 EFLAGS: 00000286   Not tainted  (2.6.6-rc3)
EAX: f7c8ef6c EBX: 00000000 ECX: 01001011 EDX: 00000013
ESI: f7c8eee0 EDI: f359a000 EBP: f359bcf8 DS: 007b ES: 007b
CR0: 8005003b CR2: 080e2ce8 CR3: 0053d000 CR4: 000006d0
Call Trace:
 [<c0126a38>] __do_softirq+0xb8/0xc0
 [<f8a9818b>] net_route_msg+0x48b/0x4ad [tipc]
 [<c015b3a1>] __pte_chain_free+0x81/0x90
 [<f8a99e6e>] port_send_proto_msg+0x1ae/0x2d0 [tipc]
 [<f8a9af73>] port_abort_peer+0x83/0x90 [tipc]
 [<f8a999a1>] tipc_deleteport+0x181/0x2a0 [tipc]
 [<f8aa7ae2>] release+0x72/0x130 [tipc]
 [<c0378ff9>] sock_release+0x99/0xf0
 [<c0379a16>] sock_close+0x36/0x50
 [<c016740d>] __fput+0x12d/0x140
 [<c0165857>] filp_close+0x57/0x90
 [<c0123adc>] put_files_struct+0x7c/0xf0
 [<c0124b1c>] do_exit+0x26c/0x600
 [<c012cc05>] __dequeue_signal+0xf5/0x1b0
 [<c0125057>] do_group_exit+0x107/0x190
 [<c012cced>] dequeue_signal+0x2d/0x90
 [<c012f14c>] get_signal_to_deliver+0x28c/0x590
 [<c0105286>] do_signal+0xb6/0xf0
 [<c037a736>] sys_send+0x36/0x40
 [<c037af8e>] sys_socketcall+0x12e/0x240
 [<c010531b>] do_notify_resume+0x5b/0x5d
 [<c010554a>] work_notifysig+0x13/0x15

You can see that the process is trying to exit. I have traced the EIP
to the spin_lock_bh(&node->lock) in link_lock_select from a disassembly
of link.o.

Any ideas on this?

Mark.
|
From: Jon M. <jon...@er...> - 2004-05-11 23:07:39
|
I realize the problem, but I have difficulties with seeing how we can
do otherwise if we want lock granularity per node instance.
I tried to make it clear via the function names, but it can certainly
be improved by better comments on this. (I already have such comments
at some places.)
I think improved comments are the best we can do, unless somebody has a
better idea.

Regards /Jon

Daniel McNeil wrote:

On Thu, 2004-04-29 at 08:15, Jon Maloy wrote:

Do you suggest that we delay the LKML-announcement until we have done
these changes ? To replace all the linked lists is intrusive, and it
may take weeks to re-stabilize the code. Otherwise, see my comments
below.

Regards /jon

Jon,

One other thing about TIPC that Mark and I have noticed that might
cause mainline acceptance problems is the:

ref_lock_acquire(void *object, spinlock_t **lock)
ref_lock_deref()
namesub_lock_deref()
ref_unlock_discard(s->publ.s.ref);
port_lock_deref()
spin_unlock_bh(this->publ.lock);
ref_unlock_discard(ref);

2 big concerns:

1. This hides the locking since a pointer is set to the "real" lock.
   The usage above just does not seem clear on what is being locked.

2. The lock is locked in one function and returned with the spinlock
   held and then unlocked in a different function.

Maybe more comments on functions saying which locks are assumed held
and which locks are unlocked that came in locked might clear up some
of the confusion.

Thanks,
Daniel
|
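A sketch of the kind of lock-ownership comments discussed here, on an invented port type (demo_port, demo_port_lock, demo_port_unlock); it is not the real TIPC reference code, only an illustration of stating explicitly that a function returns with a lock held and naming the function that releases it:

    #include <linux/spinlock.h>

    struct demo_port {
            spinlock_t lock;
            int in_use;
    };

    /*
     * demo_port_lock - lock a port before using it
     * @port: the port to lock
     *
     * Returns 1 with port->lock HELD on success; the caller must
     * release it with demo_port_unlock().  Returns 0 with no lock
     * held if the port is not in use.
     */
    static int demo_port_lock(struct demo_port *port)
    {
            spin_lock_bh(&port->lock);
            if (!port->in_use) {
                    spin_unlock_bh(&port->lock);
                    return 0;
            }
            return 1;       /* locked; see demo_port_unlock() */
    }

    /*
     * demo_port_unlock - release a port for which demo_port_lock()
     * returned 1
     */
    static void demo_port_unlock(struct demo_port *port)
    {
            spin_unlock_bh(&port->lock);
    }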
From: Mark H. <ma...@os...> - 2004-05-11 22:24:14
|
On Thu, 2004-05-06 at 16:37, Jon Maloy wrote:
> I think your analysis is correct, but I don't know where this omission
> happens. The problem I am trying to solve may be related,
> - when I run parallel links between two nodes communication on
> one link sometimes seems to stop under heavy load.
>
> With a little luck we are looking for the same bug.
>
> /Jon
>

Jon,

I was thinking about this today and it occurred to me that spin_lock_bh
doesn't prevent interrupts from happening. If true, we can get in a
deadlock situation when a CPU has the node lock, an ethernet interrupt
happens causing tipc_recv_msg to get called. One of the first things
that tipc_recv_msg does is try to get the node lock. This seems to be a
possible explanation for the spin hang on the node lock.

Does this make sense to you?

Mark.

>
> Mark Haverkamp wrote:
>
> >Jon,
> >
> >Daniel and I have been seeing a tipc hang for 3 or 4 weeks when we kill
> >a running application in a certain order.
> >
> >While running the tipc benchmark program we can get tipc to hang the
> >computer by killing the client while it has the 32 processes running.
> >Although, to get the hang, I have to have tried to run some management
> >port accesses which are stalled due to congestion.  After doing some
> >tracing, I have narrowed it down to an exiting process spinning while
> >trying to get the node lock.  Our assumption is that some other process
> >hasn't released the lock by accident, although it's not obvious where.  I
> >have included the stack dump from the sysrq P console command.
> >
> >SysRq : Show Regs
> >
> >Pid: 2001, comm: client_tipc_tp
> >EIP: 0060:[<f8a913d9>] CPU: 0
> >EIP is at .text.lock.link+0xd7/0x3ce [tipc]
> > EFLAGS: 00000286   Not tainted  (2.6.6-rc3)
> >EAX: f7c8ef6c EBX: 00000000 ECX: 01001011 EDX: 00000013
> >ESI: f7c8eee0 EDI: f359a000 EBP: f359bcf8 DS: 007b ES: 007b
> >CR0: 8005003b CR2: 080e2ce8 CR3: 0053d000 CR4: 000006d0
> >Call Trace:
> > [<c0126a38>] __do_softirq+0xb8/0xc0
> > [<f8a9818b>] net_route_msg+0x48b/0x4ad [tipc]
> > [<c015b3a1>] __pte_chain_free+0x81/0x90
> > [<f8a99e6e>] port_send_proto_msg+0x1ae/0x2d0 [tipc]
> > [<f8a9af73>] port_abort_peer+0x83/0x90 [tipc]
> > [<f8a999a1>] tipc_deleteport+0x181/0x2a0 [tipc]
> > [<f8aa7ae2>] release+0x72/0x130 [tipc]
> > [<c0378ff9>] sock_release+0x99/0xf0
> > [<c0379a16>] sock_close+0x36/0x50
> > [<c016740d>] __fput+0x12d/0x140
> > [<c0165857>] filp_close+0x57/0x90
> > [<c0123adc>] put_files_struct+0x7c/0xf0
> > [<c0124b1c>] do_exit+0x26c/0x600
> > [<c012cc05>] __dequeue_signal+0xf5/0x1b0
> > [<c0125057>] do_group_exit+0x107/0x190
> > [<c012cced>] dequeue_signal+0x2d/0x90
> > [<c012f14c>] get_signal_to_deliver+0x28c/0x590
> > [<c0105286>] do_signal+0xb6/0xf0
> > [<c037a736>] sys_send+0x36/0x40
> > [<c037af8e>] sys_socketcall+0x12e/0x240
> > [<c010531b>] do_notify_resume+0x5b/0x5d
> > [<c010554a>] work_notifysig+0x13/0x15
> >
> >You can see that the process is trying to exit.  I have traced the EIP to
> >the spin_lock_bh(&node->lock) in link_lock_select from a disassembly of
> >link.o.
> >
> >Any ideas on this?
> >
> >Mark.
> >

-- 
Mark Haverkamp <ma...@os...>
|
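A minimal sketch of the distinction Mark describes, with invented names (demo_lock, demo_irq_handler) and the 2.6-era interrupt handler signature; it is not TIPC code. spin_lock_bh() only disables softirqs on the local CPU, so a hard interrupt can still arrive and deadlock on the same lock; spin_lock_irqsave() in process context (or a trylock in the handler, as suggested elsewhere in the thread) avoids that:

    #include <linux/spinlock.h>
    #include <linux/interrupt.h>

    static spinlock_t demo_lock = SPIN_LOCK_UNLOCKED;  /* 2.6.6-era initializer */
    static int demo_counter;

    /* Process/tasklet context: spin_lock_bh() keeps softirqs off this
     * CPU, so a tasklet cannot run and re-take demo_lock here.  It does
     * NOT disable hard interrupts.
     */
    void demo_update_from_process(void)
    {
            spin_lock_bh(&demo_lock);
            demo_counter++;
            spin_unlock_bh(&demo_lock);
    }

    /* Hard-interrupt context: if this handler arrives on the CPU that is
     * already inside demo_update_from_process(), the CPU spins on a lock
     * it already holds -- the scenario described above.
     */
    irqreturn_t demo_irq_handler(int irq, void *dev_id, struct pt_regs *regs)
    {
            spin_lock(&demo_lock);      /* deadlocks if this CPU holds the lock */
            demo_counter++;
            spin_unlock(&demo_lock);
            return IRQ_HANDLED;
    }

    /* One conventional fix: use the irq-safe variant in process context,
     * so the interrupt cannot be taken while the lock is held on this CPU.
     */
    void demo_update_from_process_irqsafe(void)
    {
            unsigned long flags;

            spin_lock_irqsave(&demo_lock, flags);
            demo_counter++;
            spin_unlock_irqrestore(&demo_lock, flags);
    }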
From: Daniel M. <da...@os...> - 2004-05-11 21:59:45
|
On Thu, 2004-04-29 at 08:15, Jon Maloy wrote:
> Do you suggest that we delay the LKML-announcement until we have done
> these changes ? To replace all the linked lists is intrusive, and it
> may take weeks to re-stabilize the code. Otherwise, see my comments
> below.
>
> Regards /jon
>

Jon,

One other thing about TIPC that Mark and I have noticed that might
cause mainline acceptance problems is the:

ref_lock_acquire(void *object, spinlock_t **lock)
ref_lock_deref()
namesub_lock_deref()
ref_unlock_discard(s->publ.s.ref);
port_lock_deref()
spin_unlock_bh(this->publ.lock);
ref_unlock_discard(ref);

2 big concerns:

1. This hides the locking since a pointer is set to the "real" lock.
   The usage above just does not seem clear on what is being locked.

2. The lock is locked in one function and returned with the spinlock
   held and then unlocked in a different function.

Maybe more comments on functions saying which locks are assumed held
and which locks are unlocked that came in locked might clear up some
of the confusion.

Thanks,
Daniel
|
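A rough illustration of the pattern behind concerns 1 and 2, using invented names (demo_object, demo_lock_acquire, demo_read); it is not the real ref_lock_acquire() code, just a sketch of why handing the lock back through a spinlock_t ** makes the locking hard to follow at the call site:

    #include <linux/spinlock.h>

    struct demo_object {
            spinlock_t lock;
            int value;
    };

    /* The callee picks the lock and passes a pointer to it back to the
     * caller through **lock.  Nothing at the call site says which lock
     * was taken, and the matching unlock may sit in a different
     * function entirely.
     */
    static void demo_lock_acquire(struct demo_object *obj, spinlock_t **lock)
    {
            spin_lock_bh(&obj->lock);
            *lock = &obj->lock;
    }

    static int demo_read(struct demo_object *obj)
    {
            spinlock_t *lock;
            int val;

            demo_lock_acquire(obj, &lock);  /* what is locked here? */
            val = obj->value;
            spin_unlock_bh(lock);           /* ...and what is released here? */
            return val;
    }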