Thread: [SSI-devel] An amusing bug. But what on earth should it be doing.
Brought to you by:
brucewalker,
rogertsang
From: John H. <john@Calva.COM> - 2008-04-17 15:21:44
|
Investigating a kernel BUG report, unlocking a lock that was not locked, bug #1944781 I find the following code in cluster/ssi/ipc/ipcshm_svr.c: int ssi_shm_cleanup(clusternode_t svrnode, int id, clusternode_t clinode) { [...] if (svrnode == this_node) { [...] ret = do_ssi_shm_noclients(id, svrnode, 2); if (ret) { [...] /* drop remote locks because has attachs */ abort_rmid(id); Interesting. What remote locks might it be talking about? void abort_rmid(int id) { [...] nl = NSC_NODELIST_COPY(svp->shm_nodelist); [...] while ((node = NSC_NODELIST_GET_NEXT(&cookie, nl)) != CLUSTERNODE_INVAL) { if (node == this_node) continue; ret = RIPC_DROP_LOCKS(node, &rval, id); } NSC_NODELIST_FREE(nl); So it's going to call the ripc_drop_locks function on all nodes that have processes attached to the shm segment: int ripc_drop_locks( clusternode_t node, int *rval, int id) { struct shmid_kernel *shp = shm_cli_get(id); if (shp) shp->shm_perm.mode &= ~SHM_LOCK_DEST; ipc_drop_locks(id, (struct kern_ipc_perm *)shp, &shm_ids, 1); *rval = 0; return 0; } In ipc/util.c: void ipc_drop_locks(int id, struct kern_ipc_perm *perm, struct ipc_ids *ids, int table) { if (id > 0) ipc_unlock(perm); [...] } And: void ipc_unlock(struct kern_ipc_perm* perm) { spin_unlock(&perm->lock); rcu_read_unlock(); } So if id is > 0 it's going to call spin_unlock. But it didn't call spin_lock, so *BUG*. Why should (despite it's name) ripc_drop_locks call ipc_drop_locks? |
From: Walker, B. J (HP-Labs) <bru...@hp...> - 2008-04-17 17:50:11
|
Just looking at the code you sent, the unlock is dependent on id > 0 and id is a param to ssi_shm_cleanup so it is possible that ssi_shm_cleanup is called with id > 0 if a spinlock was already acquired and otherwise not. Bruce -----Original Message----- From: ssi...@li... [mailto:ssi...@li...] On Behalf Of John Hughes Sent: Thursday, April 17, 2008 8:21 AM To: ssic-linux-devel Subject: [SSI-devel] An amusing bug. But what on earth should it be doing. Investigating a kernel BUG report, unlocking a lock that was not locked, bug #1944781 I find the following code in cluster/ssi/ipc/ipcshm_svr.c: int ssi_shm_cleanup(clusternode_t svrnode, int id, clusternode_t clinode) { [...] if (svrnode == this_node) { [...] ret = do_ssi_shm_noclients(id, svrnode, 2); if (ret) { [...] /* drop remote locks because has attachs */ abort_rmid(id); Interesting. What remote locks might it be talking about? void abort_rmid(int id) { [...] nl = NSC_NODELIST_COPY(svp->shm_nodelist); [...] while ((node = NSC_NODELIST_GET_NEXT(&cookie, nl)) != CLUSTERNODE_INVAL) { if (node == this_node) continue; ret = RIPC_DROP_LOCKS(node, &rval, id); } NSC_NODELIST_FREE(nl); So it's going to call the ripc_drop_locks function on all nodes that have processes attached to the shm segment: int ripc_drop_locks( clusternode_t node, int *rval, int id) { struct shmid_kernel *shp = shm_cli_get(id); if (shp) shp->shm_perm.mode &= ~SHM_LOCK_DEST; ipc_drop_locks(id, (struct kern_ipc_perm *)shp, &shm_ids, 1); *rval = 0; return 0; } In ipc/util.c: void ipc_drop_locks(int id, struct kern_ipc_perm *perm, struct ipc_ids *ids, int table) { if (id > 0) ipc_unlock(perm); [...] } And: void ipc_unlock(struct kern_ipc_perm* perm) { spin_unlock(&perm->lock); rcu_read_unlock(); } So if id is > 0 it's going to call spin_unlock. But it didn't call spin_lock, so *BUG*. Why should (despite it's name) ripc_drop_locks call ipc_drop_locks? ------------------------------------------------------------------------- This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone _______________________________________________ ssic-linux-devel mailing list ssi...@li... https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel |
From: John H. <john@Calva.COM> - 2008-04-18 08:52:27
|
Walker, Bruce J (HP-Labs) wrote: > Just looking at the code you sent, the unlock is dependent on id > 0 and id is a param to ssi_shm_cleanup so it is possible that ssi_shm_cleanup is called with id > 0 if a spinlock was already acquired and otherwise not. > Yeah, but this is a cross-node IPC. There should be no way one node could take a spinlock on another node in one IPC and release it in another. |
From: John H. <john@Calva.COM> - 2008-04-18 16:32:27
|
John Hughes wrote: > Investigating a kernel BUG report, unlocking a lock that was not locked, > bug #1944781 I find the following code in cluster/ssi/ipc/ipcshm_svr.c: > [...] > So if id is > 0 it's going to call spin_unlock. But it didn't call > spin_lock, so *BUG*. > > Why should (despite it's name) ripc_drop_locks call ipc_drop_locks? > Well, it should, but it only wants ipc_drop_locks to do the "up ()" operation, not the "ipc_unlock()" so it should call ipc_drop_locks with id "0". |