From: Oleg D. <gr...@li...> - 2002-12-22 19:23:40
|
Jeff Dike <jd...@ka...> wrote: JD> I fixed the dead xterm race and exported os_getpid. Now that this one is gone, how about looking again at race in ubd, that is when ubd serves same request twice (in SMP mode only). I still have no confirmation that this is present in 2.4, but it is certainly in there for 2.5. To refresh your memory: do_ubd variable is used as a lock against two threads servicing ubd request queue. do_ubd is accessed/modified non atomically, so it is a window between checking and updating this variable, and therefore it is possible another thread to not notice that another one is already doing its job. Also the possibility of two threads running is simple: Suppose we just emptied our queue but have not exited ubd_handler() yet. Now another request arrives and do_ubd_request() is called. It checks do_ubd that was already cleared in ubd_handler(). Now now ubd_handler() thread finishes and calls do_ubd_request() too. That one also checks do_ubd that is still zero, and updates it. Now the do_ubd_request() called by VFS also updates do_ubd. We now have two threads servicing same request and second one will fail on "I/O op mismatch" assertion in ubd_handler(). My proposed fix is to convert misleading do_ubd thing that is only used for locking (at least in 2.5) to a lock with atomic access. Here is my proposed fix for 2.5: ===== arch/um/drivers/ubd_kern.c 1.17 vs edited ===== --- 1.17/arch/um/drivers/ubd_kern.c Mon Oct 21 11:16:57 2002 +++ edited/arch/um/drivers/ubd_kern.c Wed Oct 30 18:14:09 2002 @@ -29,6 +29,7 @@ #include "linux/blkpg.h" #include "linux/genhd.h" #include "linux/spinlock.h" +#include "linux/bitops.h" #include "asm/segment.h" #include "asm/uaccess.h" #include "asm/irq.h" @@ -48,7 +49,10 @@ static spinlock_t ubd_io_lock = SPIN_LOCK_UNLOCKED; static spinlock_t ubd_lock = SPIN_LOCK_UNLOCKED; -static void (*do_ubd)(void); +/* We set this when we asked io thread to do some work, + by using this flag we can avoid do_ubd_request to schedule + io more then once for any given request. (race seen on SMP) */ +static long ubd_servicing; static int ubd_open(struct inode * inode, struct file * filp); static int ubd_release(struct inode * inode, struct file * file); @@ -374,7 +378,6 @@ struct request *rq = elv_next_request(&ubd_queue); int n; - do_ubd = NULL; intr_count++; n = read_ubd_fs(thread_fd, &req, sizeof(req)); if(n != sizeof(req)){ @@ -383,6 +386,7 @@ spin_lock(&ubd_io_lock); end_request(rq, 0); spin_unlock(&ubd_io_lock); + clear_bit(1, &ubd_servicing); return; } @@ -391,6 +395,7 @@ panic("I/O op mismatch"); ubd_finish(rq, req.error); + clear_bit(1, &ubd_servicing); reactivate_fd(thread_fd, UBD_IRQ); do_ubd_request(&ubd_queue); } @@ -829,16 +834,21 @@ } } else { - if(do_ubd || list_empty(&q->queue_head)) return; + /* if there is no requests or if another thread have + already started async io - return */ + if(list_empty(&q->queue_head) || + test_and_set_bit(1, &ubd_servicing)) return; + req = elv_next_request(q); err = prepare_request(req, &io_req); if(!err){ - do_ubd = ubd_handler; n = write_ubd_fs(thread_fd, (char *) &io_req, sizeof(io_req)); if(n != sizeof(io_req)) printk("write to io thread failed, " "errno = %d\n", -n); + } else { + clear_bit(1, &ubd_servicing); } } } Bye, Oleg |