From: Graeme G. <gr...@ar...> - 2010-04-26 15:33:14
I'm trying to track down a tricky race problem in libusb V1. The symptom is that a synchronous operation completes successfully, but takes its full timeout period to return, rather than returning as soon as the transfer is complete.

The scenario seems to be the following:

Thread A sets up the transfer, and adds an fd to the list to be monitored. It then calls libusb_handle_events_timeout() in io.c to wait for completion. In libusb_handle_events_timeout() it gets the next timeout, which is for the transfer it just created.

Before A gets to libusb_try_lock_events(), another thread (B) runs, and the transfer completes. B handles the transfer completion, and removes the fd from the list.

Thread A then continues, and calls libusb_try_lock_events(), which succeeds ("doing our own event handling"). A then waits for its transfer to complete, even though it has already completed and been removed from the list. Eventually A times out, notices that the transfer has finished, and returns success.

Attached is a trace. I've added extra debugging: "th xxxx" is the thread ID, and "tm xxxx" is the timestamp in msec since process creation. I reworked it to keep each message in one piece, rather than being broken up by the alternating threads. Thread A is 1572, Thread B is 2780. The overall transfer takes 2605 msec, whereas it should normally take < 10 msec.

The back end is libusb0.sys. I'm not sure whether the problem is specific to that back end, or whether the timing just changes enough to make it more evident there. I did notice that the problem became harder and harder to reproduce the more debugging messages I added.

I'm not that familiar with how it all should be working in this circumstance. Any ideas on what to look for next, or how it should be fixed?
[It seems to me at the moment that libusb_handle_events_timeout() needs a pointer to the libusb_transfer passed into it, so that after obtaining the event lock it can check whether the transfer it is interested in is still outstanding, and skip the call to handle_events() if it is not. But then I don't claim to understand it all that well yet.]

Graeme Gill.
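
To make the interleaving concrete, here is a toy, single-threaded model of it in C. It is only a sketch and not libusb code: every name in it (toy_transfer, toy_complete, toy_thread_a_wait and so on) is made up for illustration, the two threads are simulated by ordinary function calls in a fixed order, and the "recheck" flag stands in for the idea of testing the transfer's state once the event lock is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in libusb. */
struct toy_transfer {
    bool completed;   /* set by whichever thread handles the completion */
    bool fd_listed;   /* is the transfer's fd still in the monitored list? */
};

/* A freshly submitted transfer: not complete, fd being monitored. */
static struct toy_transfer toy_submit(void)
{
    struct toy_transfer t = { false, true };
    return t;
}

/* Completion handling: mark the transfer done and drop its fd. */
static void toy_complete(struct toy_transfer *t)
{
    t->completed = true;
    t->fd_listed = false;
}

/*
 * Thread A's wait. Returns the number of msec A spends blocked.
 *   timeout_ms       - timeout A fetched before taking the event lock
 *   normal_ms        - assumed duration of an undisturbed transfer
 *   b_runs_in_window - thread B completes the transfer before A locks
 *   recheck          - test the transfer state after taking the lock
 */
static int toy_thread_a_wait(struct toy_transfer *t, int timeout_ms,
                             int normal_ms, bool b_runs_in_window,
                             bool recheck)
{
    assert(timeout_ms >= normal_ms);

    /* 1. A has already fetched the next timeout (timeout_ms). */

    /* 2. Race window: B may run here, before A takes the event lock. */
    if (b_runs_in_window)
        toy_complete(t);

    /* 3. A takes the event lock; in this model that always succeeds. */

    /* 4. Guard under test: with the lock held, is there still work? */
    if (recheck && t->completed)
        return 0;

    /* 5. A polls. With the fd gone, nothing wakes it before the timeout. */
    if (!t->fd_listed)
        return timeout_ms;

    /* Otherwise the fd becomes ready and A handles the completion itself. */
    toy_complete(t);
    return normal_ms;
}
```

With a placeholder timeout of 1000 msec and a placeholder normal duration of 10 msec (neither figure is taken from the trace), toy_thread_a_wait() returns 1000 when B runs in the window and the re-check is off, 0 when the re-check is on, and 10 when B does not interfere.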