|
From: Ludovic R. <lud...@gm...> - 2010-06-24 16:31:40
|
2010/6/24 Alan Stern <st...@ro...>: > On Thu, 24 Jun 2010, Ludovic Rousseau wrote: > >> Hello, >> >> I try to stress my code by sending the same command on a loop. But I >> may be hit by a kernel bug or libusb bug. >> I am using libusb git version up-to-date and kernel 2.6.34-1-686 or >> vmlinuz-2.6.32-5-686 (from Debian). Both kernels have the problem. >> >> libusb returns -99 (LIBUSB_ERROR_OTHER) and the kernel logs a line in >> /var/log/syslog: >> Jun 24 15:08:01 trinity kernel: [ 299.676480] usb 3-1.3: usbfs: >> usb_submit_urb returned -121 >> >> I also used my previous kernel 2.6.26-2-686 from Debian stable and >> libusb-1.0.8 and I DO NOT have the problem. So it looks a bug in the >> recent kernels or in the way libusb is using the kernel. > > It is a bug in libusb. It is caused by interaction with a kernel > feature that was added not long ago. I forget when exactly, but after > 2.6.26. An already known bug. Cool. >> Is it a known issue? >> Is it already solved in 2.6.35-rc3? >> What can I do to help find the bug (easy to reproduce for me)? > > Once he's aware of it, Daniel will probably know exactly where it is > and how to fix it. Daniel, can I help you? Do you need more traces? >> As explained in the message I generated a usbmon trace: >> >> f0141500 2572976529 S Bo:3:005:2 -115 11 = 6b010000 0000e400 000002 >> f0141500 2572977025 C Bo:3:005:2 0 11 > >> f0141c80 2572977228 S Bi:3:005:2 -115 16384 < >> f0141c00 2572977267 S Bi:3:005:2 -115 16384 < >> f0141180 2572977305 S Bi:3:005:2 -115 16384 < >> f0141780 2572977344 S Bi:3:005:2 -115 16384 < >> f0141480 2572977385 S Bi:3:005:2 -115 20 < >> f0141c80 2572978068 C Bi:3:005:2 -121 28 = 83120000 0000e402 00004765 >> 6d547769 6e2d5632 2e30302d 47463033 >> f0141c00 2572978119 C Bi:3:005:2 -104 0 >> f0141180 2572978168 C Bi:3:005:2 -104 0 >> f0141780 2572978212 C Bi:3:005:2 -104 0 >> f0141480 2572978213 C Bi:3:005:2 -104 0 >> >> The first command with no error >> >> f01a7800 2573003416 S Bo:3:005:2 -115 11 = 6b010000 0000e500 000002 >> f01a7800 2573005031 C Bo:3:005:2 0 11 > >> f01a7800 2573005562 S Bi:3:005:2 -115 16384 < >> f01a7f80 2573005634 S Bi:3:005:2 -115 16384 < >> f01a7c80 2573005862 S Bi:3:005:2 -115 16384 < >> f01a7500 2573005928 S Bi:3:005:2 -115 16384 < >> f01a7800 2573006073 C Bi:3:005:2 -121 28 = 83120000 0000e502 00004765 >> 6d547769 6e2d5632 2e30302d 47463033 >> f01a7f80 2573006134 C Bi:3:005:2 -104 0 >> f01a7c80 2573006190 C Bi:3:005:2 -104 0 >> f01a7500 2573006251 C Bi:3:005:2 -104 0 >> >> The second command with an error reported by the kernel. But I can't >> find any difference with the previous and successful command, except a >> line is missing at the end. > > That's the difference. In both cases you submitted a bulk-IN transfer > request for 65556 bytes. Internally libusb converts the transfer into > four 16384-byte URBs followed by a fifth 20-byte URB. However the > device replied with only 28 bytes of data. > > In the first case, all five URBs were submitted before the short reply > arrived, so the first URB returned with an error code indicating a > short packet and the others were cancelled. > > In the second case, only four of the URBs were submitted before the > short reply arrived. The first URB returned with an error code > indicating a short packet and the other three were cancelled. Then > when libusb tried to submit the fifth URB, the submission was rejected > because the transfer had already completed. Libusb should have taken > this rejection in stride instead of reporting it as an error. Great explanation. Thanks a lot. Regards, -- Dr. Ludovic Rousseau |