Re: [DIGImend-devel] PT-1001 digitizer
Brought to you by:
spb_nick
|
From: Yann V. <yan...@or...> - 2015-01-28 10:43:06
|
On Mon, 26 Jan 2015 10:33:16 +0200 Nikolai Kondrashov <sp...@gm...> wrote: > Hi Yann, > > Could you please keep the maillist in the CC? Thank you. Certainly. I made a mistake when hitting reply, for which I apologize. It was not my intent to exclude the list. > On 01/26/2015 01:01 AM, Yann Vernier wrote: > > On Sun, 25 Jan 2015 16:41:04 +0200 > > Nikolai Kondrashov <sp...@gm...> wrote: > >> On 01/20/2015 03:46 PM, Yann Vernier wrote: > >>> Possible workarounds include patching in that extra byte or requesting more > >>> (both tested and work). A proper fix would probably include not trying to > >>> parse more descriptor data than the device actually returned. > >> > >> Well, such fix still wouldn't work with devices which would get a more > >> important part of their report descriptor lost. Then, making the parser handle > >> all the possible failures it introduces would make it too complicated. > > > > Yes, attempting to guess the missing data is a bad idea. The current behaviour > > of not even noticing that it is missing isn't really much better, however. > > However, if I just make sure to request a larger size I get the correct amount. > > Hmm, from your description it seems to me that Linux notices it is missing and > actively drops the interface with a bad descriptor, doesn't it? Or did I > understand you wrong? It only drops it after failing to parse uninitialized data. drivers/hid/usbhid/hid-core.c usbhid_parse and hid_post_reset both call hid_get_class_descriptor, which does check the size, but gives up after four tries. Even so, having received some data it returns the incomplete descriptor, i.e. instead of error (negative) it returns rsize-1 (as the last byte is missing). It never attempted a larger read, so it did not work around the bug in the PT-1001. In usbhid_parse (and hid_post_reset), the return value is only checked for negative, not for wrong size, so the error is lost at this point. This subtle bug changes the meaning of the received descriptor from incomplete (truncated) to corrupted (padded, in this case with zero, but kmalloc does not guarantee that). Since the error was lost, Linux proceeds to hid_parse_report. As the last byte was never received, it has an erroneous value, and that's when the parsing fails. Linux attempted to parse a byte that did not come from the device. Had the descriptor been different up to that point, the corruption could have gone unnoticed, producing arbitrary results. Linux only detected a problem because a C0 was required. > >>> I do suggest requesting more than the descriptor length by default, since > >>> this bug may be present in other devices (possibly as a misinterpretation of > >>> USB's use of a full frame to indicate more data is available). My own test > >>> was simply to add 1 to rsize and dev_rsize in kmalloc and > >>> hid_get_class_descriptor calls within drivers/hid/usbhid/hid-core.c, but > >>> that doesn't check how much was actually received. > >> > >> This would be one controversial solution. How much more would we need to > >> request? Would this encourage more sloppy HID descriptors in the future? When > >> should we stop requesting more and more? > > > > The descriptor isn't actually corrupt, only the transfer is incomplete. > > The descriptor is 317 bytes. > > Linux requests 317 bytes, and receives 316 bytes because of the device bug. > > Linux does not check the size received and therefore not only fails in > > parsing the descriptor but reports a different problem (failing to parse a > > byte that was never received in the first place). > > Windows requests 317+64 bytes, and receives 317 bytes. > > If I make Linux request 317+1 bytes, I get 317 bytes, forming the complete > > descriptor. > > > > The main reason I believe we could safely request a larger size from most > > HID devices is that Windows apparently already does. Then again, I would > > not be terribly surprised if a modern console used that behaviour to > > trigger weirdness (we've seen them abuse USB before). An alternative that > > could reasonably work is just attempting rsize+1 if we get an incomplete > > descriptor on the first try. The advantage is that it would only affect > > devices that were displaying bad behaviour in the first place. > > Sorry, I might have misunderstood you. I'll need to read the Linux USB code to > talk about this. > >>> That's not the end of the story, however. Even with proper parsing of the > >>> descriptors it seems the tablet only has one interface that reports a stylus, > >>> and it's not the one sending events. This causes the Xorg driver to deduce it > >>> is a touchpad, not a pen tablet, with all the resulting defects (relative > >>> mode, ignoring events when the nib isn't pressed). It also reports a rather > >>> mystifying set of buttons (from xinput list-props, but do come from kernel, > >>> as confirmed using input-events): > >>> Button Labels (276): "Button Left" (145), "Button Middle" (146), > >>> "Button Right" (147), "Button Wheel Up" (148), "Button Wheel Down" (149), > >>> "Button Horiz Wheel Left" (150), "Button Horiz Wheel Right" (151), > >>> "Button Side" (265), "Button Extra" (266), "Button Forward" (714), > >>> "Button Back" (715), "Button Task" (716), "Button Unknown" (264), > >>> "Button Unknown" (264), "Button Unknown" (264), "Button Unknown" (264) > >>> The second pointer device reported, which does not produce events, has an > >>> equally useless button map: > >>> "Button 0" (630), "Button Unknown" (264), "Button Unknown" (264), > >>> "Button Wheel Up" (148), "Button Wheel Down" (149) > >>> > >>> The actual buttons (nib and two side buttons on the stylus) are those > >>> reported as Forward, Back and Task. Remapping them with xinput set-button-map > >>> and setting absolute mode makes it usable but with the serious issue of not > >>> moving the pointer if the nib isn't pressed (the pressure level works in > >>> inkscape). I did not find a way to tell Xorg's evdev driver that the device > >>> is a tablet. > >> > >> Windows report descriptor parsing works quite differently from Linux and such > >> mapping issues happen often. > > > > I suspect there's actually a parsing bug in Linux here, as the kernel values > > barely have any similarities to the descriptors themselves. The first five on > > interface 1 match (being the boot protocol mouse buttons), as do the axis, > > but none of the other buttons do - interface 2 doesn't even have five > > buttons (it has four). I might try to figure out what the kernel HID parser > > did later. > > I think it is intentional, although presents a Windows-compatibility problem. > Basically, IIRC, if Linux encounters a field with usage which should be mapped > to the same event as a field previously seen in the same report descriptor, it > assigns this field a higher event number. This might be limited to some event > types. The HID_QUIRK_MULTI_INPUT quirk is used to make the driver create > separate event devices for different report IDs to avoid this. Interesting. MULTI_INPUT might be appropriate to separate mouse and tablet mode in the PT-1001, as they produce different reports. The strange part, comparing to the report descriptors, is that the button meanings generally don't match up at all, even on the second interface where all buttons have usages and none collide. The anti-collission support one might expect would allocate different generic buttons rather than six copies of "unknown" sprinkled among a rather random selection. I have attached an initial draft of a pt1001 descriptor patch driver, but it still has some issues: For some reason, hid-generic still takes priority when choosing driver. I have to unbind hid-generic to use this driver. The button usages still don't show through, and worse, the stylus tip and second barrel (reported as eraser) button both show as button 1. Using input-events seems to explain this as them being reported as touch, stylus and button 0; the expected set was stylus, barrel and eraser. With this button setup xinput mapping cannot separate them. multi_input did help in that both tablet and mouse mode work, and the modified descriptor (with digitizer rather than pointer) makes tablet mode function correctly except for the button SNAFU. I will probably pick this little project up again after FOSDEM. |