Thread: receiving an evt_timeout error code with Ricoh R5C832 PCIe controller
From: Jarvis S. <sch...@gm...> - 2012-08-01 04:41:19
00:00.0 Host bridge: Intel Corporation Core Processor DMI (rev 11)
00:03.0 PCI bridge: Intel Corporation Core Processor PCI Express Root Port 1 (rev 11)
00:08.0 System peripheral: Intel Corporation Core Processor System Management Registers (rev 11)
00:08.1 System peripheral: Intel Corporation Core Processor Semaphore and Scratchpad Registers (rev 11)
00:08.2 System peripheral: Intel Corporation Core Processor System Control and Status Registers (rev 11)
00:08.3 System peripheral: Intel Corporation Core Processor Miscellaneous Registers (rev 11)
00:10.0 System peripheral: Intel Corporation Core Processor QPI Link (rev 11)
00:10.1 System peripheral: Intel Corporation Core Processor QPI Routing and Protocol Registers (rev 11)
00:1a.0 USB controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 05)
00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
00:1c.1 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 2 (rev 05)
00:1c.3 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 4 (rev 05)
00:1c.4 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 5 (rev 05)
00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 05)
00:1d.0 USB controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation Mobile 5 Series Chipset LPC Interface Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 05)
02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV730 XT [Mobility Radeon HD 4670]
02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI RV710/730 HDMI Audio [Radeon HD 4000 series]
05:00.0 Network controller: Intel Corporation Ultimate N WiFi Link 5300
09:00.0 SD Host controller: Ricoh Co Ltd MMC/SD Host Controller (rev 01)
09:00.1 System peripheral: Ricoh Co Ltd Memory Stick Host Controller (rev 01)
09:00.2 System peripheral: Ricoh Co Ltd Device e852 (rev 01)
09:00.3 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 PCIe IEEE 1394 Controller (rev 01)
0b:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5784M Gigabit Ethernet PCIe (rev 10)
ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-Core Registers (rev 04)
ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 04)
ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 04)
ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 04)
ff:03.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller (rev 04)
ff:03.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Target Address Decoder (rev 04)
ff:03.4 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Test Registers (rev 04)
ff:04.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Control Registers (rev 04)
ff:04.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Address Registers (rev 04)
ff:04.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Rank Registers (rev 04)
ff:04.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 0 Thermal Control Registers (rev 04)
ff:05.0 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Control Registers (rev 04)
ff:05.1 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Address Registers (rev 04)
ff:05.2 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Rank Registers (rev 04)
ff:05.3 Host bridge: Intel Corporation Core Processor Integrated Memory Controller Channel 1 Thermal Control Registers (rev 04)
From: Stefan R. <st...@s5...> - 2012-08-02 06:54:29
On Jul 31 Jarvis Schultz wrote:
> I'm trying to use the Phantom Omni haptic device, and the apps keep
> freezing. The driver developers insist that the drivers work stably on all
> of the linux distros that they have tested them on.

Phantom Omni's developers possibly don't have a Ricoh PCIe controller, or do they?

> Yet, both firewire
> computers I have are producing the exact same results. I noticed that in
> the change log for kernel version 3.3 there was mention of some bug fixes
> for the Ricoh PCIe 1394 controllers. I'm wondering if it is possible that
> I am experiencing some of these downstream fixed bugs?

The Ricoh PCIe controller related change in kernel 3.3 simply consists of using classic interrupts instead of MSI (http://en.wikipedia.org/wiki/Message_Signaled_Interrupts).
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=320cfa6ce0b3dc794fedfa4bae54c0f65077234d

This change was also put into kernel 3.2.6, released on 13 February 2012.
http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commitdiff;h=c1a1e15fd6fe7ed496d115ac9b87649e4d827d65

Your kernel "3.2.0-27" is possibly based on 3.2.6 or later. What does "grep firewire /proc/interrupts" show?

The "disable MSI" quirk can also be switched on by means of a module parameter to firewire-ohci if you have a kernel without the built-in quirk table entry. As root user:

    # modprobe -r firewire-ohci
    # modprobe firewire-ohci quirks=17

Vice versa, if the quirk is built in, it can be switched off by a different value to the module parameter:

    # modprobe -r firewire-ohci
    # modprobe firewire-ohci quirks=1

(A note for other readers and for the mailinglist archive: This module parameter is not a stable API and does some fairly low-level changes. For this and other reasons this module parameter is not normally meant to be used by end users, and particularly not meant for permanent use.)
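The two quirks values above differ in a single bit. As a minimal sketch, assuming the flag values that the firewire-ohci module parameter description listed around that kernel version (nonatomic cycle timer = 1, reset packet generation = 2, AR/selfID endianness = 4, no 1394a enhancements = 8, disable MSI = 16), the values can be decoded as below. The flag table is an assumption to be checked with "modinfo firewire-ohci" on the kernel in question, and decode_quirks is a name made up for this illustration.

```python
# Assumed flag values; verify with `modinfo firewire-ohci` on your kernel,
# since this module parameter is not a stable API.
QUIRK_FLAGS = {
    1: "nonatomic cycle timer",
    2: "reset packet generation",
    4: "AR/selfID endianness",
    8: "no 1394a enhancements",
    16: "disable MSI",
}


def decode_quirks(value):
    """Return the names of the quirk bits set in `value`, lowest bit first."""
    names = [name for bit, name in sorted(QUIRK_FLAGS.items()) if value & bit]
    # Report any bits that the assumed table above does not know about.
    unknown = value & ~sum(QUIRK_FLAGS)
    if unknown:
        names.append(f"unknown bits 0x{unknown:x}")
    return names


for value in (17, 1):
    print(value, decode_quirks(value))
```

Under that assumption, quirks=17 keeps the cycle timer workaround and adds "disable MSI", while quirks=1 keeps only the cycle timer workaround, which is why it switches a built-in "disable MSI" entry off.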
> Occasionally when connecting the device I get a message that says
>
> firewire_ohci: isochronous cycle inconsistent

This is quite common when the bus topology changes. I wonder if we should normally suppress this message. If the drivers are not yet active at the very moment when you connect the device, you can disregard this message. Or are they?

> Every application that I run works for a few seconds, and then just
> freezes. The message printed in dmesg every time an app freezes is
>
> firewire_ohci: DMA context IT0 has stopped, error code: evt_timeout

This indicates that the drivers use isochronous transmission DMA (among other things), i.e. a stream of packets from the PC to the device at a steady rate of 8000 packets per second. This is probably the carrier for the force feedback functionality. The message is logged when an interrupt signaled to the CPU that one or more DMA contexts stopped due to an unrecoverable error.

If the drivers do use isochronous DMA, then they are quite likely to be affected by controller hardware bugs to a similar extent as the FFADO audio drivers. http://subversion.ffado.org/wiki/HostControllers gives you an impression of how many controller types are actually very unreliable as soon as I/O gets a little bit different from the usual SBP-2 storage or DV capture applications.

> System Details:
> Kernel version - 3.2.0-27

What do you get from "grep PREEMPT /boot/config-$(uname -r)" and from "cat /proc/interrupts"?

> Kernel drivers, adapter, card and OHCI chipset -
> 09:00.3 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 PCIe IEEE 1394 Controller (rev 01) (prog-if 10 [OHCI])
> Kernel driver in use: firewire_ohci
> Kernel modules: firewire-ohci

Could you send the chip IDs as "lspci -nn | grep 1394" shows them?

> Phantom Omni PDD drivers version = Linux JUJU PDD pre-beta
>
> libraw1394 version - 2.0.5

Are the Phantom Omni drivers kernel drivers or userspace drivers? If the latter, are they using libraw1394? You could for example check with "ldd /path/to/the/driver/binary.xy", which lists all libraries to which 'binary.xy' is dynamically linked. Libraries which are statically linked into a binary ( = incorporated into the binary) won't show up this way though.

The term "JUJU" implies that the drivers are not using libraw1394. But if they do, you should try to update to libraw1394 2.0.7 or later because it has got changes which affect isochronous streaming. Current and stable is 2.1.0. Of course an update of the system-wide installed libraw1394 would only have effect if the Phantom Omni drivers are dynamically linked to it.

http://www.sensable.com/support-download-pdd.htm looks like the Phantom Omni drivers are not free, not open source, and not even made available to anybody but customers of theirs. In which case it would be hard to find out what kinds of I/O these drivers need and which userspace/kernelspace code paths they go through, obviously.
-- 
Stefan Richter
-=====-===-- =--- ---=-
http://arcgraph.de/sr/
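The ldd check suggested above can also be scripted. The following is only a sketch: the function names are invented for this example, it assumes ldd is installed and prints the usual "name => path (address)" lines, and like ldd itself it cannot see statically linked libraries.

```python
import subprocess


def linked_libraries(ldd_output):
    """Parse `ldd` output into {library name: resolved path or None}."""
    libs = {}
    for line in ldd_output.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, rest = line.partition(" => ")
        if sep:
            # "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)" or "=> not found"
            path = rest.split(" (")[0].strip()
            libs[name] = None if path in ("", "not found") else path
        else:
            # Entries such as the vDSO or the dynamic loader carry no "=>".
            libs[line.split(" (")[0]] = None
    return libs


def uses_libraw1394(binary_path):
    """Run ldd on a binary and report whether libraw1394 is dynamically linked."""
    out = subprocess.run(["ldd", binary_path], capture_output=True,
                         text=True, check=True).stdout
    return any(name.startswith("libraw1394") for name in linked_libraries(out))
```

A call such as uses_libraw1394("/path/to/the/driver/binary.xy") then answers the question for one binary; for a driver shipped as several shared libraries it would have to be run on each of them.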
From: Jarvis S. <sch...@gm...> - 2012-08-02 19:06:39
        CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6    CPU7
  0:      46       0       0  307778       0       0       0       0  IO-APIC-edge     timer
  1:       1       0       0      10       0       0       0       0  IO-APIC-edge     i8042
  8:       0       0       0       1       0       0       0       0  IO-APIC-edge     rtc0
  9:   28992       0       0    1079       0       0       0       0  IO-APIC-fasteoi  acpi
 10:    2245       0       0     341   10162       0       0       0  IO-APIC-edge     ite-cir
 12:       1       0       0     172       0       0       0       0  IO-APIC-edge     i8042
 16:       0      23       0       0       0       0      94       0  IO-APIC-fasteoi  ehci_hcd:usb1, mmc0
 19:    3049       0       0       0       0       0       0      21  IO-APIC-fasteoi
 23:   29203       0       0       0       0       0       0     408  IO-APIC-fasteoi  ehci_hcd:usb2
 40:  314187       0       0       0       0       0       0       0  HPET_MSI-edge    hpet2
 41:       0  290033       0       0       0       0       0       0  HPET_MSI-edge    hpet3
 42:       0       0  292723       0       0       0       0       0  HPET_MSI-edge    hpet4
 43:       0       0       0  276075       0       0       0       0  HPET_MSI-edge    hpet5
 44:       0       0       0       0  156037       0       0       0  HPET_MSI-edge    hpet6
 45:  108515       0       0       0       0       0       0       0  PCI-MSI-edge     ahci
 46:  257229       0       0       0       0       0       0       0  PCI-MSI-edge     iwlwifi
 47:     209     276       0       0       0       0       0       0  PCI-MSI-edge     snd_hda_intel
 48:       0       0     319       0       0       0       0       0  PCI-MSI-edge     snd_hda_intel
 49:       0       0       3       0       0       0       0       0  PCI-MSI-edge     eth0
 50:       0       0       0       5       0       0       0       0  PCI-MSI-edge     fglrx[0]@PCI:2:0:0
 51:       0       0       0    4213       0       0       0       0  PCI-MSI-edge     firewire_ohci
NMI:     152     103      94     108      55      55      58      58  Non-maskable interrupts
LOC:     262     263     237     211     185  175542  185641  190251  Local timer interrupts
SPU:       0       0       0       0       0       0       0       0  Spurious interrupts
PMI:     152     103      94     108      55      55      58      58  Performance monitoring interrupts
IWI:       0       0       0       0       0       0       0       0  IRQ work interrupts
RES:  347731  398976  341506  360340  126306  136364  136631  111185  Rescheduling interrupts
CAL:     848    1252    1283    1300    1236    1270     914    1288  Function call interrupts
TLB:   21666   17918   21093   14404   14252    9846   11215   11708  TLB shootdowns
TRM:       0       0       0       0       0       0       0       0  Thermal event interrupts
THR:       0       0       0       0       0       0       0       0  Threshold APIC interrupts
MCE:       0       0       0       0       0       0       0       0  Machine check exceptions
MCP:      20      20      20      20      20      20      20      20  Machine check polls
ERR:       0
MIS:       0
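Whether firewire_ohci runs on MSI or on a classic IO-APIC interrupt can be read from the interrupt type column of this file. Here is a small sketch of that check; irq_lines_for is a hypothetical helper name, and the parsing assumes the 3.2-era layout shown in this thread, where the interrupt type is a single field right after the per-CPU counters (newer kernels may print this column differently).

```python
def irq_lines_for(driver, interrupts_text):
    """Return (irq, total count, interrupt type) for each line naming `driver`."""
    lines = interrupts_text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    found = []
    for line in lines[1:]:
        fields = line.split()
        # Need: label, one counter per CPU, type, and at least one handler name.
        if len(fields) < ncpus + 3 or not fields[0].endswith(":"):
            continue
        handlers = " ".join(fields[ncpus + 2:])
        if driver in handlers:
            counts = [int(c) for c in fields[1:ncpus + 1]]
            found.append((fields[0].rstrip(":"), sum(counts), fields[ncpus + 1]))
    return found
```

It can be fed with the contents of /proc/interrupts, for example irq_lines_for("firewire_ohci", open("/proc/interrupts").read()). A type containing "MSI" means the controller is using MSI; "IO-APIC-fasteoi" means classic interrupts.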
From: Stefan R. <st...@s5...> - 2012-08-02 22:12:47
On Aug 02 Jarvis Schultz wrote:
> On Thu, Aug 2, 2012 at 1:54 AM, Stefan Richter <st...@s5...> wrote:
> > Your kernel "3.2.0-27" is possibly based on 3.2.6 or later. What does
> > "grep firewire /proc/interrupts" show?
>
> Here is the output of that:
> 19: 3043 0 0 0 0 0 0 21 IO-APIC-fasteoi firewire_ohci

I.e. your kernel has got the "don't use MSI on Ricoh OHCI-1394" flag already in its built-in table.

[...]
> > Vice versa, if the quirk is built in, it can be switched off
> > by a different value to the module parameter:
[...]
> I tried both of these "quirk" levels, and there is no change to the
> behavior.

Interesting; this means we could narrow the built-in quirk down to the chip version which was originally reported as broken with MSI, and let chip versions like yours use MSI again. But this doesn't help you one bit, alas.

[...]
> >> System Details:
> >> Kernel version - 3.2.0-27
[...]
> grep command:
> # CONFIG_PREEMPT_RCU is not set
> CONFIG_PREEMPT_NOTIFIERS=y
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set

Have a look whether your distribution also offers a kernel package which is called "lowlatency" or "desktop" or something along these lines --- and in effect has got CONFIG_PREEMPT=y and "# CONFIG_PREEMPT_VOLUNTARY is not set". The CONFIG_PREEMPT option is found under "Processor type and features -> Preemption Model -> Preemptible Kernel (Low-Latency Desktop)" in the kernel configuration menu when kernel sources are configured as preparation of a kernel build from source.

CONFIG_PREEMPT means that kernel threads are put to sleep by the CPU scheduler when they have used up their time slice and another process is waiting to get CPU time, very much the same way as the CPU scheduler treats userspace processes when they have used up their time slice.
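The grep on the kernel config can be turned into a quick check of which preemption model a given kernel was built with. This is a sketch only: preemption_model is an invented name, and it looks at just the three mutually exclusive options mentioned in this thread.

```python
def preemption_model(config_text):
    """Return the enabled preemption model option from kernel config text.

    Only CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY and CONFIG_PREEMPT
    are considered; returns None if none of them is set to "y".
    """
    models = ("CONFIG_PREEMPT_NONE", "CONFIG_PREEMPT_VOLUNTARY", "CONFIG_PREEMPT")
    for line in config_text.splitlines():
        # "# CONFIG_FOO is not set" lines contain no "=" and are skipped.
        key, sep, value = line.strip().partition("=")
        if sep and key in models and value == "y":
            return key
    return None
```

The input would typically be the contents of /boot/config-$(uname -r), assuming the distribution installs the config there. For the output quoted above the result is CONFIG_PREEMPT_VOLUNTARY; a "lowlatency" style kernel should yield CONFIG_PREEMPT.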
I am suggesting to try such a kernel because there is a slight chance that reduced scheduling latency helps the Phantom Omni drivers to queue FireWire packets more steadily, which in turn might reduce the likelihood that Ricoh controller stops its DMA unrecoverably.

> I attached my /proc/interrupts file as interrupts_log.txt

Looks inconspicuous to me. Just one thing: Have you tried the mainline "radeon" driver instead of the closed source fglrx driver already? 3rd party kernel drivers tend to be latency sources due to lack of independent review and tie-in into mainline development.

[...]
> > Could you send the chip IDs as "lspci -nn | grep 1394" shows them?
>
> Here it is:
> 09:00.3 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C832 PCIe IEEE 1394
> Controller [1180:e832] (rev 01)

Thanks for this. Same PCI ID as in the report which led to kernel commit 320cfa6ce0b3, but rev 01 instead of rev 00.

> >> Phantom Omni PDD drivers version = Linux JUJU PDD pre-beta
> >>
> >> libraw1394 version - 2.0.5
> >
> > Are the Phantom Omni drivers kernel drivers or userspace drivers? If the
> > latter, are they using libraw1394? You could for example check with
> > "ldd /path/to/the/driver/binary.xy" which lists all libraries to which
> > 'binary.xy' is dynamically linked to. Libraries which are statically
> > linked into a binary ( = incorporated into the binary) won't show up this
> > way though.
>
> The drivers are userspace drivers, and they are distributed as just a
> sequence of shared libraries. The old drivers that they released depended
> on the raw1394 kernel module. These new "pre-beta JUJU" drivers do not. I
> originally tried to get this working with their old drivers, and followed
> the instructions at this site:
> https://wiki.sofa-framework.org/tdev/wiki/HowTo/SensableWithoutRaw1394 to
> create a "dummy" kernel module.

Argh. :-) Nitpick: "ln /dev/null /dev/raw1394" probably would have worked just as well as "ln /dev/fw0 /dev/raw1394". /dev/fw* does not support any of the ioctl/write/read operations that are valid on /dev/raw1394. Anyway, the former necessity of this dummy module and dummy file is a fine showcase of the joys and wonders of non-free software.

> I am not positive I was getting the same
> error messages, but I was getting the same "freezing" behavior.
>
> > The term "JUJU" implies that the drivers are not using libraw1394. But if
> > they do, you should try to update to libraw1394 2.0.7 or later because it
> > has got changes which affect isochronous streaming. Current and stable is
> > 2.1.0. Of course an update of the system-wide installed libraw1394 would
> > only have effect if the Phantom Omni drivers are dynamically linked to it.
>
> I am a little confused by their terminology. Their binaries are absolutely
> linked to libraw1394 libraries. I believe they are using the term JUJU
> because they no longer use the kernel module raw1394.

They never directly used the kernel module, they used libraw1394. (Which is good because the developers of raw1394 intended the raw1394 kernel interface to be private to the raw1394<-->libraw1394 pair. This in turn facilitated the transition to the binary incompatible <linux/firewire-cdev.h> kernel interface provided by firewire-core while retaining compatibility with libraw1394 based application software. libraw1394 v2 transparently uses raw1394's or firewire-core's mutually incompatible interfaces depending on what libraw1394 detects to be available at runtime.)

So this "JUJU" designation either refers to the accomplishment of removing that silly check for presence of raw1394, or it means that they added a new <linux/firewire-cdev.h> backend but are still keeping a libraw1394 based backend as an alternative or for parts of their functionality.

Or it means that they noticed that libraw1394 does not behave 100% the same when running on top of firewire-core versus raw1394, and added some workarounds in their own code (instead of submitting fixes to upstream libraw1394 or kernel, or at least reporting such issues).

I guess one could find out more about it by running it all in a debugger or by inserting some simple debug logging into libraw1394. But I don't think at this point that it would get us nearer to a fix for the Ricoh problem (or what I believe to be a Ricoh problem).

> Not really sure if
> this is correct. I just downloaded and compiled libraw1394 2.1.0, and made
> sure that the drivers were finding the new versions that I installed (using
> ldd and ldconfig). I still get the same error message.

OK. Either the libraw1394 2.0.7 change which I alluded to simply does not have a positive effect here ---- which is not surprising since that change was implemented to address a different problem than yours ---, or the drivers bypass libraw1394 when running on top of firewire-core, or both.

> > http://www.sensable.com/support-download-pdd.htm looks like the Phantom
> > Omni drivers are not free, not open source, and not even made available
> > to anybody but customers of theirs. In which case it would be hard to
> > find out what kinds of I/O these drivers need and which userspace/
> > kernelspace code paths they go through, obviously.
>
> You are absolutely correct. I have to provide them with my device serial
> number, just to access the driver download, and even then, I only get
> binaries and libraries, not source code. I guess it's not quite
> "OpenHaptics"...

Well, at least they do provide something for Linux users among their customers, although in a not particularly pragmatic form.
-- 
Stefan Richter
-=====-===-- =--- ---==
http://arcgraph.de/sr/
From: Jarvis S. <sch...@gm...> - 2012-08-03 01:49:28
On Thu, Aug 2, 2012 at 5:12 PM, Stefan Richter <st...@s5...> wrote:
> [...]
> I am suggesting to try such a kernel because there is a slight chance that
> reduced scheduling latency helps the Phantom Omni drivers to queue
> FireWire packets more steadily, which in turn might reduce the likelihood
> that Ricoh controller stops its DMA unrecoverably.

I just installed the most up-to-date lowlatency kernel that Ubuntu 12.04 has, which was built around 3.2.0-23. To address the next comment, I booted into a console-only mode and made sure the fglrx drivers weren't loaded. I saw the same behavior as before. Also, I tried both kernels with the open source radeon drivers in a full desktop session, and repeatedly received the same errors.

> > I attached my /proc/interrupts file as interrupts_log.txt
>
> Looks inconspicuous to me. Just one thing: Have you tried the mainline
> "radeon" driver instead of the closed source fglrx driver already? 3rd
> party kernel drivers tend to be latency sources due to lack of independent
> review and tie-in into mainline development.
>
> [...]
> Well, at least they do provide something for Linux users among their
> customers, although in a not particularly pragmatic form.

That is absolutely right, the Linux support is definitely appreciated!

As a final update, I just did some testing with our lab's server machine. The machine has the same version for the kernel, the drivers, the shared libraries the drivers depend on, and the application I have been using to test (i.e. quite similar software builds). The firewire controller on that machine is

06:0c.0 FireWire (IEEE 1394) [0c00]: Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx] [104c:8023]

Every test application that sensable/OpenHaptics provides worked without any errors.
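Since the thread compares controllers by PCI ID and revision, here is a small sketch that pulls those fields out of "lspci -nn" lines. The helper name pci_id is made up for this example; the two input lines are the ones quoted in this thread. As far as I know, lspci leaves out the "(rev xx)" suffix when the revision is 00, but treat that as an assumption.

```python
import re

# Matches "[vvvv:dddd]" optionally followed by " (rev xx)" in `lspci -nn` output.
# The class code such as "[0c00]" has no colon and therefore does not match.
PCI_ID = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\](?: \(rev ([0-9a-f]{2})\))?")


def pci_id(lspci_nn_line):
    """Return (vendor, device, revision) from one `lspci -nn` line, or None.

    revision is None when the line carries no "(rev xx)" suffix.
    """
    match = PCI_ID.search(lspci_nn_line)
    return match.groups() if match else None


ricoh = pci_id("09:00.3 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C832 PCIe "
               "IEEE 1394 Controller [1180:e832] (rev 01)")
ti = pci_id("06:0c.0 FireWire (IEEE 1394) [0c00]: Texas Instruments TSB43AB22A "
            "IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx] [104c:8023]")
print(ricoh)
print(ti)
```

This separates the failing Ricoh controller (vendor 1180, device e832, rev 01) from the working Texas Instruments one (vendor 104c, device 8023) in a form that is easy to compare against other reports.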
From: Jarvis S. <sch...@gm...> - 2012-08-06 16:15:56
|
Just FYI, I gave the Phantom Omni forum a quick overview of the discussion that we have been having, and I just received a response. Here is a quote from one of their support team members: > Most of the firewire cards have some firmware logic in them to arbitrate the 1394 isochronous mode. The omni also operates in the isochronous mode. > If the firmware in the 1394 chip has some problem, then you are going to see it in someway of the other. Because, the isochronous mode is > completely initiated and managed by the 1394 controller. He then provided a link to some recommended cards that seem to work. I just ordered the cheapest recommended one, and I will report back with the results and the card info. On Thu, Aug 2, 2012 at 8:49 PM, Jarvis Schultz <sch...@gm...> wrote: > > On Thu, Aug 2, 2012 at 5:12 PM, Stefan Richter > <st...@s5...> wrote: > > > > On Aug 02 Jarvis Schultz wrote: > > > On Thu, Aug 2, 2012 at 1:54 AM, Stefan Richter > > > <st...@s5...> > > > wrote: > > > > Your kernel "3.2.0-27" is possibly based on 3.2.6 or later. What does > > > > "grep firewire /proc/interrupts" show? > > > > > > > Here is the output of that: > > > 19: 3043 0 0 0 0 0 > > > 0 21 IO-APIC-fasteoi firewire_ohci > > > > I.e. your kernel has got the "don't use MSI on Ricoh OHCI-1394" flag > > already in its built in table. > > > > [...] > > > > Vice versa, if the quirk is built in, it can be switched off > > > > by a different value to the module parameter: > > [...] > > > I tried both of these "quirk" levels, and there is no change to the > > > behavior. > > > > Interesting; this means we could narrow the built-in quirk down to the > > chip version which was originally reported as broken with MSI, and let > > chip versions like yours use MSI again. But this doesn't help you one > > bit, alas. > > > > [...] > > > >> System Details: > > > >> Kernel version - 3.2.0-27 > > [...] 
> > > grep command: > > > # CONFIG_PREEMPT_RCU is not set > > > CONFIG_PREEMPT_NOTIFIERS=y > > > # CONFIG_PREEMPT_NONE is not set > > > CONFIG_PREEMPT_VOLUNTARY=y > > > # CONFIG_PREEMPT is not set > > > > Have a look whether your distribution also offers a kernel package which > > is called "lowlatency" or "desktop" or something along these lines --- and > > in effect has got CONFIG_PREEMPT=y and "# CONFIG_PREEMPT_VOLUNTARY is not > > set". The CONFIG_PREEMPT option is found under "Processor type and > > features -> Preemption Model -> Preemptible Kernel (Low-Latency Desktop)" > > in the kernel configuration menu when kernel sources are configured as > > preparation of a kernel build from source. > > > > CONFIG_PREEMPT means that kernel threads are put to sleep by the CPU > > scheduler when it used up its time slice and another process is waiting to > > get CPU time, very much the same way as the CPU scheduler treats userspace > > processes when they used up their time slice. > > > > I am suggesting to try such a kernel because there is a slight chance that > > reduced scheduling latency helps the Phantom Omni drivers to queue > > FireWire packets more steadily, which in turn might reduce the likelihood > > that Ricoh controller stops its DMA unrecoverably. > > > I just ran installed the most up-to-date lowlatency kernel that Ubuntu > 12.04 has which was built around 3.2.0-23. To address the next > comment, I booted into a console-only-mode, and made sure the fglrx > drivers weren't loaded. I saw the same behavior as before. Also, I > tried both kernels with the open source radeon drivers in a full > desktop session, and repeatedly received the same errors. > > > I attached my /proc/interrupts file as interrupts_log.txt > > > > Looks inconspicuous to me. Just one thing: Have you tried the mainline > > "radeon" driver instead of the closed source fglrx driver already? 
3rd > > party kernel drivers tend to be latency sources due to lack of independent > > review and tie-in into mainline development. > > > > [...] > > > > Could you send the chip IDs as "lspci -nn | grep 1394" shows them? > > > > > > > Here it is: > > > 09:00.3 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C832 PCIe IEEE 1394 > > > Controller [1180:e832] (rev 01) > > > > Thanks for this. Same PCI ID as in the report which led to kernel commit > > 320cfa6ce0b3, but rev 01 instead of rev 00. > > > > > >> Phantom Omni PDD drivers version = Linux JUJU PDD pre-beta > > > >> > > > >> libraw1394 version - 2.0.5 > > > > > > > > Are the Phantom Omni drivers kernel drivers or userspace drivers? If > > > > the > > > > latter, are they using libraw1394? You could for example check with > > > > "ldd /path/to/the/driver/binary.xy" which lists all libraries to which > > > > 'binary.xy' is dynamically linked to. Libraries which are statically > > > > linked into a binary ( = incorporated into the binary) won't show up > > > > this > > > > way though. > > > > > > > The drivers are userspace drivers, and they are distributed as just a > > > sequence of shared libraries. The old drivers that they released > > > depended > > > on the raw1394 kernel module. These new "pre-beta JUJU" drivers do not. > > > I > > > originally tried to get this working with their old drivers, and > > > followed > > > the instructions at this site: > > > https://wiki.sofa-framework.org/tdev/wiki/HowTo/SensableWithoutRaw1394 > > > to > > > create a "dummy" kernel module. > > > > Argh. :-) Nitpick: "ln /dev/null /dev/raw1394" probably would have worked > > just as well as "ln /dev/fw0 /dev/raw1394". /dev/fw* does not support > > any of the ioctl/write/read operations that are valid on /dev/raw1394. > > Anyway, the former necessity of this dummy module and dummy file is a fine > > showcase of the joys and wonders of non-free software. 
> >
> > > I am not positive I was getting the same
> > > error messages, but I was getting the same "freezing" behavior.
> > >
> > > > The term "JUJU" implies that the drivers are not using libraw1394. But
> > > > if they do, you should try to update to libraw1394 2.0.7 or later
> > > > because it has got changes which affect isochronous streaming. Current
> > > > and stable is 2.1.0. Of course an update of the system-wide installed
> > > > libraw1394 would only have effect if the Phantom Omni drivers are
> > > > dynamically linked to it.
> > > >
> > > I am a little confused by their terminology. Their binaries are
> > > absolutely linked to libraw1394 libraries. I believe they are using the
> > > term JUJU because they no longer use the kernel module raw1394.
> >
> > They never directly used the kernel module, they used libraw1394.
> > (Which is good because the developers of raw1394 intended the raw1394
> > kernel interface to be private to the raw1394<-->libraw1394 pair. This
> > in turn facilitated the transition to the binary incompatible
> > <linux/firewire-cdev.h> kernel interface provided by firewire-core
> > while retaining compatibility with libraw1394 based application
> > software. libraw1394 v2 transparently uses raw1394's or firewire-core's
> > mutually incompatible interfaces depending on what libraw1394 detects to
> > be available at runtime.)
> >
> > So this "JUJU" designation either refers to the accomplishment of removing
> > that silly check for presence of raw1394, or it means that they added a
> > new <linux/firewire-cdev.h> backend but are still keeping a libraw1394
> > based backend as an alternative or for parts of their functionality.
> >
> > Or it means that they noticed that libraw1394 does not behave 100% the
> > same when running on top of firewire-core versus raw1394, and added some
> > workarounds in their own code (instead of submitting fixes to upstream
> > libraw1394 or kernel, or at least reporting such issues).
> >
> > I guess one could find out more about it by running it all in a debugger
> > or by inserting some simple debug logging into libraw1394. But I don't
> > think at this point that it would get us nearer to a fix for the Ricoh
> > problem (or what I believe to be a Ricoh problem).
> >
> > > Not really sure if
> > > this is correct. I just downloaded and compiled libraw1394 2.1.0, and
> > > made sure that the drivers were finding the new versions that I
> > > installed (using ldd and ldconfig). I still get the same error message.
> >
> > OK. Either the libraw1394 2.0.7 change which I alluded to simply does not
> > have a positive effect here --- which is not surprising since that change
> > was implemented to address a different problem than yours ---, or the
> > drivers bypass libraw1394 when running on top of firewire-core, or both.
> >
> > > > http://www.sensable.com/support-download-pdd.htm looks like the
> > > > Phantom Omni drivers are not free, not open source, and not even made
> > > > available to anybody but customers of theirs. In which case it would
> > > > be hard to find out what kinds of I/O these drivers need and which
> > > > userspace/kernelspace code paths they go through, obviously.
> > >
> > > You are absolutely correct. I have to provide them with my device serial
> > > number, just to access the driver download, and even then, I only get
> > > binaries and libraries, not source code. I guess it's not quite
> > > "OpenHaptics"...
> >
> > Well, at least they do provide something for Linux users among their
> > customers, although in a not particularly pragmatic form.
> That is absolutely right, the Linux support is definitely appreciated!
>
> As a final update, I just did some testing with our lab's server
> machine. The machine has the same version for the kernel, the
> drivers, the shared libraries the drivers depend on, and the
> application I have been using to test (i.e. quite similar software
> builds). The firewire controller on that machine is
>
> 06:0c.0 FireWire (IEEE 1394) [0c00]: Texas Instruments TSB43AB22A
> IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx] [104c:8023]
>
> Every test application that Sensable/OpenHaptics provides worked
> without any errors.
>
> > --
> > Stefan Richter
> > -=====-===-- =--- ---==
> > http://arcgraph.de/sr/
|
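The comparison above comes down to the bracketed vendor:device pair and the revision that "lspci -nn" prints. For anyone collecting such reports from several machines, here is a minimal Python sketch that pulls those fields out of one line. The function name parse_lspci_line and the regular expression are assumptions made for illustration, not part of any FireWire tool; the two sample lines are the ones quoted in this thread.

```python
import re

# Trailing "[vendor:device]" pair, optionally followed by "(rev NN)",
# as printed at the end of an `lspci -nn` line.
PCI_ID_RE = re.compile(
    r"\[([0-9a-f]{4}):([0-9a-f]{4})\](?:\s+\(rev ([0-9a-f]{2})\))?\s*$"
)


def parse_lspci_line(line):
    """Return (slot, vendor, device, rev) for one `lspci -nn` line, or None.

    rev is None when the line carries no "(rev NN)" suffix.
    """
    match = PCI_ID_RE.search(line.strip())
    if match is None:
        return None
    slot = line.split()[0]
    vendor, device, rev = match.groups()
    return slot, vendor, device, rev


# The two controllers reported in this thread.
ricoh = ("09:00.3 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C832 PCIe "
         "IEEE 1394 Controller [1180:e832] (rev 01)")
ti = ("06:0c.0 FireWire (IEEE 1394) [0c00]: Texas Instruments TSB43AB22A "
      "IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx] [104c:8023]")

for sample in (ricoh, ti):
    print(parse_lspci_line(sample))
```

The class code in the first bracket ([0c00]) and the chip nickname ([iOHCI-Lynx]) are skipped because neither has the four-hex-digit, colon, four-hex-digit shape.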
From: Jarvis S. <sch...@gm...> - 2012-08-24 23:49:42
|
I purchased this card from Amazon:
http://www.amazon.com/gp/product/B003GAM68U/ref=oh_details_o00_s00_i00
It features the VT6315 chipset. After plugging my device in through this
card, the problem disappeared. Not the ideal solution, but it definitely
solved my problem. Thanks for all the help!

On Mon, Aug 6, 2012 at 11:15 AM, Jarvis Schultz <sch...@gm...> wrote:
> Just FYI, I gave the Phantom Omni forum a quick overview of the
> discussion that we have been having, and I just received a response.
> Here is a quote from one of their support team members:
>
> > Most of the firewire cards have some firmware logic in them to arbitrate
> > the 1394 isochronous mode. The omni also operates in the isochronous mode.
> > If the firmware in the 1394 chip has some problem, then you are going to
> > see it in some way or the other. Because, the isochronous mode is
> > completely initiated and managed by the 1394 controller.
>
> He then provided a link to some recommended cards that seem to work.
> I just ordered the cheapest recommended one, and I will report back
> with the results and the card info.
>
> On Thu, Aug 2, 2012 at 8:49 PM, Jarvis Schultz <sch...@gm...> wrote:
> >
> > On Thu, Aug 2, 2012 at 5:12 PM, Stefan Richter <st...@s5...> wrote:
> > >
> > > On Aug 02 Jarvis Schultz wrote:
> > > > On Thu, Aug 2, 2012 at 1:54 AM, Stefan Richter <st...@s5...> wrote:
> > > > > Your kernel "3.2.0-27" is possibly based on 3.2.6 or later. What does
> > > > > "grep firewire /proc/interrupts" show?
> > > > >
> > > > Here is the output of that:
> > > > 19: 3043 0 0 0 0 0 0 21 IO-APIC-fasteoi firewire_ohci
> > >
> > > I.e. your kernel has got the "don't use MSI on Ricoh OHCI-1394" flag
> > > already in its built in table.
> > >
> > > [...]
> > > > > Vice versa, if the quirk is built in, it can be switched off
> > > > > by a different value to the module parameter:
> > > [...]
> > > > I tried both of these "quirk" levels, and there is no change to the
> > > > behavior.
> > >
> > > Interesting; this means we could narrow the built-in quirk down to the
> > > chip version which was originally reported as broken with MSI, and let
> > > chip versions like yours use MSI again. But this doesn't help you one
> > > bit, alas.
> > >
> > > [...]
> > > > >> System Details:
> > > > >> Kernel version - 3.2.0-27
> > > [...]
|
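Stefan's reading of the "grep firewire /proc/interrupts" output can be sketched in a few lines of Python: the token after the per-CPU counters names the interrupt chip, and seeing IO-APIC there rather than a name containing MSI is what showed that the no-MSI quirk was already active. The function name firewire_irq_info is made up for illustration, and the "MSI" substring test is an assumed heuristic, since the exact chip label differs between kernel versions.

```python
def firewire_irq_info(interrupts_text, driver="firewire_ohci"):
    """Summarize /proc/interrupts lines that mention `driver`.

    Returns a list of (irq, total_count, chip, uses_msi) tuples.
    """
    results = []
    for line in interrupts_text.splitlines():
        if driver not in line:
            continue
        fields = line.split()
        irq = fields[0].rstrip(":")
        # Per-CPU counters are the run of purely numeric fields after the IRQ.
        idx = 1
        total = 0
        while idx < len(fields) and fields[idx].isdigit():
            total += int(fields[idx])
            idx += 1
        # The next field names the interrupt chip, e.g. "IO-APIC-fasteoi".
        chip = fields[idx] if idx < len(fields) else ""
        # Heuristic (assumption): MSI lines carry "MSI" in the chip name.
        results.append((irq, total, chip, "MSI" in chip))
    return results


# The line quoted in this thread (eight per-CPU counters).
sample = " 19:  3043  0  0  0  0  0  0  21  IO-APIC-fasteoi  firewire_ohci"
for irq, total, chip, uses_msi in firewire_irq_info(sample):
    print(irq, total, chip, "MSI" if uses_msi else "no MSI")
```

On a live system the text would come from reading /proc/interrupts, for example firewire_irq_info(open("/proc/interrupts").read()).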