I am getting errors when using the NI_USB_HS device that I do not get when using the Agilent 82357b USB device & driver.
Attached is a simple program to query and read a GPIB device for the ID string 100 times.
This program runs without failure every time with the Agilent 82357b device. With the NI USB-HS device, however, it frequently fails (roughly one run in 30), and the log shows kernel errors:
Feb 06 12:14:37 dirac kernel: usb 1-1.1: USB disconnect, device number 30
Feb 06 12:14:39 dirac kernel: usb 1-1.1: new high-speed USB device number 31 using xhci_hcd
Feb 06 12:14:39 dirac kernel: usb 1-1.1: New USB device found, idVendor=3923, idProduct=709b, bcdDevice= 1.01
Feb 06 12:14:39 dirac kernel: usb 1-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Feb 06 12:14:39 dirac kernel: usb 1-1.1: Product: GPIB-USB-HS
Feb 06 12:14:39 dirac kernel: usb 1-1.1: Manufacturer: National Instruments
Feb 06 12:14:39 dirac kernel: usb 1-1.1: SerialNumber: 0159084A
Feb 06 12:14:39 dirac kernel: ni_usb_gpib: probe succeeded for path: usb-0000:01:00.0-1.1
Feb 06 12:14:39 dirac kernel: gpib1: exiting autospoll thread
Feb 06 12:14:39 dirac kernel: ni_usb_gpib: attach
Feb 06 12:14:39 dirac kernel: usb 1-1.1: bus 1 dev num 31 attached to gpib minor 1, NI usb interface 0
Feb 06 12:14:39 dirac kernel: product id=0x709b
Feb 06 12:14:39 dirac kernel: ni_usb_hs_wait_for_ready: board serial number is 0x159084a
Feb 06 12:14:51 dirac kernel: /var/lib/dkms/linux-gpib/4.3.5-9.20221109svn2046.fc37/build/drivers/gpib/ni_usb/ni_usb_gpib.c: ni_usb_parse_register_read_block: parse error: wrong start id
Feb 06 12:14:51 dirac kernel: /var/lib/dkms/linux-gpib/4.3.5-9.20221109svn2046.fc37/build/drivers/gpib/ni_usb/ni_usb_gpib.c: ni_usb_parse_register_read_block: parse error: wrong end id
Feb 06 12:14:51 dirac kernel: /var/lib/dkms/linux-gpib/4.3.5-9.20221109svn2046.fc37/build/drivers/gpib/ni_usb/ni_usb_gpib.c: ni_usb_parse_register_read_block: parse error: wrong count=0 for NIUSB_REGISTER_READ_DATA_END
Feb 06 12:14:51 dirac kernel: /var/lib/dkms/linux-gpib/4.3.5-9.20221109svn2046.fc37/build/drivers/gpib/ni_usb/ni_usb_gpib.c: ni_usb_parse_register_read_block: unexpected data: raw_data[6]=0xff, expected 0
Feb 06 12:14:51 dirac kernel: /var/lib/dkms/linux-gpib/4.3.5-9.20221109svn2046.fc37/build/drivers/gpib/ni_usb/ni_usb_gpib.c: ni_usb_parse_register_read_block: unexpected data: raw_data[7]=0xff, expected 0
Feb 06 12:14:51 dirac kernel: ni_usb_dump_raw_block:
Feb 06 12:14:51 dirac kernel:
Feb 06 12:14:51 dirac kernel: 1
Feb 06 12:14:51 dirac kernel: 0
Feb 06 12:14:51 dirac kernel: 78
Feb 06 12:14:51 dirac kernel: 0
Feb 06 12:14:51 dirac kernel: 0
Feb 06 12:14:51 dirac kernel: 0
Feb 06 12:14:51 dirac kernel: ff
Feb 06 12:14:51 dirac kernel: ff
Feb 06 12:14:51 dirac kernel:
I presume it might be a hardware fault (I have only one NI device to test) but it may also be a driver bug. This is using the SVN 2046 pull (very recent).
Although I believed this to be a genuine NI device (and it appeared to be so, both internally and externally), the NI Windows utility indicates that it is not.
This might explain the problems.
I've got another NI device and this time the NI Windows driver does not cast doubt on its authenticity. I'm certain that this is genuine; however, this device also exhibits this problem.
It does appear that there is some problem in the driver for the GPIB-USB-HS.
Hi,
Please send the output of "lsusb -d 3923:", with the adapter plugged in.
cheers,
-Dave
Here you go...
OK this looks like a "new" version of the adaptor AFAICT.
In the lsusb you sent the four bulk endpoints have a maxPacketSize of 512 which indicates a USB high speed device.
Previous adaptors show a maxPacketSize of 64 for the bulk endpoints which is what the driver uses.
Furthermore, the previous adaptors have a bInterval of 0 for the bulk endpoints (which is normal) and a bInterval of 2 for the interrupt endpoint. This "new" device has a bInterval of 2 for the bulk endpoints (which is weird and in any case ignored) and a bInterval of 4 for the interrupt endpoint.
Is there any way of knowing whether this is a genuine new version of the NI_USB_HS adaptor?
In both cases the lsusb output shows bcdDevice = 1.01, so one can't tell. Judging from the NI website, there does not seem to have ever been a "new" version.
cheers,
-Dave
I have very high confidence that this is genuine.
As I said, the NI Windows driver does not complain or identify this as suspect. The guy I bought this from says that he purchased it directly from NI.
I looked again at the test program I attached above and noticed that it still had the mitigation I had used to get around bug #81. With that bug now fixed I removed those delays and the problem got much worse.
So the addition of delays just before the async read and async writes helps mitigate the problem.
It's possible that this problem is related to USB comms between the system and the controller, and that the delays just give the USB device enough time to process the command.
I modified the test program to have a switch -D to set the delay (default no delay).
e.g:
$ ./GPIBtest -c 0 -d 16 --EOI -D 50
The problem does not appear if the delay is set to 50 ms (for each read and write), but with the delay set to zero it fails after about 50 write IDN / read commands. (This is on the Raspberry Pi; it is somewhat less frequent on a faster laptop I tried.)
Last edit: Michael Katzmann 2023-04-15
OK... yet more testing.
I replaced the asynchronous calls to `ibrda` and `ibwrta` with the synchronous calls `ibrd` and `ibwrt`, and it works without problems (I tried 10000 `IDN?` writes/reads).
If either `ibrda` or `ibwrta` is used, the driver will fail after some period.
This problem is only with the NI USB driver. The same high level C code (using async calls) works flawlessly with the Keysight 82357B.
The NI USB driver seems to have an issue with asynchronous operation (or with the way I'm using it).
(Attached is updated code with a `-y` switch to use synchronous calls; the default is async.)
Michael
Last edit: Michael Katzmann 2023-04-16
Thanks, this does look like a driver problem. Please rebuild the kernel part with GPIB_DEBUG=1, install and run the async test with no delay and send the corresponding syslog output.
-dave
Here are three runs. With the debugging enabled, it failed on the first call (without any delay).
I did one run with a 1ms delay to get some successful calls before a failure.
Run without delay ... (5)
Run with 1ms delay (25 successful calls before failure)
Last edit: Michael Katzmann 2023-04-16
Thanks, it looks like we have two problems here:
First that the initialization sequence is not one known to the driver. This does not prevent it from working. If the adaptor is indeed genuine then we can just add the new sequence response to the driver:
The second problem concerns the failures when using the async calls. These manifest when we get errors on the ni_usb_parse_register_read_block for example:
Their occurrence is not always fatal but they do precede the timeout error. It would be interesting to ascertain whether they also occur when using only synchronous calls.
We no longer need the GPIB_DEBUG=1 make option.
cheers,
-dave
I also have an NI-USB-HS adapter that works on Windows with the official
NI driver but not on Linux with this driver. The problem is not fixed
in revision 2050.
This is the log.
lsusb
This looks like the same clone, as evidenced by "unexpected data: buffer[6]=0x16", etc. in the log. Please send the output of 'lsusb -v -d 3923:' to check. Otherwise this board does seem to work OK with rev 2050. What is not working for you?
This is the output of lsusb.
When I run `lsusb -v -d 3923:` as root, it produces the above output, hangs for about 15 seconds, and gives this error. The exit status is 0.
Running it without the `-v` flag doesn't cause the problem.
I can't even run ibtest.
This looks like a third variant in the clones we have seen so far, based on the lsusb/dmesg reports. The first clone reported in this thread does not work with linux-gpib. This is because it only supports device mode operation, where addressing and data are sent together in one USB request. linux-gpib uses board mode, where addressing and data are sent separately. Since this is a fundamental part of the linux-gpib architecture, that clone cannot be supported.
The second variant, reported at the top of this bug report, does seem to work despite the errored response to the ready request. Your board, the third variant, has the same errors in the ready-request response but a different lsusb output. It has, however, the same lsusb output as the first variant.
I presume that your gpib.conf has master=yes set for the board.
Please send the syslog output corresponding to the failed ibtest session.
This is my `gpib.conf`.
This is the output of `journalctl` when I ran `ibtest`. I ran it at 20:40:43 and 20:40:49.
The board attach failed due to a timeout on a control message. To progress we would need the USB I/O trace of the board initialization sequence on Windows and Linux, captured with Wireshark or some other USB monitoring application. From the journalctl output it looks like this is a different clone compared to the others we have seen so far.
Here is the trace.
It looks like the setup of the interrupt URB causes the board to go catatonic.
We can try the following to check this hypothesis:
in ni_usb_gpib.c change line 2298 from
to
~~~
retval = 0;
~~~
rebuild, install, plug in the adapter and send me the resulting dmesg.
OK that was not it. Please send the wireshark trace for the plug-in sequence with the modified ni_usb_gpib.c
The wireshark trace for the plug-in sequence.
Last edit: Carey McKee 2023-04-29
We have reached an impasse: your board does not recognise wValue=0x300 in the vendor control sent to set up the interrupt monitor. The Windows driver only ever sends wValue=0x200.
The setup_interrupt_monitor control is used very frequently in the linux-gpib driver so no work-around can be considered.
Thank you for your prompt and pertinent replies.
Sorry I cannot be of any further help.
Last edit: DaveP 2023-04-30
Can you document the problem? For your information, my adapter has a
part number of 187965G-01L and a serial number of 1D6A085, and the code
39 bar code on it shows the serial number. It was purchased from Ebay
for about 100 dollars including shipping. It came with a manual in
several languages and a CD. I can't tell it is a clone.
From the exterior it is not known how to detect a fake. Clones resemble the newer genuine parts.
There is a youtube video that can be helpful to distinguish a clone by the board layout.
Search for "Teardown fake USB-GPIB-HS from china". There do seem to be genuine boards without shielding on the inside of the casing. It is the layout of the chips that is the determining factor.
To detect a fake by software:
In the output of 'sudo lsusb -v -d 3923:'
look for the value of bInterval corresponding to EP 1. If it is zero you have a fake that will not work.
You can also register your serial number with NI; if the corresponding product is not an NI_USB_HS, you have a clone, but it may still work.
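As a rough illustration of the lsusb check described above, here is a minimal C sketch that scans the text of 'lsusb -v -d 3923:' for the endpoint labelled EP 1 and returns its bInterval. The function name `ep1_binterval` is made up for this example and is not part of linux-gpib. It assumes the usual lsusb layout, in which each endpoint's bEndpointAddress line comes before its bInterval line, and it only reports the first EP 1 entry it finds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of linux-gpib).
 * Scans `lsusb -v` text line by line and returns the bInterval of the
 * first endpoint descriptor whose bEndpointAddress line is labelled
 * "EP 1", or -1 if no such entry is found.
 */
int ep1_binterval(const char *text)
{
    const char *p = text;
    int in_ep1 = 0; /* set while inside an EP 1 endpoint descriptor */

    assert(text != NULL);

    while (*p) {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);
        char line[256];
        size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

        /* Copy the current line (truncated if very long) into a buffer. */
        memcpy(line, p, n);
        line[n] = '\0';

        if (strstr(line, "bEndpointAddress")) {
            /* lsusb prints e.g. "0x81  EP 1 IN"; "EP 1 " does not match EP 10+. */
            in_ep1 = strstr(line, "EP 1 ") != NULL;
        } else if (in_ep1) {
            const char *q = strstr(line, "bInterval");
            if (q)
                return (int)strtol(q + strlen("bInterval"), NULL, 10);
        }
        p = eol ? eol + 1 : p + len;
    }
    return -1;
}
```

If this returns 0 for the captured lsusb text, then by the rule above the adapter would be a fake that will not work; -1 only means that no EP 1 entry was found in the text.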
I am sorry if what I said was not clear. Can you describe the problem with clones in the supported hardware section of the manual for this piece of software?