Thread: Raw reads from TI XIO2200(A) fail between 1800 and 2048 with RCODE 17
From: Mike A. <mik...@gm...> - 2011-08-01 20:16:35
Hi there,

I've been using libforensic1394 to carry out raw reads of Windows memory over FireWire from a Linux system. The setup is a Linux box with a built-in FireWire adaptor:

03:00.4 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd Device [1180:e832] (rev 03)

The Windows box has had an ExpressCard inserted to ensure FireWire support. The card shows up as follows under Linux:

05:00.0 PCI bridge [0604]: Texas Instruments XIO2000(A)/XIO2200(A) PCI Express-to-PCI Bridge [104c:8231] (rev 03)
06:00.0 FireWire (IEEE 1394) [0c00]: Texas Instruments XIO2200(A) IEEE-1394a-2000 Controller (PHY/Link) [104c:8235] (rev 01)

When running the code, the request_size is returned as 2048 (as expected). When carrying out reads of various sizes, the results are consistent and valid up to exactly 1800 bytes; reads of 1801 - 1805 bytes sometimes work and sometimes don't, and beyond that they almost always fail. Above 2048, the error returned is a "request_size too big" error, but between 1800 and 2048, the RCODE that I've tracked it back to is 17 (which appears to be RCODE_CANCELLED).

I've tested this on 2.6.35, 2.6.36, 2.6.39 and 3.0, and they all fail in the same way. The author of libforensic1394 therefore suggested that this might be a kernel issue and that I should post my findings here.

I also tried large bursts of reads below 1800 bytes to see whether too many asynchronous reads were causing the problem, but they all appeared to work fine.

I've included the relevant lines from an strace of a small Python test program (also included), and I'd be happy to run any additional tests, or answer any questions that I can, in order to get this issue resolved...

Thanks,

Mike 5:)
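A size sweep of the kind described in this message could look roughly like the sketch below. It is not the test program attached to the original mail. The names `sweep_read_sizes`, `largest_reliable_size` and `make_fake_reader` are hypothetical, and the reader is passed in as a plain callable so the sketch runs without FireWire hardware; on a real system it would wrap whatever read call the bindings provide.

```python
def sweep_read_sizes(read_fn, sizes, attempts=5, address=0x100000):
    """Try each read size several times and record the success rate.

    read_fn(address, size) is assumed to return `size` bytes on success
    and to raise IOError on failure. The address is an arbitrary example.
    """
    results = {}
    for size in sizes:
        ok = 0
        for _ in range(attempts):
            try:
                data = read_fn(address, size)
            except IOError:
                continue  # count as a failed attempt
            if len(data) == size:
                ok += 1
        results[size] = ok / attempts
    return results


def largest_reliable_size(results):
    """Return the largest size below which every tested size always worked."""
    best = None
    for size in sorted(results):
        if results[size] < 1.0:
            break  # first size with any failure ends the reliable range
        best = size
    return best


def make_fake_reader(limit):
    """Hypothetical stand-in for a device: reads above `limit` always fail."""
    def read_fn(address, size):
        if size > limit:
            raise IOError("simulated failed read")
        return bytes(size)
    return read_fn


if __name__ == "__main__":
    rates = sweep_read_sizes(make_fake_reader(1800), range(1790, 1810))
    print("largest reliable read size:", largest_reliable_size(rates))
```

The fake reader fails deterministically, whereas the hardware described here fails intermittently just above 1800 bytes, which is why the sweep records a success rate per size instead of a single pass/fail.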
From: Clemens L. <cl...@la...> - 2011-08-02 16:33:06
Mike Auty wrote:
> I've been using libforensic1394 to carry out raw reads of windows memory
> using firewire from a linux system. The setup is a linux box with a
> built-in firewire adaptor:
>
> 03:00.4 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd Device [1180:e832] (rev 03)
>
> And the windows box has had an expresscard inserted to ensure firewire
> support. The card used shows up as follows under linux:
>
> 05:00.0 PCI bridge [0604]: Texas Instruments XIO2000(A)/XIO2200(A) PCI Express-to-PCI Bridge [104c:8231] (rev 03)
> 06:00.0 FireWire (IEEE 1394) [0c00]: Texas Instruments XIO2200(A) IEEE-1394a-2000 Controller (PHY/Link) [104c:8235] (rev 01)
>
> When running the code, the request_size is returned as 2048 (as
> expected), but when carrying out reads of various sizes, the results are
> consistent and valid up until 1800 bytes (exactly) and then 1801 - 1805
> sometimes work and sometimes don't, and beyond that they almost always
> fail. Above 2048, the error returned is a request_size too big error,
> but between 1800 and 2048, the RCODE that I've tracked it back to is 17
> (which appears to be RCODE_CANCELLED).

This means that there was no response (or the response was not recognized). I guess it's slower because of a timeout (100 ms or 2 s, depending on kernel version)?

> I've tested this on 2.6.35, 2.6.36, 2.6.39 and 3.0, and they all fail
> in the same way. The author of libforensic1394 therefore suggested
> that this might be a kernel issue

There were big changes, and big packets should work fine. This looks more like a hardware issue.

This 1800 bytes limit is suspiciously near the default 1.7 KB AT FIFO DMA threshold of the XIO2200A. Apparently, the XIO2200A cannot read the packet data from memory fast enough, so the packet gets corrupted. In theory, this should result in the packet being retried when it has been completely read into the FIFO, but this does not seem to happen in your case.

When I stress-tested my XIO2200A with a VT6308, this worked correctly at all packet sizes, but this was with explicit AT transmit requests/responses and not physical DMA reads.

I see two possible problem sources:
1) the XIO2200A does not retry physical read response packets; or
2) the R5C832 does not correctly signal corrupted packets back to the sending node.

Ricoh chips are considered cheap crap, so I'd be more suspicious about that one. However, the problem might be avoided with a controller that does not have as aggressive buffering as the XIO2200A, so you might try to replace either one.

Regards,
Clemens
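For reference, the numeric RCODE 17 discussed above can be decoded with a small lookup like the sketch below. The values are the response codes as they appear, to the best of my knowledge, in the Linux kernel's `firewire-constants.h`; codes from 0x10 upward are local to the Linux stack and are not sent on the bus. The helper names `rcode_name` and `is_local_rcode` are hypothetical, and the table should be checked against the headers of the kernel actually in use.

```python
# Response codes as defined in the Linux firewire headers (assumed values;
# verify against firewire-constants.h for your kernel).
RCODES = {
    0x00: "RCODE_COMPLETE",
    0x04: "RCODE_CONFLICT_ERROR",
    0x05: "RCODE_DATA_ERROR",
    0x06: "RCODE_TYPE_ERROR",
    0x07: "RCODE_ADDRESS_ERROR",
    # Codes below are generated locally by the kernel, not by the remote node.
    0x10: "RCODE_SEND_ERROR",
    0x11: "RCODE_CANCELLED",
    0x12: "RCODE_BUSY",
    0x13: "RCODE_GENERATION",
    0x14: "RCODE_NO_ACK",
}


def rcode_name(rcode):
    """Return the symbolic name for a numeric rcode, or a fallback string."""
    return RCODES.get(rcode, "unknown rcode {:#x}".format(rcode))


def is_local_rcode(rcode):
    """True if the code was produced by the local stack, not the remote node."""
    return rcode >= 0x10


if __name__ == "__main__":
    print(rcode_name(17), "local:", is_local_rcode(17))
```

A locally generated code such as RCODE_CANCELLED is consistent with the explanation that no (recognized) response arrived before the transaction timed out.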
From: Mike A. <mik...@gm...> - 2011-08-02 21:09:13
On 02/08/11 17:36, Clemens Ladisch wrote:
> There were big changes, and big packets should work fine. This looks
> more like a hardware issue.
>
> This 1800 bytes limit is suspiciously near the default 1.7 KB AT FIFO
> DMA threshold of the XIO2200A. Apparently, the XIO2200A cannot read the
> packet data from memory fast enough, so the packet gets corrupted. In
> theory, this should result in the packet being retried when it has been
> completely read into the FIFO, but this does not seem to happen in your
> case.
>
> When I did stress testing my XIO2200A with a VT6308, this worked
> correctly at all packet sizes, but this was with explicit AT transmit
> requests/responses and not physical DMA reads.
>
> I see two possible problem sources:
> 1) the XIO2200A does not retry physical read response packets; or
> 2) the R5C832 does not correctly signal corrupted packets back to the
> sending node.
>
> Ricoh chips are considered cheap crap, so I'd be more suspicious about
> that one. However, the problem might be avoided with a controller that
> does not have as aggressive buffering as the XIO2200A, so you might try
> to replace either one.

Ok, thanks very much. I've got two of the XIO2200As, but no cable for them, so that seems like the first step to test. I'll try to report back my findings, but it may take me a couple of weeks due to time constraints.

Thanks again for your quick reply (and for testing out the code to check that it works with similar hardware)!

Mike 5:)