From: DG <dg...@um...> - 2000-10-12 17:14:46
|
From: pa...@rc... (David Paschal) > Hi, Daniel. Thanks for getting me the debug output. Unfortunately there > was a gap right at the place where it became interesting (where it changes > from event9/get_nibble/event11/event6 to mlcpp_intr). If you happened to > save other debug logs (or if you try this again), could you check to see if > any of the other logs show this transition better? If you see a line such > as "get_nibble(c60b_intr c60b1160", that is a sign of a gap. I also collected debug output setting LINEBUF on. This seemed to fix those problems, but also seemed to lose some mlc output. I'll send that tonight. It looks very similar. > > It occurred to me that after we set event 7 or 12 (nAutoFd=0) in get_nibble, > perhaps the peripheral doesn't respond with event 9 (nAck=0) until after we > drop out of the poll loop contained within the PAR_WAIT_SET_CLEAR macro. > I don't know how long it is until mlcpp_intr gets called again to retry the > poll loop, but perhaps the peripheral times out in the meantime and backs > out of event 9, such that we time out for event 9. To test this theory, > please try increasing the value of SHORT_WAIT, which is currently 500 > iterations of the loop in PAR_WAIT_SET_CLEAR. I'm not sure what would be a > better value, but you could experiment with several, such as 5000 or 10000. > If it's too large you may start to notice that your machine gets sluggish > due to spending a lot of time in kernel mode. This was the first thing I tried, although I didn't go to 5000 or 10000. I think I just tried 1500 (I multiplied all the timeout constants by 3). I'll try again with higher values and see what I come up with. > Another interesting test would be to log the return code of parStatusRead() > right before setting event 7/12, and again after "SET_STATE(l,9,BUSY_TIME);" > but before "PAR_WAIT_SET_CLEAR(l,0,PAR_STATUS_NACK);". Any status line > changes after event 7/12 might further suggest a peripheral-side timeout. I'll try it. -- Daniel ZZZ-dgun-ZZZ-@-ZZZ-umpire.com-ZZZ (Remove the Z-'s to reply) ______________________________________________ FREE Personalized Email at Mail.com Sign up at http://www.mail.com/?sr=signup |
From: DG <dg...@um...> - 2000-10-17 05:37:48
|
From: pa...@rc... (David Paschal) > Hi, Daniel. Thanks for getting me the debug output. Unfortunately there > was a gap right at the place where it became interesting (where it changes > from event9/get_nibble/event11/event6 to mlcpp_intr). If you happened to > save other debug logs (or if you try this again), could you check to see if > any of the other logs show this transition better? If you see a line such > as "get_nibble(c60b_intr c60b1160", that is a sign of a gap. Unfortunately, all of the log files I've made have this problem, and setting LINEBUF didn't seem to help. > It occurred to me that after we set event 7 or 12 (nAutoFd=0) in get_nibble, > perhaps the peripheral doesn't respond with event 9 (nAck=0) until after we > drop out of the poll loop contained within the PAR_WAIT_SET_CLEAR macro. > I don't know how long it is until mlcpp_intr gets called again to retry the > poll loop, but perhaps the peripheral times out in the meantime and backs > out of event 9, such that we time out for event 9. To test this theory, > please try increasing the value of SHORT_WAIT, which is currently 500 > iterations of the loop in PAR_WAIT_SET_CLEAR. I'm not sure what would be a > better value, but you could experiment with several, such as 5000 or 10000. > If it's too large you may start to notice that your machine gets sluggish > due to spending a lot of time in kernel mode. Very sluggish, but still hangs. And the debug output still gets cut off occasionally. I'm not sure how to fix that. > Another interesting test would be to log the return code of parStatusRead() > right before setting event 7/12, and again after "SET_STATE(l,9,BUSY_TIME);" > but before "PAR_WAIT_SET_CLEAR(l,0,PAR_STATUS_NACK);". Any status line > changes after event 7/12 might further suggest a peripheral-side timeout. Event 7/12 is a parControlSetClear() which never calls parStatusRead(). Do you mean parControlRead() or do you mean check it during event 6 (which comes just before event 7, I presume)? It looks like parStatusRead() gets called in PAR_WAIT_SET_CLEAR. Does the 1284.4 specification outline all these events? It would be very helpful to have a flowchart or state diagram showing how all these events are supposed to occur. -- Daniel ZZZ-dgun-ZZZ-@-ZZZ-umpire.com-ZZZ (Remove the Z-'s to reply) ______________________________________________ FREE Personalized Email at Mail.com Sign up at http://www.mail.com/?sr=signup |
From: <pa...@rc...> - 2000-10-17 06:34:55
|
> Unfortunately, all of the log files I've made have this problem, and setting > LINEBUF didn't seem to help. > Very sluggish, but still hangs. And the debug output still gets cut off > occasionally. I'm not sure how to fix that. I'm not either. It's a matter of syslogd racing to keep up with the log messages spewing out of the kernel. I added the LINEBUF facility as a quick-and-dirty workaround to help me debug a specific problem in preparation for releasing version 0.5. When enabled, it gave me a rough trace of the sequence certain lines in the code got executed, and the linebufDump() function would print out the results at a convenient time. It might be useful here with some changes, perhaps to "log" status line values instead of line numbers in the source code (or perhaps some bit-shifted combination thereof. > > Another interesting test would be to log the return code of > parStatusRead() > > right before setting event 7/12, and again after > "SET_STATE(l,9,BUSY_TIME);" > > but before "PAR_WAIT_SET_CLEAR(l,0,PAR_STATUS_NACK);". Any status line > > changes after event 7/12 might further suggest a peripheral-side timeout. > > Event 7/12 is a parControlSetClear() which never calls parStatusRead(). Do > you mean parControlRead() or do you mean check it during event 6 (which > comes just before event 7, I presume)? It looks like parStatusRead() gets > called in PAR_WAIT_SET_CLEAR. I meant to add an additional parStatusRead() (and printk the return value) right before the parControlSetClear() for event 7/12, and another one after SET_STATE but before PAR_WAIT_SET_CLEAR, just to see what the status lines are showing. > Does the 1284.4 specification outline all these events? It would be very > helpful to have a flowchart or state diagram showing how all these events > are supposed to occur. IEEE 1284-1994 (or 1284 for short) is the "Standard Signaling Method for a Bidirectional Parallel Peripheral Interface for Personal Computers", which describes these events in great detail. 1284.4 and MLC are higher-level transport protocols which can theoretically operate over any bidirectional link (such as parallel, USB, etc.). David |
From: DG <dg...@um...> - 2000-10-18 04:18:12
|
From: pa...@rc... (David Paschal) >I meant to add an additional parStatusRead() (and printk the return value) >right before the parControlSetClear() for event 7/12, and another one after >SET_STATE but before PAR_WAIT_SET_CLEAR, just to see what the status lines >are showing. OK. Here's some debug output right before the timeout. If you want the entire debug output, let me know. This one didn't even get past the initial negotiation: Oct 17 21:14:02 localhost kernel: get_nibble(c60b11c0): nibble=0x5. Oct 17 21:14:02 localhost kernel: c60b11c0: event=11. Oct 17 21:14:02 localhost kernel: c60b11c0: event=6. Oct 17 21:14:02 localhost kernel: status 1: 6f Oct 17 21:14:02 localhost kernel: c60b11c0: event=9. Oct 17 21:14:02 localhost kernel: status 2: 2f Oct 17 21:14:02 localhost kernel: get_nibble(c60b11c0): nibble=0x5. Oct 17 21:14:02 localhost kernel: c60b11c0: event=11. Oct 17 21:14:02 localhost kernel: c60b11c0: event=6. Oct 17 21:14:02 localhost kernel: status 1: d7 Oct 17 21:14:02 localhost kernel: c60b11c0: event=9. Oct 17 21:14:02 localhost kernel: status 2: 1f Oct 17 21:14:02 localhost kernel: get_nibble(c60b11c0): nibble=0x3. Oct 17 21:14:02 localhost kernel: c60b11c0: event=11. Oct 17 21:14:02 localhost kernel: c60b11c0: event=6. Oct 17 21:14:02 localhost kernel: status 1: 5f Oct 17 21:14:02 localhost kernel: c60b11c0: event=9. Oct 17 21:14:02 localhost kernel: status 2: 2f Oct 17 21:14:02 localhost kernel: get_nibble(c60b11c0): ni4> state=5 event=9 reset=7 recv=1 rcv=0 send=2 Oct 17 21:14:02 localhost kernel: status 2: 6f Oct 17 21:14:02 localhost kernel: mlcpp0: mlcpp_intr c60b11c0 Oct 17 21:14:02 localhost kernel: state=5 event=9 reset=7 recv=1 rcv=0 send=2 Oct 17 21:14:02 localhost kernel: status 2: 6f Oct 17 21:14:02 localhost kernel: mlcpp0: mlcpp_intr c60b11c0 Oct 17 21:14:02 localhost kernel: state=5 event=9 reset=7 recv=1 rcv=0 send=2 ...this gets repeated many times until the timeout below... Oct 17 21:14:07 localhost kernel: status 2: 6f Oct 17 21:14:07 localhost kernel: PAR_WAIT_SET_CLEAR(l=c60b11c0,set=0x0000,clear=0x0040): timed out waiting for event=9! -- Daniel ZZZ-dgun-ZZZ-@-ZZZ-umpire.com-ZZZ (Remove the Z-'s to reply) ______________________________________________ FREE Personalized Email at Mail.com Sign up at http://www.mail.com/?sr=signup |
From: <pa...@rc...> - 2000-10-18 08:07:43
Attachments:
ieee12844pp-patch-dg.txt
|
dgun said: > OK. Here's some debug output right before the timeout. If you want > the entire debug output, let me know. This one didn't even get past > the initial negotiation: ... > Oct 17 21:14:02 localhost kernel: status 1: 5f > Oct 17 21:14:02 localhost kernel: c60b11c0: event=9. > Oct 17 21:14:02 localhost kernel: status 2: 2f > Oct 17 21:14:02 localhost kernel: get_nibble(c60b11c0): ni4> state=5 > event=9 reset=7 recv=1 rcv=0 send=2 > Oct 17 21:14:02 localhost kernel: status 2: 6f > Oct 17 21:14:02 localhost kernel: mlcpp0: mlcpp_intr c60b11c0 > Oct 17 21:14:02 localhost kernel: state=5 event=9 reset=7 recv=1 rcv=0 > send=2 Hi, Daniel. Unfortunately, it still got corrupted here, which might explain why it appeared to not get past the initial negotiation. Please try the attached patch against a clean copy of ieee12844pp.c (version 0.6). I yanked out all but the most interesting log messages in hopes of reducing the gaps. However, I can't seem to actually reproduce the problem (on an R80, which is of similar vintage to the PSC500), where I'm running "ptal-connect mlc:mlcpp0 -scan" and typing <esc>E and <esc>*f0S. Perhaps there is a slight timing difference between my computer and yours, and/or between the R80 and PSC500, which is causing this problem for you but not for me. I'll try to experiment with a PSC500 at work sometime soon. David |
From: PASCHAL,DAVID (HP-Roseville,ex1) <dav...@hp...> - 2000-10-19 01:21:11
|
I wrote: > Perhaps > there is a slight timing difference between my computer and > yours, and/or > between the R80 and PSC500, which is causing this problem for > you but not > for me. I'll try to experiment with a PSC500 at work sometime soon. I just tried the PSC500 at work, and unfortunately I can't reproduce the problem with that either! :-( David |
From: DG <dg...@um...> - 2000-10-19 14:24:56
|
From: "PASCHAL,DAVID (HP-Roseville,ex1)" <dav...@hp...> >I wrote: >> Perhaps >> there is a slight timing difference between my computer and >> yours, and/or >> between the R80 and PSC500, which is causing this problem for >> you but not >> for me. I'll try to experiment with a PSC500 at work sometime soon. > >I just tried the PSC500 at work, and unfortunately I can't reproduce the >problem with that either! :-( I'll have to try it on my 486 and see what happens. -- Daniel ZZZ-dgun-ZZZ-@-ZZZ-umpire.com-ZZZ (Remove the Z-'s to reply) ______________________________________________ FREE Personalized Email at Mail.com Sign up at http://www.mail.com/?sr=signup |