sc6551.dr problems

Developers
2010-02-25
2013-05-16
  • Gene Heskett
    2010-02-25

    The driver seems to have, sometime in the last 14 years, developed a showstopper when hardware flow control is being used.  So I'll paste a prepared text here.

    sc6551.dr has a problem as built for Level 2, 6309 NitrOS-9, V3.2.9.
    It hangs during file transfers that are fast enough to need flow
    control, in this case hardware flow control over a standard null
    modem cable.

    Xmode par=02, bau=06, xtp=81, xon=00, xof=00

    in this example.  However, it may work for 200+ kilobytes before
    it hangs, and when it's hung, I have to reboot to a disk I made
    that contains the SACIA and T2.dd from a 14 year old boot disk in
    order to reliably recover the port.  The current driver will only
    recover about 1 in 10 times, some of which were even power-down
    reboots.  I find this very odd.  That 14 year old SACIA driver,
    while never hanging, also suffers from the 2nd problem listed
    below the decoding of these dmem dumps.

    Initial shell launched against /t2, no traffic yet

    Address   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0 2 4 6 8 A C E
    -------- ---- ---- ---- ---- ---- ---- ---- ----  ----------------
    00000000 00FF 6808 0000 0016 0000 000A 0C17 0000  ..h.............
    00000010 0000 0000 0000 5340 0000 0000 0010 0040  ......S@.......@
    00000020 2000 0000 0047 0046 0000 B000 0A46 0046   ....G.F..0..F.F
    00000030 0001 0000 0000 080B 0002 0681 0000 0000  ................

    After a zmodem disk image move has hung
    Address   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0 2 4 6 8 A C E
    -------- ---- ---- ---- ---- ---- ---- ---- ----  ----------------
    00000000 00FF 6803 0300 0000 0047 0000 0000 0000  ..h......G......
    00000010 0000 0000 0000 5340 0000 0000 0018 0040  ......S@.......@
    00000020 2000 0001 0047 0046 7000 B000 0A46 0046   ....G.Fp.0..F.F
    00000030 2701 0000 B700 0001 2102 0681 0000 0000  '...7...!.......

    Decoded         offset  initial         hung
    varname         hex     value           value

    Cpy.Stat        $1D     $10             $18
    CpyDCDSR        $1E     $00             $00
    Mask.DCD        $1F     $40             $40
    MaskDSR         $20     $20             $20
    CDSig.PID       $21     $00             $00
    CDSisSig        $22     $00             $00
    FloCtlRx        $23     $00             $01
    FloCtlTx        $24     $00             $00
    RxBufEnd        $25-26  $4700           $4700
    RxBufGet        $27-28  $4600           $4670
    RxBufMax        $29-2A  $00B0           $00B0
    RxBufMin        $2B-2C  $000A           $000A
    RxBufPtr        $2D-2E  $4600           $4600
    RxBufPut        $2F-30  $4600           $4627
    RxBufSiz        $31-32  $0100           $0100
    RxDatLen        $33-34  $0000           $00B7
    SigSent         $35     $00             $00
    SSigPID         $36     $08             $00
    SSigSig         $37     $0B             $01
    WritFlag        $38     $00             $21
    WRK.TYPE        $39     $02             $02
    WRK.BAUD        $3A     $06             $06
    WRK.XTYP        $3B     $81             $81
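
    The decoding above can be reproduced mechanically.  What follows is only a sketch: the offsets, sizes and names are copied from the table as written, the helper names (parse_dmem, decode, changed_fields) are made up for illustration, and it assumes the dump starts at offset 0 of the driver's static storage and that 16-bit values are stored big-endian, as on the 6809/6309.

```python
# Offsets and sizes copied from the decoded table above (names as written there).
FIELDS = {
    "Cpy.Stat": (0x1D, 1), "CpyDCDSR": (0x1E, 1), "Mask.DCD": (0x1F, 1),
    "MaskDSR": (0x20, 1), "CDSig.PID": (0x21, 1), "CDSisSig": (0x22, 1),
    "FloCtlRx": (0x23, 1), "FloCtlTx": (0x24, 1), "RxBufEnd": (0x25, 2),
    "RxBufGet": (0x27, 2), "RxBufMax": (0x29, 2), "RxBufMin": (0x2B, 2),
    "RxBufPtr": (0x2D, 2), "RxBufPut": (0x2F, 2), "RxBufSiz": (0x31, 2),
    "RxDatLen": (0x33, 2), "SigSent": (0x35, 1), "SSigPID": (0x36, 1),
    "SSigSig": (0x37, 1), "WritFlag": (0x38, 1), "WRK.TYPE": (0x39, 1),
    "WRK.BAUD": (0x3A, 1), "WRK.XTYP": (0x3B, 1),
}

# Hex columns of the two dumps above (ASCII column omitted).
INITIAL = """\
00000000 00FF 6808 0000 0016 0000 000A 0C17 0000
00000010 0000 0000 0000 5340 0000 0000 0010 0040
00000020 2000 0000 0047 0046 0000 B000 0A46 0046
00000030 0001 0000 0000 080B 0002 0681 0000 0000
"""
HUNG = """\
00000000 00FF 6803 0300 0000 0047 0000 0000 0000
00000010 0000 0000 0000 5340 0000 0000 0018 0040
00000020 2000 0001 0047 0046 7000 B000 0A46 0046
00000030 2701 0000 B700 0001 2102 0681 0000 0000
"""


def parse_dmem(text):
    """Collect the eight 16-bit hex words of each dump row into bytes.

    Rows whose first column is not a hex address (header, separator)
    are skipped.  Rows are assumed contiguous, starting at address 0.
    """
    data = bytearray()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 9:
            continue
        try:
            int(parts[0], 16)
        except ValueError:
            continue
        for word in parts[1:9]:
            data += bytes.fromhex(word)
    return bytes(data)


def decode(data):
    """Map each field name to its big-endian integer value."""
    return {name: int.from_bytes(data[off:off + size], "big")
            for name, (off, size) in FIELDS.items()}


def changed_fields(before, after):
    """Return {name: (old, new)} for fields that differ between two dumps."""
    return {name: (before[name], after[name])
            for name in FIELDS if before[name] != after[name]}


initial = decode(parse_dmem(INITIAL))
hung = decode(parse_dmem(HUNG))
for name, (old, new) in changed_fields(initial, hung).items():
    width = 2 * FIELDS[name][1]
    print(f"{name:<10} ${old:0{width}X} -> ${new:0{width}X}")
```

    Run against the two dumps, this lists only the fields that moved between the idle port and the hang, which is a quicker way to compare later dumps than decoding them by hand.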

    There are actually 2 problems, but the 2nd one isn't a show stopper.

    This 2nd problem occurs because, when using this method of flow control,
    the code does not check for TxBE=1 before killing the chip's tx, which will
    cut it off in mid byte if one is in the process of being sent.  Ooops. Drain
    bamaged darned chip anyway! (as said in polite company)
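
    The missing check can be sketched as a small simulation.  This is not the driver's code: the real fix would be 6809/6309 assembly inside sc6551.dr.  The bit values are assumptions taken from my reading of the 6551 data sheet (status bit 4, $10, as the "transmit data register empty" flag, matching the $10 seen in Cpy.Stat on the idle port; command bits 3-2 as the transmitter control field, where 00 turns the transmitter off).  FakeAcia, its starting command value and disable_tx_when_idle are hypothetical names used only for this illustration.

```python
TDRE = 0x10          # assumed: status bit 4, transmit data register empty
TX_CTRL_MASK = 0x0C  # assumed: command bits 3-2, 00 = transmitter off


def disable_tx_when_idle(read_status, read_command, write_command,
                         max_polls=1000):
    """Turn the transmitter off only after the chip reports TDRE set.

    Returns True if the transmitter was disabled, False if the flag
    never came up within max_polls status reads (tx left untouched).
    """
    for _ in range(max_polls):
        if read_status() & TDRE:
            write_command(read_command() & ~TX_CTRL_MASK & 0xFF)
            return True
    return False


class FakeAcia:
    """Stand-in for the chip: reports 'busy' for a fixed number of polls."""

    def __init__(self, busy_polls):
        self.busy_polls = busy_polls
        self.command = 0x09      # hypothetical: DTR on, transmitter enabled
        self.status_reads = 0

    def read_status(self):
        self.status_reads += 1
        if self.busy_polls > 0:
            self.busy_polls -= 1
            return 0x00          # byte still pending, TDRE clear
        return TDRE

    def read_command(self):
        return self.command

    def write_command(self, value):
        self.command = value


acia = FakeAcia(busy_polls=3)
done = disable_tx_when_idle(acia.read_status, acia.read_command,
                            acia.write_command)
print(done, hex(acia.command), acia.status_reads)
```

    As far as I know, the flag describes the data register rather than the shift register, so on real hardware a further character time of delay may still be needed before the last byte is completely out; that part is untested here.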

    The delay from waiting for TxBE could allow the rx to receive another byte,
    but with the turnoff point "ixoffpt" leaving a remaining buffer safety of 80
    bytes, that is not a problem, and the margin could be reduced to $20 (32)
    without problems.  The behaviour of a 16550 based pc port, at least on this
    machine, is to allow the 16550's xmit buffer to drain, so I commonly see
    ixoffpt overruns of up to 15 bytes here.

    If we shut the tx off in mid byte, then sz won't get the ACK, and has to
    back up and restart the packet, but rz isn't aware yet, so it has to stop sz
    again and request a 2nd retransmission restart so they get in synch again.
    This happens often enough to drop the average transfer speed shown by
    approximately 120 cps from the coco's theoretical maximum, in mine (6309)
    that is about 730-740 cps.  6809 machines are someplace in the 500's.

    The 2nd problem I can probably fix given more familiarity with this code.
    Thanks, Gene Heskett.

     
  • Gene Heskett
    2010-02-25

    And the )*&%^ html based editor is gobbling up the tabs I used for formatting.  It's not going to be easy, but put a tab or tabs in front of any dollar signs above.

    Similarly the labels should be
    Decoded         offset  initial         hung
    varname         hex     value           value

    etc etc.  Damn I hate html, WYSINWYG!
    Cpy.Stat        $1D     $10             $18
    CpyDCDSR        $1E     $00             $00
    Mask.DCD        $1F     $40             $40
    MaskDSR         $20     $20             $20
    CDSig.PID       $21     $00             $00
    CDSisSig        $22     $00             $00
    FloCtlRx        $23     $00             $01
    FloCtlTx        $24     $00             $00
    RxBufEnd        $25-26  $4700           $4700
    RxBufGet        $27-28  $4600           $4670
    RxBufMax        $29-2A  $00B0           $00B0
    RxBufMin        $2B-2C  $000A           $000A
    RxBufPtr        $2D-2E  $4600           $4600
    RxBufPut        $2F-30  $4600           $4627
    RxBufSiz        $31-32  $0100           $0100
    RxDatLen        $33-34  $0000           $00B7
    SigSent         $35     $00             $00
    SSigPID         $36     $08             $00
    SSigSig         $37     $0B             $01
    WritFlag        $38     $00             $21
    WRK.TYPE        $39     $02             $02
    WRK.BAUD        $3A     $06             $06
    WRK.XTYP        $3B     $81             $81

    There, I tried to stack it correctly using only spaces, let's see the editor gobble these

     
  • Gene Heskett
    2010-02-25

    And it did.  Somebody tell me how to send formatted text thru this son of a bitch, please!

     
  • Gene Heskett
    2010-02-25

    My apologies, the emailed echoes do retain the formatting, grrrrr

     
  • Gene Heskett
    2010-02-26

    I may have a clue about this.
    1. In this code, V.Wake at offset $05 is always $00.  It is supposed to be a copy of the MSByte of >D.Proc, I think.
    This varies; with the current hang, D.Proc is 4700 according to dmem.
    2. The rx buffer is not being drained, but is left at a value a few bytes past the iXoffpt.
    3. A wakeup signal has been sent to an unknown process here, according to SSigSig at $37.
    4. But when it's hung, SSigPID at offset $36 is $00.
    5. Before rz was started, offset $36 was the process number of the shell running against /t2, e.g. $08 if started from the startup script, or, if started by hand from the console, typically $03, or $04 after a restart.

    Shouldn't this SSigPID of the shell be replaced by the PID of rz when it is started by that shell?

    As it's hung right now, 'procs' reports:
    # cat CoCoFile-9

             User                     Mem Stack
    Id  PId Number  Pty Age Sts Signl Siz  Ptr   Primary Module
    --- --- ------  --- --- --- ----- ---  ----- --------------
      2   1     0   128 131 $80    0   31 $5DDE Shell
      3   8     0   128 128 $A0    0   48 $5AAF rz
      4   2     0   128 129 $A0    0   31 $1EF1 Procs
      5   0     0   128 131 $80    0   31 $56DE Shell
      6   0     0   128 132 $80    0   31 $50DE Shell
      7   0     0   128 131 $80    0   31 $4CDE Shell
      8   0     0   128 128 $80    0   31 $49DE Shell

    and 'proc' reports:
    # cat CoCoFile-10

    ID Prnt User Pty  Age  Tsk  Status  Signal   Module    I/O Paths
    ___ ____ ____ ___  ___  ___  _______ __  __  _________ __________________
      1   0    0  255  255   00  sTimOut  0  00  System    <Term >Term >>Term
      2   1    0  128  131   00  s        0  00  Shell     <Term >Term >>Term
      3   8    0  128  128   00  sTimOut  0  00  rz        <t2   >t2   >>t2
      4   2    0  128  129   02  s        0  00  Proc      <Term >p    >>Term
      5   0    0  128  131   00  s        0  00  Shell     <W4   >W4   >>W4
      6   0    0  128  132   00  s        0  00  Shell     <W1   >W1   >>W1
      7   0    0  128  131   00  s        0  00  Shell     <W2   >W2   >>W2
      8   0    0  128  128   00  s        0  00  Shell     <t2   >t2   >>t2

    So rz is still running; why hasn't it drained RxDatLen at offset $33-34 to $0000?  There are still $F7 bytes to be read:
    # cat CoCoFile-11

    Address   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0 2 4 6 8 A C E
    -------- ---- ---- ---- ---- ---- ---- ---- ----  ----------------
    00000000 00FF 6803 0300 0000 0047 0000 0000 0000  ..h......G......
    00000010 0000 0000 0000 5340 0000 0000 0008 0040  ......S@.......@
    00000020 2000 0001 0047 0046 6700 EC00 0A46 0046   ....G.Fg.l..F.F
    00000030 5E01 0000 F700 0001 8902 0681 0000 0000  ^...w...........

    That is the $00F7 in the bottom row, and WritFlag at $38 has grown, I assume because rz is attempting to restart
    the flow, something it cannot do with the transmitter of the 6551 shut down as it is right now.

    It actually got to $00F7 because the setpoint for shutoff was raised to leave only a 20 byte cushion in my last build, and the buffer has 9 bytes left.  The 16550 equivalent on this motherboard allows the tx buffer to drain after CTS drops, potentially another 16 bytes
    after the coco says stop.
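
    The numbers in both hung dumps are at least internally consistent, which can be checked with a few lines of arithmetic.  This is only a sketch: the values are read from the dumps above at the offsets in the decoded table, rx_occupancy is a made-up helper name, and it assumes the usual ring buffer relation that the byte count equals (put - get) modulo the buffer size.

```python
def rx_occupancy(put, get, size):
    """Bytes waiting in a ring buffer, from its put and get pointers."""
    return (put - get) % size


# Values read from the two hung dumps (offsets per the decoded table).
DUMPS = {
    "first hang": dict(put=0x4627, get=0x4670, size=0x0100,
                       datlen=0x00B7, bufmax=0x00B0),
    "second hang": dict(put=0x465E, get=0x4667, size=0x0100,
                        datlen=0x00F7, bufmax=0x00EC),
}

results = {}
for name, d in DUMPS.items():
    used = rx_occupancy(d["put"], d["get"], d["size"])
    results[name] = {
        "used": used,                          # should equal RxDatLen
        "free": d["size"] - d["datlen"],       # room left in the buffer
        "cushion": d["size"] - d["bufmax"],    # margin above the shutoff point
        "overrun": d["datlen"] - d["bufmax"],  # bytes past the shutoff point
    }
    print(name, results[name])
```

    In both cases the pointers agree with RxDatLen, so the buffer bookkeeping itself looks sane; the data is simply sitting there unread, a few bytes past the shutoff point.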

    Am I on the right track with the SSigPID being zeroed question?  It sure doesn't look right to me.

    Thanks.