Drivewire driver v-port bug

Developers
2014-05-26
2014-06-30
  • Bill Pierce

    Bill Pierce - 2014-05-26

    I am posting this here as the "Bug Report" section is not giving me an option to post for some reason.

    I have found that there is a bug of sorts somewhere between the dw v-port driver and SCF. I discovered this while trying to write the dw4 parser for my MShell project. The routine I was working on would eventually allow one to read a directory from the host DW4 server's hard drive and parse the directory into MShell's database, allowing the user to then select and copy files from the host PC to the Coco OS9 file system. In trying to do this, I get errors reading the last packet sent by drivewire. After setting up error traps to see exactly what was going on, I found this:

    It seems that the dw4 server sends an EOT after the last packet is sent to OS9's buffer. OS9 receives the buffer contents, then, seeing the EOT, responds by closing the port and clearing the buffer. It does so BEFORE the software has a chance to read the buffer to get its contents.

    It seems to happen only if there's a pause on the caller's end to process the preceding data, so that the "read" call for the next buffer is not made immediately. OS9 hits an interrupt, sees the buffer empty from the last read, gets the next (last) packet and then, seeing the EOT, checks whether there is a read in progress (the program is in a process loop with the previous packet). If not, it assumes the program is done, closes the buffer and clears it. The program then comes back from its loop to get the last packet, finds the port closed and the buffer empty, and returns an error.

    The ultimate loop would read the data continuously and therefore always show a read state, but when large amounts of data are being returned, this is not always possible. File transfers and FTP downloads are a good example of this. When this data is directly redirected to a disk file, OS9's interrupts are blocked so this doesn't happen. But if the program is storing the data in a buffer or, in my case, virtual memory, then the caller needs time to process each packet as the Coco's memory constraints will not allow for large files to be transferred all at once.

    I discussed this with Aaron and confirmed that it is most likely a bug in communication between "scdwv" and "SCF". I do not know the code well enough to find where the problem is, but I do know there needs to be some way for the driver to know the buffer has not been read and to not close the port just because it has received the EOT. I guess one method would be just to pass the EOT to the caller and let the caller shut down the port when it's done or if the program terminates. The other could possibly be a flag that records whether the buffer has been read, with the port not closing until the flag is set. I just know I'm getting "port closed" errors on the last packet, and data transfers to virtual memory are impossible at the moment.
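
    As a rough illustration of the second idea (do not discard buffered data just because the EOT has arrived), here is a minimal Python sketch. It is a toy model only: `DeferredClosePort`, `deliver`, `deliver_eot` and `PortClosedError` are hypothetical names made up for this example and do not correspond to anything in scdwv or SCF.

```python
from collections import deque


class PortClosedError(Exception):
    """Raised when reading a port that is closed and fully drained."""


class DeferredClosePort:
    """Toy model of a v-port that keeps buffered data readable after EOT.

    Hypothetical illustration only: this is not the scdwv or SCF code.
    """

    def __init__(self):
        self._packets = deque()   # packets delivered by the server, not yet read
        self._eot_seen = False    # the server has announced the close

    def deliver(self, packet: bytes) -> None:
        """Driver side: queue one packet from the server."""
        if self._eot_seen:
            raise PortClosedError("server already closed this port")
        self._packets.append(packet)

    def deliver_eot(self) -> None:
        """Driver side: record the close but leave the buffer alone."""
        self._eot_seen = True

    def read(self) -> bytes:
        """Caller side: return the next packet, or fail once nothing is left."""
        if self._packets:
            return self._packets.popleft()
        if self._eot_seen:
            raise PortClosedError("port closed and buffer drained")
        return b""  # open but idle: nothing to read yet


port = DeferredClosePort()
port.deliver(b"packet 1")
first = port.read()          # caller reads, then goes off to process
port.deliver(b"packet 2")    # last packet arrives while the caller is busy
port.deliver_eot()           # EOT arrives before the caller reads again
last = port.read()           # the last packet is still readable
```

    In this model the "port closed" error only shows up on the read after the last packet has been handed over, which is the behaviour being asked for here.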

    Any ideas?

     
  • Aaron Wolfe

    Aaron Wolfe - 2014-05-27

    To be clear, the problem happens when the server notifies the driver that a port is closed (such as when a remote TCP host has disconnected, or when a command has completed).

    At this point, writing to the port should generate an error. However, since the os9 driver can buffer 200+ bytes of input, it is possible for there to be data still unread even though the connection is long gone.

    This doesn't occur in programs that consume the buffer within one cycle of the software interrupt that triggers a SERPOLL (usually a few hundred ms) because the announcement of a closed port is always at least one serpoll response after the last poll that adds data to the buffer. I believe this is the only reason we haven't had trouble since the beginning. The design doesn't handle the scenario of data remaining in the buffer at all.
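
    The timing argument can be made concrete with a small simulation. This is only a sketch under simplifying assumptions: each poll response either adds one packet to the buffer or announces the close, and the reader empties the whole buffer once every `polls_per_read` polls. The function name and both parameters are invented for this example and are not part of DriveWire or OS9.

```python
def unread_at_close(poll_events, polls_per_read):
    """Return how many packets sit unread when the close is announced.

    poll_events: sequence of "data" or "closed", one per poll response.
    polls_per_read: the reader drains the buffer once every this many polls.
    """
    buffered = 0
    for tick, event in enumerate(poll_events, start=1):
        if event == "closed":
            return buffered
        buffered += 1  # a "data" poll adds one packet to the buffer
        if tick % polls_per_read == 0:
            buffered = 0  # the reader catches up and empties the buffer
    return buffered


# The close announcement always comes at least one poll after the last data.
events = ["data", "data", "data", "data", "closed"]

fast = unread_at_close(events, polls_per_read=1)  # reads within every cycle
slow = unread_at_close(events, polls_per_read=3)  # busy processing in between
```

    With these inputs the fast reader has nothing left in the buffer when the close arrives, while the slow reader still has one packet waiting, which is the case the design does not handle.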

    Do any OS9 experts have examples or advice on how to deal with "closed but readable" SCF devices? If there is a proper way to do it, we should follow it.

     
  • Bill Pierce

    Bill Pierce - 2014-05-28

    Thanks for clarifying, Aaron. As an example, here is what my software is doing:

    The program opens an "/Nx" port
    then dispatches a dw cmd sequence
    once the cmd is verified as valid, the program goes into an endless loop

    <loop>
    check for data ready or EOT
    read dw buffer
    process received data
    <end loop>

    Now, when the last packet is sent by the server while my program is busy with data processing, OS9 hits an interrupt loop. OS9 then sees I have read the previous buffer and prereads the last buffer before my program makes the last read call. In the process of filling the last buffer, OS9 receives the EOT (signal 4). On receiving the EOT, a "condem" flag is set. SCF then sees the flag and checks whether the port is busy. My program is in a processing loop, so it's not reading at the moment, and SCF, seeing the port not busy, closes the port.

    Then my program comes back from processing its data to find the port closed and the last data packet unavailable.

    I have set up error traps at each stage to check at what point the port gets closed, and it's definitely closing before my program tries to read the last packet. The DW4 logs show the packet as "sent successfully" and EOT sent. This has been checked on dw cmd sequences in which I know the packet count and the returned data is known.

    The data processing loop is pretty fast and stores the buffer content to a memory array, so there's not a big time delay when this happens. I am reading the full buffer contents on each read, therefore, after each read, the buffer is left empty.

    Any idea how to keep scdwv and scf from closing the port?

    This is hampering FTP file downloads, raw file reads from the server, and various dw4 status listings. This does not have anything to do with "rbdw" and v-disk operations and pertains only to the v-ports (/Nx). If the program could do a continuous read until closing, this would not happen. It also doesn't happen when I write each received buffer to a disk file. I assume this is because the disk routines are blocking OS9's interrupts, so my program gets back before the interrupt and EOT occur. Ultimately, a continuous read would be ideal, but when reading a 200k file from an FTP, a 64k process has to do some shuffling to deal with that much data.

    This kind of behaviour will hamper any kind of software development for file transfers between machines or even file downloads through telnet BBSs.

     
  • Aaron Wolfe

    Aaron Wolfe - 2014-06-30

    This issue is actually resolved, I just forgot that we already fixed it.

    To deal with the port closing vs buffer read issue:

    1: Have your process install a signal handler that intercepts S$HUP

    2: In your read routine, do not consider the job done until A, your process has received S$HUP and B, you get a 0 response from getstat.ssready

    The signal S$HUP is sent to any process with an open handle to a port when the server notifies OS9 that the port has been closed. This does not mean you cannot continue to read from the local buffer. However, it does mean that without a signal handler in place to intercept this signal, your process will be condemned and likely die an untimely death.

    By catching the S$HUP signal, your process will not be condemned and can instead use this as a notification that the port has been closed, but continue to read until the buffer is empty (i.e. ssready returns 0).

    An example of this can be found in level1/cmds/dw.as
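
    This is not the code from dw.as, but the two-step rule above can be sketched in Python as a rough model of the control flow. Everything here is a stand-in: `FakePort`, `ss_ready`, `DrainingReader`, `on_hup` and `step` are hypothetical names for this illustration, and `ss_ready` simply returns 0 on an empty buffer to mirror the "0 response" described above rather than any real getstat call.

```python
class FakePort:
    """Stand-in for an /Nx port; not a real NitrOS-9 or DriveWire API."""

    def __init__(self, buffered):
        self._buffered = list(buffered)  # chunks already in the local buffer

    def ss_ready(self) -> int:
        """Number of chunks waiting; 0 means the local buffer is empty."""
        return len(self._buffered)

    def read(self) -> bytes:
        """Hand over the oldest buffered chunk."""
        return self._buffered.pop(0)


class DrainingReader:
    """Reads until the port has been closed AND the buffer is empty."""

    def __init__(self, port):
        self.port = port
        self.hup_seen = False  # set by the signal handler (step 1)
        self.chunks = []       # data collected so far

    def on_hup(self) -> None:
        """Signal handler: note the close instead of letting the process die."""
        self.hup_seen = True

    def step(self) -> bool:
        """Do one pass of the read routine; return True when the job is done."""
        if self.port.ss_ready() > 0:
            self.chunks.append(self.port.read())
            return False
        # Buffer empty: finished only if the close has also been signalled.
        return self.hup_seen


port = FakePort([b"next-to-last", b"last"])
reader = DrainingReader(port)
reader.on_hup()  # the server closes the port while two chunks are still buffered

while not reader.step():
    pass  # a real program would process each chunk here
```

    The point of the sketch is the termination test in `step`: an empty buffer alone is not the end of the transfer, and the hang-up alone is not either. Only both together end the loop.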

     
