Share

The Linux 8250/16550 Serial Driver

Tracker: Bugs

5 serial.c chokes on big, fast UARTs - ID: 408837
Last Update: Comment added ( studog )

The serial driver does not properly handle high speed
UARTs with very big buffers. This is important if you
have for example a PCMCIA Ricochet modem.

This is because the driver in serial.c reads characters
out of the UART before checking the TTY FLIPBUF
structure for free space. If there is no space, the
driver "punts" and tosses the character. The driver
ought to check for free space first. History buffs:
this problem was introduced by at least linux-0.99.4
(January 1992?) and seems to apply to every version
since then.

A patch for serial 5.05 is attached. This patch will
also work okay on linux-2.4. More information and Linux
kernel specific patches can be found at
http://home.nyu.edu/~gmp216/nrm6842/bigfastuart.html.


Greg Pomerantz ( gmp ) - 2001-03-15 17:32

5

Open

None

Nobody/Anonymous

None

None

Public


Comments ( 7 )




Date: 2001-07-04 22:19
Sender: studog

Logged In: YES
user_id=41654

Assume a 450, 1 byte "fifo". Leaving the char unread is
100% data loss. (I assume a constant inflow.)
Assume a 650, 64 byte fifo, rx trigger at 60. At 9600 (8n1)
you get one new char about every 1 ms, leaving 4 ms to
get some tty space. At 115200, you get a new char every
50 us, leaving .2 ms.

Granted you might not use a trigger of 60.

Worst case, the risk of data loss of leaving the char unread
in the hardware is quite high. Not necessarily 100%, but
high enough. The driver is written to guard against worst
case performance. I think. Ted wrote it, but that's my
interpretation.

Show me a uart that has a 4096 (PAGE_SIZE) fifo. Actually,
I believe the tty layer has two of these to fill up.

...Stu


Date: 2001-07-04 22:07
Sender: gmp

Logged In: YES
user_id=174545

Stu, leaving the characters unread in the uart is the only
possible option in a situation like this. Currently, the
driver chooses to always lose a byte rather than run the
risk (however small) that the hardware will lose a byte.
The current policy prefers a 100% likelihood of data loss
to an unknown risk of data loss. As a result, perfectly
good serial devices that operate entirely within spec are
unusable. As the driver is written, it can never work
properly when the serial device has a larger buffer than
the tty the driver is bound to.

p.s. Just because Metricom went chapter 11 is no excuse not
to put this patch in :)



Date: 2001-07-04 16:11
Sender: studog

Logged In: YES
user_id=41654

Checking for space before will leave the chars unread in
the uart, causing hardware overruns eventually. The serial
driver is written to prevent hardware errors if at all
possible, which is why we read and then check.


Date: 2001-07-04 16:08
Sender: studog

Logged In: YES
user_id=41654

Checking for space before will leave the chars unread in
the uart, causing hardware overruns eventually. The serial
driver is written to prevent hardware errors if at all
possible, which is why we read and then check.


Date: 2001-06-29 16:31
Sender: albion

Logged In: YES
user_id=106291

The patches help on a Toshiba Tecra 500CDT, but there still seems to be
lost data. It's an older laptop and
probably slower. I'm running Debian with kernel 2.2.19pre17.



Date: 2001-04-28 02:32
Sender: gmp

Logged In: YES
user_id=174545

Ian, I have some comments on your suggested patch. The
executive summary is that I think both patches are good but
imperfect. Unfortunately, doing this properly may require
some redesign of the tty interface, or maybe the
introduction of some other interface for high speed serial
devices. Also, the following discussion is academic, as I
haven't actually attempted to hit any of the boundary
conditions. I have run your patch, and it appears to work
just as well as mine for my purposes.

Your patch has the definite advantage of reducing system
load in the case where the serial port is getting
continually bombarded. The problem with your suggestion is
that, even on an unloaded system, there is a possible
boundary condition where the interrupt is called while
TTY_DONT_FLIP is set. That will always result in lost data,
even if the UART buffer was not in danger of overflowing and
characters could have been left there longer. Whereas my
scheme may result in interrupt thrashing and very high
system load, no bytes are lost. Your scheme is clearly
better when the flip buffer is full because the system has
not had time to clear it yet. With my patch, the increased
system load will exacerbate the problem. I think my scheme
is better with a bursty interface that in the long run the
system is able to keep up with. Also, my system is better
when the unflipability has nothing to do with system load
and is just an unhappy coincidence (i.e. TTY_DONT_FLIP is
briefly set).

One option is to mask the interrupt when TTY_DONT_FLIP is
set, and define the interface so that a call to
flip.tqueue_routine() will always flip when called from the
interrupt service routine. Another approach that might be
workable is, in the case when the interrupt is called and
there is no flip buffer space, to queue a call to
receive_chars which will occur when flip buffer space
becomes available. You can mask serial interrupts from the
interrupt service routine, and unmask them in flush_to_ldisc
or receive_buf.

Unfortunately, I don't have much time to work on this, and
no real way to debug it. The main problem is that "it works
for me" :) As I see it, either patch is acceptable for the
time being, although I think I will continue to use mine to
avoid the rare possibility of a lost character.




Date: 2001-04-20 10:07
Sender: ijabbott

Logged In: YES
user_id=141198

Hi Greg!

I'm a little worried about what happens when the
tty->flip.tqueue.routine called by the patch fails (no
flip). The patch then causes receive_chars() to return
immediately. It would seem that the calling interrupt
routine would receive_chars() (and hence
tty->flip.tqueue.routine()) several times
(RS_ISR_PASS_LIMIT) without much being done. The interrupt
routine would then give up and return back to the main Linux
interrupt handling code.

If the interrupt is level-sensitive (e.g. it's from a PCI
card), then as soon as the interrupt handler ends, another
interrupt will be generated and the same thing will happen
over again, i.e. you could end up with the system being in
the interrupt handler pretty much all the time.

One compromise would be to give the tty->flip.queue.routine
one chance only as in this modified version of your patch:

-----------------------------[8<]--------------------------
diff -ur serial-5.05/serial.c serial-5.05ija/serial.c
--- serial-5.05/serial.c Thu Sep 14 17:40:26 2000
+++ serial-5.05ija/serial.c Fri Apr 20 10:57:47 2001
@@ -564,6 +564,9 @@

icount = &info->state->icount;
do {
+ if (tty->flip.count >= TTY_FLIPBUF_SIZE) {
+ tty->flip.tqueue.routine((void *) tty);
+ }
ch = serial_inp(info, UART_RX);
if (tty->flip.count >= TTY_FLIPBUF_SIZE)
goto ignore_char;
-----------------------------[>8]--------------------------

A better alternative would be to disable the receive
character interrupt too and to have the calling interrupt
routine take note of this. The tricky bit would be working
out when to enable it again, which needs a little thought to
get right.

Regards,
Ian Abbott.


Log in to comment.




Attached File ( 1 )

Filename Description Download
bfu-5.05.patch patch for serial-5.05. Download

Change ( 1 )

Field Old Value Date By
File Added 4275: bfu-5.05.patch 2001-03-15 17:32 gmp