From: Sam S. <sd...@gn...> - 2008-12-24 15:17:23
Don Cohen wrote:
> Sam Steingold writes:
> > Don Cohen wrote:
> > > > if you have a reproducible case, I am interested.
> > > It happens regularly, so I can reproduce it in that sense.
> > > If I record the relevant packets then I suppose it should
> > > be possible to reproduce it on demand by some approximation
> > > of replay. Or perhaps it would suffice for a start to simply
> > > describe the packet(s) that cause the error. Shall we start
> > > with that?
>
> > I would prefer something like "start clisp, open socket server,
> > connect to it from another shell using telnet, type '...' to
> > telnet, kill telnet, type '...' to clisp, observe the error".
>
> I understand, but this may be more difficult to produce, so let's
How about you compile clisp with DEBUG_OS_ERROR defined? That way you will see
which line in which file signaled the error.
Or, better yet, configure --with-debug, run clisp under gdb, set a breakpoint
in prepare_error, and send the backtrace here.
I suspect that the error is signaled by listen_char(), which is called by
socket-status to ensure that a whole Unicode char is actually available if a
byte is.
The code is:
if (FD_ISSET(in_sock,readfds) || (stream_isbuffered(sock) & bit(1)))
rd = (char_p ? listen_char(sock) : listen_byte(sock));
There is no error on in_sock (that has been checked), so apparently there is a
race condition here: an error (ECONNRESET) arrives _after_ select() but
_before_ listen_char() can finish its work.
> start with this: The errors seem to come (many times recently) from
> packet traces that all look substantially the same:
>
> 11:40:18.824273 00:90:69:8a:f0:5d > 00:30:1b:2c:c9:cf, ethertype IPv4 (0x0800),
> length 78: IP 216.240.130.195.49199 > 64.27.16.100.http: S 2265407414:22654074
> 14(0) win 65535 <mss 1460,nop,wscale 1,nop,nop,timestamp 3704081566 0,sackOK,eol>
> 0x0000: 4500 0040 c003 4000 3f06 cf81 d8f0 82c3 E..@..@.?.......
> 0x0010: 401b 1064 c02f 0050 8707 5fb6 0000 0000 @..d./.P.._.....
> 0x0020: b002 ffff 3a2a 0000 0204 05b4 0103 0301 ....:*..........
> 0x0030: 0101 080a dcc7 cc9e 0000 0000 0402 0000 ................
> [tcp syn]
Sorry, this looks like total gibberish to me.
However, I admire people who understand these hex codes even more than I
admire people who can read assembly.