#285 Large SSL GET/POST transactions are sometimes truncated

open
5
2009-07-15
2009-07-15
John Caruso
No

When using the https* client routines (e.g. ns_httpsget) to get a page from a server, SSL transactions are sometimes truncated. This happens with an AOLserver 4.5.1 and nsopenssl 3.0beta26 client and server. It also happens if the client is running AOLserver 3.4.2 / nsopenssl 2.x, but the server *must* be running AOLserver 4 / nsopenssl 3.0beta26. We've tested and verified this bug on Red Hat Enterprise Linux 4 and 5 and Mac OS X.

The problem does not happen if the ns_httpsXXX call is replaced by ns_httpXXX (i.e., if HTTP is used rather than SSL).

Here's sample output from a page that demonstrates the problem (using an RHEL4 client/server):

-------------------------------------------------
Page Size | #Errors (out of 1)
70000 bytes | 0 errors
71000 bytes | 0 errors
72000 bytes | 0 errors
73000 bytes | 0 errors
ERROR: Expected 74000 bytes, but got 73728 bytes (off by 272 bytes)
74000 bytes | 1 errors
ERROR: Expected 75000 bytes, but got 73728 bytes (off by 1272 bytes)
75000 bytes | 1 errors
ERROR: Expected 76000 bytes, but got 73728 bytes (off by 2272 bytes)
76000 bytes | 1 errors
ERROR: Expected 77000 bytes, but got 73728 bytes (off by 3272 bytes)
77000 bytes | 1 errors
ERROR: Expected 78000 bytes, but got 73728 bytes (off by 4272 bytes)
78000 bytes | 1 errors
ERROR: Expected 79000 bytes, but got 73728 bytes (off by 5272 bytes)
79000 bytes | 1 errors
ERROR: Expected 80000 bytes, but got 73728 bytes (off by 6272 bytes)
80000 bytes | 1 errors
ERROR: Expected 81000 bytes, but got 73728 bytes (off by 7272 bytes)
81000 bytes | 1 errors
82000 bytes | 0 errors
83000 bytes | 0 errors
84000 bytes | 0 errors
85000 bytes | 0 errors
-------------------------------------------------

In this case, the errors started at a page size of 74000 bytes and continued through a page size of 81000 bytes. The specific sizes that produce errors vary from test to test and from platform to platform, but we haven't found any platforms that work consistently at all sizes.
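The shortfalls in the output above can be rechecked with simple arithmetic. The following is only an illustrative Python sketch, not the attached sslbug.tcl; the function name `report` is hypothetical, and the message format is copied from the output above. It also shows that the 73728 bytes received in every failing case is exactly 72 * 1024; whether that boundary is significant is not established here.

```python
def report(expected: int, received: int) -> str:
    """Format one result in the style of the test page output (hypothetical helper)."""
    if received == expected:
        return f"{expected} bytes | 0 errors"
    missing = expected - received
    return (f"ERROR: Expected {expected} bytes, but got {received} bytes "
            f"(off by {missing} bytes)")


# Byte count observed in every failing case in the output above.
RECEIVED_ON_FAILURE = 73728

if __name__ == "__main__":
    # Recompute the error lines for the failing page sizes 74000..81000.
    for size in range(74000, 82000, 1000):
        print(report(size, RECEIVED_ON_FAILURE))
    # Quotient and remainder when dividing the received count by 1024.
    print(divmod(RECEIVED_ON_FAILURE, 1024))
```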

The attached file (sslbug.tcl) demonstrates the problem; just copy this file to the top-level context of a web server running AOLserver 4.x/nsopenssl 3.0beta26 and then navigate to https://<server>/sslbug.tcl, and you'll see something like the output above. If you comment out the ns_httpsget and use ns_httpget instead, you'll see that the bug disappears.

The only requirement for the bug appears to be an AOLserver 4 / nsopenssl 3.0beta26 server, with any AOLserver client. In particular, we were unable to reproduce the bug in these scenarios:

- AOLserver client talking to an Apache server
- AOLserver client talking to a Java server
- wget client talking to an AOLserver server
- Firefox/IE client talking to an AOLserver server

Also, this happens either within the same AOLserver/nsopenssl client and server or with two different ones.

Instrumenting the code shows that what's happening in the error cases is that ns_sockselect is being called and is indicating that the SSL socket is ready for reading, even though there are 0 bytes available to read--as you can see from this debugging output:

[15/Jul/2009:13:10:43][6126.9034656][-default:5-] Debug: _ns_https_readable: after ns_sockselect(openssl14), sel='openssl14 {} {}', nread=0

The subsequent read of the zero readable bytes causes the client routine to conclude that the transaction is finished. However, it doesn't matter if the client routines continue waiting instead--the missing/truncated data will NEVER show up on the socket. The only "workaround" we've found is to insert a delay before the data starts being read by ns_https{post,get}; in our testing, 300 milliseconds resolves the problem consistently on RHEL4, and a mere 1 millisecond delay resolves the problem on OS X (but note that this ONLY works if the client hasn't read any data yet; if it reads any data at all, no amount of sleeping will prevent the bug from occurring). While this demonstrates that the problem is somehow timing-related, it's of course not a fix.
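To make the "readable, but 0 bytes" condition concrete, here is a minimal Python sketch using plain (non-SSL) sockets. It does not reproduce the nsopenssl bug and is not the AOLserver client code; it only simulates a sender that delivers fewer bytes than promised, and shows a reader that compares what it got against a known expected length instead of treating a zero-byte read as a normal end of transaction. The names `read_body` and `simulate_truncated_transfer` are hypothetical, and the expected length is assumed to be known in advance, as it is in the test page.

```python
import select
import socket
import threading


def read_body(sock: socket.socket, expected: int, timeout: float = 5.0) -> bytes:
    """Read until `expected` bytes arrive, the peer closes, or select times out."""
    chunks = []
    received = 0
    while received < expected:
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            break  # timed out: no more data showed up
        chunk = sock.recv(min(16384, expected - received))
        if not chunk:
            break  # socket was readable but yielded 0 bytes: peer closed
        chunks.append(chunk)
        received += len(chunk)
    return b"".join(chunks)


def simulate_truncated_transfer(sent: int, expected: int) -> int:
    """Send `sent` bytes over a socket pair, then close; return bytes received."""
    server, client = socket.socketpair()

    def send_then_close() -> None:
        server.sendall(b"x" * sent)
        server.close()

    sender = threading.Thread(target=send_then_close)
    sender.start()
    body = read_body(client, expected)
    sender.join()
    client.close()
    return len(body)


if __name__ == "__main__":
    got = simulate_truncated_transfer(sent=73728, expected=74000)
    print(f"Expected 74000 bytes, but got {got} bytes (off by {74000 - got} bytes)")
```

The point of the sketch is only the length comparison at the end: a reader that stops on the first zero-byte read cannot tell a complete response from a truncated one unless it checks the byte count against what it expected.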

So basically, it appears that data may be lost/truncated for an SSL socket when a client starts reading the data before all of it has been received.

Discussion

  • John Caruso

    John Caruso - 2009-07-15

    By the way, I say "truncated" because our testing indicates that the missing data all comes from the end of the transaction--so for the 74000-byte test above, the first 73728 bytes are received by the client but the last 272 bytes are lost/truncated.
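One way to check that only the tail is missing is to use a payload whose bytes depend on their position and test whether the received data is a strict prefix of it. This is an illustrative Python sketch under that assumption; `make_payload` and `is_tail_truncation` are hypothetical names, and the payload pattern is not the one used by sslbug.tcl.

```python
def make_payload(size: int) -> bytes:
    """Position-dependent bytes, so a gap in the middle changes what follows it."""
    return bytes(i % 251 for i in range(size))


def is_tail_truncation(expected: bytes, received: bytes) -> bool:
    """True when `received` is a strict prefix of `expected` (only the end is lost)."""
    return len(received) < len(expected) and expected.startswith(received)


if __name__ == "__main__":
    payload = make_payload(74000)
    # Same shape as the 74000-byte case: first 73728 bytes arrive, last 272 do not.
    print(is_tail_truncation(payload, payload[:73728]))
    # Losing 272 bytes from the middle instead is not a tail truncation.
    print(is_tail_truncation(payload, payload[:1000] + payload[1272:]))
```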

     
