Menu

#66 Fails to build with OpenSSL 1.1

open
nobody
None
5
2017-01-17
2016-10-30
No

Hi!

As the new major OpenSSL version is out (and is being actively pushed into next Debian release) I'd like to build tcltls with it, but currently it fails due to some significant changes in OpenSSL API.

I've managed to create a patch which ports tcltls to the new API (and doesn't lose compatibility with the old one, tested with 1.0.2). The changes are mostly trivial, but there are two places I'd like someone who's familiar with Tcl and/or OpenSSL internals better to review and probably fix.

The first suspicious change is this:
--- a/tlsIO.c
+++ b/tlsIO.c
@@ -936,6 +936,12 @@
dprintf(stderr,"CR! ");
errorCodePtr = ECONNRESET;
return -1;
+ } else if (rc == SSL_ERROR_SYSCALL && Tcl_GetErrno() == 0) {
+ /
Without this clause test tlsIO-2.10 (close on accept,
+ accepted socket lives) hangs /
+ dprintf(stderr,"Syscall error but zero error code from Tcl! ");
+
errorCodePtr = ECONNRESET;
+ return -1;
}
if (statePtr->flags & TLS_TCL_SERVER) {
err = SSL_get_verify_result(statePtr->ssl);

I've added it because of a hang in test tlsIO-2.10 (after a client immediately closes the opened connection, the server went to an infinite loop where SSL kept getting SSL_ERROR_SYSCALL (error from an underlying channel layer, i.e. Tcl) and Tcl_GetErrno() equals zero (no error). I'm not sure what it was (and 1.0.2 doesn't do that).

The second suspicious change was the following (among other similar ones):

In file tlsBIO.c
case BIO_C_GET_FD:
- if (bio->init) {
+ if (BIO_get_init(bio)) {
ip = (int )ptr;
if (ip != NULL) {
-
ip = bio->num;
+ //ip = bio->num;
+
ip = 0;
}
- ret = bio->num;
+ //ret = bio->num;
+ ret = 0;
} else {
ret = -1;
}
break;

Since there's no more way to access bio->num fiels from outside the libcrypto itself, I've just gave up and use zero instead of this num field. Probably, there could be a better way to deal with it.

The patch is attached.

1 Attachments

Discussion

  • Jeremy Sowden

    Jeremy Sowden - 2016-10-31

    I believe one can access bio->num via BIO_get|set_fd. Patch attached.

    J.

     

    Last edit: Jeremy Sowden 2016-10-31
  • Sergei Golovan

    Sergei Golovan - 2016-10-31

    It seems not. The BIO_set_fd() function essentially calls back BioCtrl(), and this is unfortunate, because it happens first time in BIO_new_tcl() when its state variable statePtr is still uninitialised. And SIGSEGV as a result.

     
  • Jeremy Sowden

    Jeremy Sowden - 2016-10-31

    The tcl BIO doesn't actually need to do any extra initialization, so BioNew can be empty. Things look a bit more complicated with BioFree, however. I'll take a proper look later this evening.

    J.

     
  • Jeremy Sowden

    Jeremy Sowden - 2016-10-31

    The tcl BIO is derived from the fd one. BIO_C_GET_FD and BIO_C_SET_FD are only meaningful for methods based around file-descriptors and are unimplemented by the other methods. Therefore, I reckon you can just remove any reference to bio->num altogether. Patch attached.

    J.

     
  • Jeremy Sowden

    Jeremy Sowden - 2016-11-01

    I started having a look at the other change. What's happening is that by the time SSL_accept is called the client has already closed the connexion and SSL gets a zero-length read during the handshake. In 1.0, that zero was returned to the caller. Now, we get -1 and need to use SSL_get_error to distinguish the try-again case for a non-blocking socket. We don't have a non-blocking socket and end up with a rather uninformative error code instead. The man-page hasn't been updated since 1.0:

       SSL_ERROR_SYSCALL
          Some I/O error occurred.  The OpenSSL error queue may contain more
          information on the error.  If the error queue is empty (i.e.
          ERR_get_error() returns 0), ret can be used to find out more about
          the error: If ret == 0, an EOF was observed that violates the
          protocol.  If ret == -1, the underlying BIO reported an I/O error
          (for socket I/O on Unix systems, consult errno for details).
    

    which is not helpful. I do wonder whether this behaviour is intended, but even if not we still need to handle it. I'll do some more digging this evening.

    J.

     
  • Jeremy Sowden

    Jeremy Sowden - 2016-11-08

    Apologies for the delay. Took another look and don't have anything really to add. Afaics, you're doing the right thing: if the socket at the other end is closed, SSL_accept and SSL_connect now return -1 and SSL_get_error returns SSL_ERROR_SYSCALL; if errno is not already set, then assuming a closed socket and setting it to ECONNRESET seems like the best option.

    Probably worth reporting the documentation bug upstream.

    J.

     
  • Sergei Golovan

    Sergei Golovan - 2016-11-11

    Thank you very much for the analysis. I'm attaching the patch with your modifications applied.

     
  • Jeff Lawson

    Jeff Lawson - 2017-01-17

    Can you test out the new TclTLS 1.7.11 that is now officially hosted at http://core.tcl.tk/tcltls/index ? It incorporates a number of improvements to modernize and fix build issues.