TLS: OpenSSL Tcl Extension / Bugs / #38 Stalled async reading, and empty reads

Erik Leunissen - 2008-04-18

tls_issues.tar.gz

Alexandre Ferrieux - 2008-05-05

Erik, maybe you could bump the prio here too. I cannot, having no status in TLS :-}

Erik Leunissen - 2008-05-09

OK, set priority to 8 (= the same priority as was assigned to the corresponding issue in the Tcl tracker).

Erik Leunissen - 2008-05-09

priority: 5 --> 8

Erik Leunissen - 2009-06-08

The sample file used thus far to exercise "the choking channel" has received company ...

The tarball "more_stalling_gifs.tar.gz" holds four more of them.

Each of these gif images gets stuck at a different byte offset (i.e. the notifier not responding to the last bytes having arrived).

When the exercise for any of these files is repeated, the byte offset where the stalling occurs is always exactly the same.

All tests were performed with the scripts already attached here, on Linux with Tcl 8.5.7.

Erik Leunissen - 2009-12-11

Found another gif that got stuck. Added as "stalling_6.gif"

Erik Leunissen - 2010-06-16

The stalling reads issue was explored at the script level. The exploration revealed a consistent pattern determined by three variables: the length of the byte sequence sent to the recipient, the buffer size of the channel, and the block size used for reads from the channel. Additionally, a still unexplained constant of 16384 bytes plays a role in the pattern. The rules by which this constant and these variables interact to produce the observed pattern are described in detail in section B of the report; see the attached file report.pdf.

With these results, the problem domain has been narrowed down in terms of programming constructs at the script level. This shows that the problem is much more generic than the set of gif files that have exhibited the problem thus far.

Whether stalling will occur, and at exactly which position in the byte sequence, has proven to be predictable to a very large degree, provided that a read size is specified for the read commands. Only when (non-empty) short reads occur is the predictability subverted.

Until the issue is fixed, the pattern rules can be exploited to work around the misbehaviour. Work-arounds, as well as other implications, are discussed in section C of the report.

The results of the script-level exploration, combined with a superficial inspection of the C code of the TLS extension, led to the following proposed explanation of the pattern at the C level (see section E of the report for details and reservations):

*Stalling occurs on those specific occasions when a successful script-level read command caused an internal buffer to become exactly full or exactly empty.*
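
To make the hypothesis concrete, here is a toy model in Python (not the Tcl scripts from the tarball, and not the rule set from section B of report.pdf, which is not reproduced in this thread). It assumes, purely for illustration, that data passes through an internal buffer of 16384 bytes and that every read returns a full block. Under those assumptions it lists the read offsets at which a read would leave that buffer exactly empty while more data is still to come. The function name and the model itself are hypothetical. As a side note, 16384 bytes is also the maximum plaintext size of a TLS record, though nothing in this thread confirms that this is where the constant comes from.

```python
def exact_boundary_offsets(length, block, chunk=16384):
    """Toy model of the hypothesis, NOT the report's actual rules.

    length: total number of bytes sent to the reader
    block:  number of bytes requested by each read
    chunk:  assumed size of the internal buffer (hypothetical default)

    Returns the cumulative read offsets that land exactly on a multiple
    of `chunk` before the end of the data, i.e. the points where a read
    would drain the assumed internal buffer exactly.
    """
    if length < 0 or block <= 0 or chunk <= 0:
        raise ValueError("length must be >= 0; block and chunk must be > 0")

    offsets = []
    pos = 0
    while pos < length:
        # Assume every read returns a full block, except possibly the last.
        pos = min(pos + block, length)
        if pos < length and pos % chunk == 0:
            offsets.append(pos)
    return offsets


if __name__ == "__main__":
    # 4096 divides 16384, so every fourth read lands on a boundary.
    print(exact_boundary_offsets(40000, 4096))
    # 1000 never lines up with 16384 within 40000 bytes.
    print(exact_boundary_offsets(40000, 1000))
```

The model deliberately ignores short reads, which the report singles out as the case where predictability breaks down.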

Along with this report comes an updated distribution tls_issues.2.tar.gz which holds the tools to verify the general outcomes of this exploration. May they provide directions for any further search for the cause of the misbehaviour.

Erik Leunissen - 2010-06-16

Removed the gif files that up to now were the only witnesses of stalling behaviour. With the outcomes of the latest exploration (see the file report.pdf), you can now easily create and exercise your own stalling byte sequences in a much more generic and versatile way.

Erik Leunissen - 2010-06-17

report.pdf

Erik Leunissen - 2010-06-23

Testing environment, belonging to report.pdf

tls_issues.3.tar.gz

Erik Leunissen - 2010-06-23

Added corrected distribution tls_issues.3.tar.gz

Alexandre Ferrieux - 2010-06-23

Excellent report and tools!
Now trying 'exercise 11 10 10' as suggested in the report, I see a random outcome: sometimes it stalls, sometimes not. I get both outcomes under strace as well, and the comparison shows that the main difference is that in the non-stalling case several recv() calls return EAGAIN, while in the stalling case none do. FWIW.
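
For readers less familiar with the strace output: EAGAIN is what a non-blocking recv() reports when no data is queued at that moment, which is different from an empty read at end of file. The following sketch, in Python rather than the Tcl/C of the TLS extension, shows the two cases on a plain local socket pair. No TLS is involved, and the helper names `recv_nonblocking` and `demo` are made up for this illustration.

```python
import select
import socket


def recv_nonblocking(sock, size):
    """Return (data, would_block) for one non-blocking recv() attempt."""
    try:
        return sock.recv(size), False
    except BlockingIOError:
        # The underlying recv() failed with EAGAIN/EWOULDBLOCK:
        # nothing to read right now, but the peer has not closed.
        return b"", True


def demo():
    writer, reader = socket.socketpair()
    reader.setblocking(False)
    results = []
    try:
        # Nothing has been sent yet: the read would block (EAGAIN).
        results.append(recv_nonblocking(reader, 4))

        writer.sendall(b"hello")
        select.select([reader], [], [], 1.0)  # wait until data is readable
        results.append(recv_nonblocking(reader, 4))  # full block
        results.append(recv_nonblocking(reader, 4))  # short, non-empty read
        results.append(recv_nonblocking(reader, 4))  # drained: EAGAIN again

        writer.close()
        select.select([reader], [], [], 1.0)  # wait for the EOF condition
        # Empty read WITHOUT EAGAIN: this one means end of file.
        results.append(recv_nonblocking(reader, 4))
    finally:
        writer.close()
        reader.close()
    return results


if __name__ == "__main__":
    for data, would_block in demo():
        print(data, "EAGAIN" if would_block else "ok")
```

An event-driven reader normally keeps reading until it hits the would-block case, then waits for the next readable notification. That makes the absence of any EAGAIN in the stalling trace a notable difference.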

Alexandre Ferrieux - 2010-06-28

A few more observations. Still trying to make the minimal [exercise 11 10 10] a sure-fire reproducer, I played with priorities. Nicing the client and boosting the server (as root) brings the stall rate to 80%. Adding a third process busy-looping at normal priority brings the rate to 100%.

Given that slowing down the client relative to the server helps, it is likely that the stall would still occur while single-stepping the client in gdb (after removing the timeout).

I'm saying this in the hope that somebody who knows TLS will do exactly that ;-)
