From: Stanislav M. <st...@me...> - 2008-08-06 09:09:21
Hi,

A few other users and I are seeing sshd fail with

    Couldn't obtain random bytes (error 604389476)

and other SSL-related applications fail randomly in User Mode Linux guests. I suspect a problem in OpenSSL that was triggered by some change in UML.

I reviewed the RAND_poll function in rand_unix.c (statically; I have no time to build a debug version right now) and have the following suspicions:

===
For Linux:

    int r;    /* uninitialized: this has random bytes from the stack */
    ...
    if (poll(&pset, 1, usec / 1000) < 0)
        usec = 0;
    else
        try_read = (pset.revents & POLLIN) != 0;
    ...

Let's say that the poll timed out (i.e. returned 0). Then try_read remains 0 and r still holds garbage.

    while ((r > 0 || (errno == EINTR || errno == EAGAIN)) && ...

Let's say that the garbage was negative. We are out of the loop, and errno holds bogus data (a successful or timed-out poll did not set anything).

===
For other Unices there is an additional problem:

If select() returns successfully and immediately, it can leave the time argument unchanged, because no time was spent sleeping (which is IMHO fully legal if it finds the bytes immediately). If the read then does not get all the needed bytes, the code

    if (usec == 10*1000)
        usec = 0;

kicks in and we are out of the loop again.

Suggested changes:
- add r = -1; inside the do loop, after the int try_read = 0;
- change if (usec == 10*1000) into if (r < 0 && usec == 10*1000)

Regards
--
Stano
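To illustrate the first suggested change, here is a minimal standalone sketch of a poll-then-read loop in which r is reset at the top of every iteration. It is not the OpenSSL code: gather_bytes, its parameters, and the transient flag are hypothetical names made up for this example, and the time-budget handling (usec) from rand_unix.c is left out. As a small variation on the suggestion, errno is consulted only right after a call that actually failed, so a timed-out poll() can never make the loop condition depend on a stale errno.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not OpenSSL code): collect up to `want` bytes from
 * `fd`, waiting at most `timeout_ms` in each poll() call.
 * Returns the number of bytes actually stored in `buf`. */
static size_t gather_bytes(int fd, unsigned char *buf, size_t want,
                           int timeout_ms)
{
    size_t n = 0;
    int keep_going;

    do {
        ssize_t r = -1;     /* suggested fix: r never holds stack garbage */
        int transient = 0;  /* set only when a call failed with EINTR/EAGAIN */
        struct pollfd pset;
        int rc;

        pset.fd = fd;
        pset.events = POLLIN;
        pset.revents = 0;

        rc = poll(&pset, 1, timeout_ms);
        if (rc < 0) {
            /* poll() itself failed: errno is meaningful here */
            transient = (errno == EINTR || errno == EAGAIN);
        } else if (rc > 0 && (pset.revents & POLLIN)) {
            r = read(fd, buf + n, want - n);
            if (r > 0)
                n += (size_t)r;
            else if (r < 0)
                transient = (errno == EINTR || errno == EAGAIN);
        }
        /* rc == 0 means timeout: r stays -1, transient stays 0, so we stop */

        keep_going = (r > 0 || transient) && n < want;
    } while (keep_going);

    return n;
}
```

With an empty pipe the function returns 0 after one timeout instead of acting on an uninitialized r, and a short read is followed by one more poll() before giving up.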