Re: [Sshpass-devel] SIGCHLD blocking?

On 05/07/2018 20:53, Martin Galvan wrote: 

> Hi Shachar, thanks for the answer.
> 
> 2018-07-05 14:39 GMT-03:00 Shachar Shemesh <sh...@sh...>:
> 
>> In sshpass's particular case, however, you are right - there is a race in the use case you mention. The reason there is a race is not that the child might send a SIGCHLD before we block it. It is that we call "waitpid" after we call pselect (thus, if the signal arrives before we block it, we won't get around to releasing the process until the select returns for whatever other reason).
> 
> Why would pselect return, other than if there happens to be some
> output on the term? I don't fully understand the PTY magic here, so
> bear with me.

I suggest you read about how pselect works. It will return if there is
output on the TTY or if a signal arrives while it's sleeping. It will
also return if the process exited and closed the TTY (the TTY would then
be available for read, with the read returning 0 bytes to indicate the
other side has closed). 
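
The "closed descriptor wakes pselect" behaviour can be seen in a small sketch. It uses a pipe as a stand-in for the TTY, which is an assumption made to keep the example short; a PTY master may report hangup differently on some systems (for instance as a read error rather than a 0-byte read). The names wait_then_read and demo_eof_wakes_pselect are hypothetical and do not appear in sshpass.

```c
#include <stdio.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Wait until fd is readable, then read from it.
 * Returns what read() returned, or -2 if pselect failed or timed out. */
static long wait_then_read(int fd)
{
    fd_set readfds;
    struct timespec timeout = { 5, 0 };  /* safety net only */
    char buf[64];

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* NULL sigmask: leave the signal mask untouched for this demo. */
    if (pselect(fd + 1, &readfds, NULL, NULL, &timeout, NULL) <= 0)
        return -2;

    return (long)read(fd, buf, sizeof buf);
}

/* Close the write end first, then wait on the read end.
 * pselect should report the fd readable and read() should see EOF. */
static long demo_eof_wakes_pselect(void)
{
    int fds[2];
    long n;

    if (pipe(fds) != 0)
        return -2;
    close(fds[1]);               /* "the other side has closed" */
    n = wait_then_read(fds[0]);
    close(fds[0]);
    return n;
}
```

Calling demo_eof_wakes_pselect() should return well before the 5 second timeout, because end-of-file counts as "readable" for select-style calls.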

This is also the reason that the race you found in the previous mail is
not a serious issue. If the child exited before we even managed to block
SIGCHLD, in most cases pselect will not sleep because the TTY would
close. 

> While we're at that, there's a comment saying that handleoutput will
> return a negative value if the slave end of the PTY is closed.
> However, looking at handleoutput I couldn't find a case where the
> return value would be negative. Am I missing something?

No. It's a bug. Worse, if read returns -1 we write to buffer at location
-1. It should definitely be fixed.
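
One possible shape for such a fix, as a sketch only: check the result of read before using it as an index, and report errors to the caller with a negative value. read_chunk is a hypothetical helper written for this illustration; it is not the actual handleoutput, and the real fix may look different.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to size-1 bytes and NUL-terminate.
 * Returns the number of bytes read, 0 on end-of-file, and -1 on
 * error (in which case buffer holds an empty string). */
static int read_chunk(int fd, char *buffer, size_t size)
{
    ssize_t numread;

    if (size == 0)
        return -1;

    numread = read(fd, buffer, size - 1);
    if (numread < 0) {
        /* Never index the buffer with a negative value. */
        buffer[0] = '\0';
        return -1;
    }

    buffer[numread] = '\0';
    return (int)numread;
}
```

With this shape the caller can tell "no data because the other side closed" (0) apart from "read failed" (-1), which is what the comment in the main loop appears to expect.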

> And since it can't hurt to ask: I noticed there's a check for
> 'terminate' at the beginning of the do-while loop. If terminate is not
> zero, we'll do wait_id=waitpid( childpid, &status, 0 );. Is this just
> a way to make sure we wait on the child process in case SSH (or
> whatever program we end up running) errored out?

If we're asked to terminate (i.e. - we receive a SIGTERM) we pass it on
to ssh. We then wait for it to exit.

Since the code currently passes the signal on to ssh, maybe it makes
more sense to just continue what we're doing while we wait.

> Also, what if pselect
> returns -1 for some reason other than us getting a signal (e.g.
> ENOMEM)? Wouldn't we be stuck in an infinite loop since the WNOHANG
> waitpid would always set wait_id to zero?

Only if the pselect error repeats. These errors are, typically
(sometimes?), transient. 

This is a corner case where there is no good handling to be had. If a
system call fails due to reasons that are outside of our program's
influence, it's never clear what to do. 
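
Purely as an illustration of one policy among several, a loop could treat EINTR as normal, tolerate a few consecutive failures of any other kind, and give up if they keep repeating. Everything here is an assumption made for the sketch: the names after_pselect and loop_action, and the arbitrary limit of 3, are not taken from sshpass.

```c
#include <errno.h>

enum loop_action { LOOP_CONTINUE, LOOP_GIVE_UP };

/* Arbitrary limit chosen for this sketch. */
#define MAX_CONSECUTIVE_FAILURES 3

/* Decide what to do after pselect returned rc with errno value err.
 * *failures counts consecutive non-EINTR errors across calls. */
static enum loop_action after_pselect(int rc, int err, int *failures)
{
    if (rc >= 0 || err == EINTR) {
        /* Success or a signal: nothing is wrong, reset the counter. */
        *failures = 0;
        return LOOP_CONTINUE;
    }

    /* Some other error (e.g. ENOMEM): retry a few times in case it
     * is transient, then stop instead of spinning forever. */
    if (++*failures >= MAX_CONSECUTIVE_FAILURES)
        return LOOP_GIVE_UP;
    return LOOP_CONTINUE;
}
```

This does not make the failure any easier to recover from; it only bounds how long the loop spins when the error turns out not to be transient.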

So, all in all, good questions :-) 

Shachar