#29 Seg.fault on exit when -open'ing a socket more than once

closed-fixed
None
5
2005-07-20
2005-06-17
Hemang Lavana
No

Hi,

I am attaching a simple script which causes a
segmentation fault on Solaris but not on Linux. The
script has two parts: server and client. The
server part invokes an interactive application (the
/bin/sh shell here) and starts a listening socket
server on the specified port. The client part opens a
socket connection to the server and, once the connection
is established, uses expect's interact command to
let the client interact with the invoked application. Note
that two separate expect sessions (read and write) are
spawned for the socket channel on both the server and
the client side using the "exp_spawn -open $sockid" command. It
runs into a segmentation fault when the client side
terminates the shell by invoking the "exit" command.

The server and client parts need to be invoked from two
separate terminals. First invoke the server part; it
will wait for the client to connect. Then
invoke the client from a different terminal; it
will establish a socket connection to the server and
return the shell prompt. At this point you can run any
unix commands. When you invoke the exit command, both
the server and the client run into a segmentation fault.

# Start server
godel:104> uname -a
SunOS godel 5.8 Generic_108528-17 sun4u sparc SUNW,Ultra-4
godel:105> exp_shell server localhost 23000
Tcl version: 8.4.9
Exp version: 5.43.0
exp_spawn /bin/sh
SUCCESS: exp_shell server started localhost 23000
$ cleanupNexit invoked
closing spawn_id=exp7
closing spawn_id=exp8
Segmentation fault
godel:106>

# Start client
godel:96> exp_shell client localhost 23000
Tcl version: 8.4.9
Exp version: 5.43.0
SUCCESS: exp_shell client established connection to the
server ...
$ echo hello world
hello world
$ exit
closing spawn_id=exp5
exp_close sid=exp5 error=close: spawn id exp5 not open
closing spawn_id=exp6
Segmentation fault
godel:97>

This problem does not occur on a Linux box. Any idea what
is happening here?

Thanks,
Hemang.
--------------Analysis by Andreas--------
I can confirm that there is a crash. On both sides.

Note:
In my 1st try I got a SegFault from the client, and
BusError from the server.
In my 2nd try both sides crashed with BusError.

This might be because I compiled Tcl/tk/Expect/Tclx
with --enable-symbols=all.

Client crash
Is in Tcl_GetChannelName, in 'exp_close', in
'Exp_CloseObjCmd'.

The channel structure given to the function has bad
pointers
in two places, for statePtr, and inQueueTail

(gdb) p* (Channel*) chan
$3 = {state = 0x49, instanceData = 0x61616161,
typePtr = 0x86998,
downChanPtr = 0x80358,
upChanPtr = 0x0, inQueueHead = 0xef71e690,
inQueueTail = 0x18}

The crash happens when 'state' is dereferenced.

1742 return statePtr->channelName;

As I suspected, this looks like a memory smash.

Server crash
In the same location. Again the Channel structure is
bogus.

Here, however, it actually looks completely cleaned up, i.e.
expect is accessing the structure after it has already been
released. Everything contains the guard pattern 0x61616161
used to overwrite released memory areas.
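
To illustrate why a released structure shows 0x61616161 in every pointer field, here is a minimal sketch of a debug-style release routine that fills a block with the byte 0x61 before freeing it. The names poison_and_free and poisoned_word are hypothetical and this is not Tcl's actual allocator code; it only assumes the guard byte matches the pattern reported above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed guard byte (ASCII 'a'), chosen to match the 0x61616161 pattern in the dump. */
#define GUARD_BYTE 0x61

/* Hypothetical debug helper: overwrite the block with the guard byte, then release it.
 * Any later read through a stale pointer would see 0x61 in every byte. */
static void poison_and_free(void *ptr, size_t size)
{
    assert(size > 0);
    if (ptr == NULL) {
        return;
    }
    memset(ptr, GUARD_BYTE, size);
    free(ptr);
}

/* Shows what a 32-bit field looks like after poisoning.
 * Works on a local variable, so no freed memory is ever read. */
static uint32_t poisoned_word(void)
{
    uint32_t field = 0;
    memset(&field, GUARD_BYTE, sizeof field);
    return field;
}
```

Because every byte is the same, the poisoned value is identical on big-endian (SPARC) and little-endian machines, which makes the pattern easy to recognise in a debugger.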

Hm. It doesn't seem to be a double-close ... Belay
that. It is a double close of the socket, I had to look
at channel_orig, not channel ...

Uh, oh. Do you have two expect channels using the same
socket ?

Client log:

exp_close (7ffa0 -> 80088)
** exp_close (7ffa0 -> 7aeb0) ORIG
closing spawn_id=exp5
exp_close sid=exp5 error=close: spawn id exp5 not open
closing spawn_id=exp6
exp_close (803c8 -> 7ff50)
** exp_close (803c8 -> 7aeb0) ORIG

The ** markers show that two expect channels have the
same channel_orig, causing the second close of that
channel to dereference through an already freed pointer.

Server log ... Not needed. It is the same situation.
One Tcl channel closed by two expect channels during exit.

... Yes. See lines 92-98 and 218-223 of exp_shell.txt.

Ok, looking back at your first mail about this I see
that you wrote

Note that two separate expect sessions (read and write)
are spawned for the socket channel on both the server
and the client side using "exp_spawn -open $sockid" cmd.

On first reading this did not raise any flags, and I
believe I misunderstood it as meaning that
you either had two expect channels with 2 sockets, or 2
sockets for one expect channel.

In hindsight it becomes clear that you meant that you
had two expect channels connected to one socket channel.

And this is also the problem. Both expect channels close
their subordinate socket, and the second time this
happens a released structure is accessed, containing
bogus pointers, causing the crash.
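
The ownership problem can be modelled in a few lines of C. This is only a sketch: TclChan, ExpChan, naive_exp_close and simulate_exit are hypothetical stand-ins, not Expect's real structures, and the shared channel is kept on the stack and merely counts close requests so that the example never touches freed memory.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the underlying Tcl socket channel (hypothetical layout). */
typedef struct {
    const char *name;
    int close_calls;   /* how many times a close was requested */
} TclChan;

/* Stand-in for an expect channel created with "exp_spawn -open $sockid". */
typedef struct {
    const char *spawn_id;
    TclChan *channel_orig;   /* field name borrowed from the log above */
} ExpChan;

/* Naive close: every expect channel assumes it owns the underlying channel. */
static void naive_exp_close(ExpChan *e)
{
    assert(e != NULL && e->channel_orig != NULL);
    e->channel_orig->close_calls++;   /* the real code would close and free here */
}

/* Two expect channels share one socket, as in the client log (exp5 and exp6). */
static int simulate_exit(void)
{
    TclChan sock = { "sock", 0 };
    ExpChan reader = { "exp5", &sock };
    ExpChan writer = { "exp6", &sock };

    naive_exp_close(&reader);
    naive_exp_close(&writer);
    return sock.close_calls;   /* 2: the second close is the one that crashes */
}
```

In the real program the first close releases the structure, so the second call dereferences whatever the allocator left behind.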

Linux escapes this not because it is not freeing things
twice, but because its memory allocator is likely
different, keeping the released structures in a state
where the second dereference still finds valid pointers
and such, despite the structure actually being freed.

--
Andreas Kupries

-----------------some more info-----------------
Andreas Kupries wrote:

>>>>Ideas for solutions ...
>>>>
>>>>- Are the two expect sessions needed ?
>>>> I.e. could the script still work using only one expect channel ?
>>>>
>>>>

I need to think it through, but I suspect it may not be
possible. The script uses the interact command to tie
stdin to socket write and socket read to stdout on the
client side, and it uses interact to tie socket read
to the stdin of the application (/bin/sh) and the stdout of the
application (/bin/sh) to socket write on the server
side. Hence I need two expect sessions for a single
socket channel on each side.

>>>>- It seems that the expect core has to track which Tcl channels are
>>>> connected to expect channels, not only on a per-expect-channel basis,
>>>> but in some (thread-)global structure, so that it can refcount the Tcl
>>>> channels, and close the channel only when the last of the users is
>>>> closing.
>>>>
>>>>

Should it close on the call from the last user, or should it
close on the first call and ensure that the rest of the calls are
ignored?

>>
>>Things I forgot to write up ....
>>
>>* This problem most likely exists in older versions of Expect as well.
>> You might wish to test that.
>>
>>

Yes, the problem occurs in older versions as well (at least in
tcl8.4.6/expect5.40).

>>* Please open a bug for this problem at http://expect.sourceforge.net/
>> as this is definitely a bug in expect.
>>
>>

Will do.

>>* Asking if the script can work with one expect channel was not meant
>> to mean that the problem does not have to be fixed in expect. Just
>> if there is a workaround you can live with while expect is being
>> worked on.
>>
>>

Sure, I understand. I can probably get away with not invoking close/wait
at all in my script, or have some additional code to keep track of expect
sessions associated with the same socket.

>>This problem is also something we might wish to make Don Libes aware of.

Discussion

  • Hemang Lavana
    2005-06-17

    exp_shell script to reproduce this problem

     
    Attachments
  • Patch to expect to fix the problem

     
    Attachments
  • Logged In: YES
    user_id=75003

    This problem can be fixed by adding a hashtable to the
    thread-local storage which tracks which channels have been
    opened via -open and/or -leaveopen. Tracking here means that the
    channels are reference-counted and only the last user is permitted
    to close the channel, if it wishes to do so.
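
    The reference-counting idea can be sketched as follows. This is not the committed patch: the names chan_acquire and chan_release are hypothetical, a small fixed array stands in for the hashtable, and a file-level static stands in for thread-local storage.

```c
#include <assert.h>
#include <string.h>

#define MAX_TRACKED 16   /* illustrative limit, not taken from the patch */
#define NAME_LEN    32

/* One entry per underlying channel currently wrapped by expect channels. */
typedef struct {
    char name[NAME_LEN];
    int refcount;        /* 0 means the slot is free */
} Tracked;

static Tracked table[MAX_TRACKED];

static Tracked *find_slot(const char *name)
{
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (table[i].refcount > 0 && strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Called for each "-open" of a channel. Returns the new count, or -1 if the table is full. */
static int chan_acquire(const char *name)
{
    assert(name != NULL);
    Tracked *t = find_slot(name);
    if (t != NULL) {
        return ++t->refcount;
    }
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (table[i].refcount == 0) {
            strncpy(table[i].name, name, NAME_LEN - 1);
            table[i].name[NAME_LEN - 1] = '\0';
            table[i].refcount = 1;
            return 1;
        }
    }
    return -1;
}

/* Called when an expect channel closes. Returns 1 only for the last user,
 * who may then close the underlying channel; all other callers get 0. */
static int chan_release(const char *name)
{
    assert(name != NULL);
    Tracked *t = find_slot(name);
    if (t == NULL) {
        return 0;   /* unknown or already fully released: never close again */
    }
    t->refcount--;
    return t->refcount == 0;
}
```

    With two expect channels on one socket, the first chan_release returns 0 and the second returns 1, so the socket is closed exactly once. A release on a name that is no longer tracked returns 0 instead of triggering another close.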

     
    • assigned_to: nobody --> andreas_kupries
    • summary: segmentation fault on solaris --> Seg.fault on exit when -open'ing a socket more than once
     
  • Logged In: YES
    user_id=75003

    A fix was committed to the CVS Head.

     
    • status: open --> closed-fixed