Standards Based Linux Instrumentation / Bugs / #1310 "--- Warning: fd is closed" message from binary intf app

Manish Tomar - 2008-08-13

Logged In: YES
user_id=2179806
Originator: YES

Please note that this does not appear on x86 with glibc. It seems to appear only with uClibc on a big-endian processor.

Chris Buccella - 2008-08-20

assigned_to: buccella --> smswehla

Chris Buccella - 2008-08-21

priority: 5 --> 7

Sean Swehla - 2008-08-21

Logged In: YES
user_id=1939165
Originator: NO

Do you get any messages in your log file? There's only one abort() call in spRcvMsg, and it should be logging an error before the call is made.

Chris Buccella - 2008-08-22

client test program (sfcc)

v2test_ei-ami.c

Chris Buccella - 2008-08-22

Logged In: YES
user_id=1550470
Originator: NO

1) I am trying to recreate this problem. I have attached the test program that I am using to do so. I have run this program in a BASH for loop for 200 iterations, and have not seen the error you describe. Please check the attached source code and verify that it is similar to what you are using.

2) As Sean suggested, log output would be very useful. Please consult your syslog and attach a copy of the messages from sfcbd.

3) Trace output would also be useful. Please run "sfcbd -t 65536 2> sfcb-output" and recreate your problem scenario. Then attach the sfcb-output file to this tracker item along with the output for #2.
File Added: v2test_ei-ami.c

Chris Buccella - 2008-08-22

status: open --> open-accepted

Venkatesh Ramamurthy - 2008-08-22

Logged In: YES
user_id=2180106
Originator: NO

Chris,
I believe the problem will not occur if the test utility is run 200 times from a BASH loop, because process termination cleans everything up each time. The connect()/enumerate()/release() sequence needs to be looped several times from within the test application itself.

Chris Buccella - 2008-08-23

Logged In: YES
user_id=1550470
Originator: NO

I have changed my test program so that the for loop executes inside of it as you suggested. I still do not see the problem occur. My virtual MIPS box is using glibc... perhaps the issue is with uclibc?

Could you please attach the information I requested in my previous post? This would help us.

Manish Tomar - 2008-08-23

Logged In: YES
user_id=2179806
Originator: YES

Chris,
It might be difficult to provide the trace output, as this is occurring on an embedded system where we do not have much space, and the error occurs after the space is completely filled up. I'll attach the syslog output once the error occurs.

Manish Tomar - 2008-08-25

Logged In: YES
user_id=2179806
Originator: YES

The following is written to syslog when this occurs:

Aug 25 19:35:05 (none) lighttpd[9697]: --- Warning: fd is closed: Resource temporarily unavailable
Aug 25 19:35:05 (none) lighttpd[9697]: ### 0 ??? 0-0

Manish Tomar - 2008-08-26

Logged In: YES
user_id=2179806
Originator: YES

Sometimes the second line of output is different:

Nov 22 09:21:07 (none) lighttpd[8345]: --- Warning: fd is closed: Resource temporarily unavailable
Nov 22 09:21:07 (none) lighttpd[8345]: spGetMsg receiving from 14 0-14 Bad address

Sean Swehla - 2008-08-26

Logged In: YES
user_id=1939165
Originator: NO

When you say "space is completely filled up", do you mean space on the file system? Does the error occur only after space has filled completely? Does it ever occur when there is still space available?

Manish Tomar - 2008-08-26

Logged In: YES
user_id=2179806
Originator: YES

Mostly, yes. The error normally occurs after the space in /tmp is filled up by the large volume of output from sfcb. Other areas are read-only and should not be accessed frequently. Recently the error has been quite frequent, and I'll try to capture it with the trace. Do you want the trace on the client app as well (i.e. the app linking to libcimcClientSfcbLocal.so), or just on the sfcb server?

Chris Buccella - 2008-08-27

Logged In: YES
user_id=1550470
Originator: NO

manishtomar, I think what Sean was trying to determine is if the filling up of the filesystem is the cause of the error. Can you confirm this? Does the error occur when there is plenty of disk space available?

Manish Tomar - 2008-08-28

Logged In: YES
user_id=2179806
Originator: YES

What I meant is that if sfcbd is started with tracing, the trace log is not accurate, because it eats up all the space in the filesystem. We normally run it without tracing, with enough space in the filesystem. The error still occurs.

Chris Buccella - 2008-08-30

Logged In: YES
user_id=1550470
Originator: NO

-Aug. 29 Braindump-

This error starts in msgqueue.c:spGetMsg(), where the recvmsg() syscall returns 0. According to the glibc manpage for this function, this indicates that the peer closed the socket. The strange part is that errno is set to EINTR, which indicates that the system call was interrupted. If this is what happens, we should just retry; sfcb takes care of this, but the condition (immediately below recvmsg() in the if block) is never reached, since it expects EINTR only if recvmsg() returns an error (<0). uClibc's implementation of recvmsg() may be setting the return code incorrectly. Or perhaps recvmsg() isn't setting errno at all, and it is being set by some other function. I need to investigate what errno should be for all return cases.
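
The return-value handling described above can be sketched roughly as follows. This is only an illustration and not the sfcb code: recv_retry() and demo_peer_close() are hypothetical names, and the demo uses a local socketpair() rather than sfcb's own sockets. The point it tries to show is that errno is only consulted when recvmsg() returns a negative value, so a stale EINTR left over from an earlier call cannot be mistaken for an interruption when the return value is 0.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not sfcb's spGetMsg()): receive one message,
 * retrying only when recvmsg() itself fails with EINTR.
 * Returns >0 for bytes received, 0 if the peer closed the socket,
 * and -1 on any other error. */
static ssize_t recv_retry(int fd, void *buf, size_t len)
{
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;

    for (;;) {
        iov.iov_base = buf;
        iov.iov_len = len;
        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        n = recvmsg(fd, &msg, 0);
        if (n < 0 && errno == EINTR)
            continue;   /* interrupted before any data arrived: retry */
        return n;       /* for n >= 0, errno is not meaningful */
    }
}

/* Demonstration on a local socket pair (an assumption made for the
 * example). Returns 0 if the behaviour matches the description. */
static int demo_peer_close(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Normal case: data is available, recvmsg() returns its length. */
    if (write(sv[1], "ping", 4) != 4) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    n = recv_retry(sv[0], buf, sizeof(buf));
    if (n != 4 || memcmp(buf, "ping", 4) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Peer goes away; simulate a stale errno from some earlier call. */
    close(sv[1]);
    errno = EINTR;
    n = recv_retry(sv[0], buf, sizeof(buf));
    close(sv[0]);

    return n == 0 ? 0 : -1;   /* 0 bytes means the peer closed */
}
```

Calling demo_peer_close() from a small main() should return 0 on a system where recvmsg() behaves as the manpage describes. Whether uClibc on the affected target behaves the same way is exactly what remains to be checked.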

Another possibility is that sfcb closed the socket on itself... perhaps some other thread running closed the wrong socket. Tracing on the msgqueue component should tell us if this happens; I've been tracing for the past 2 hours and haven't hit the error yet.
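
As a debugging aid for the "closed the wrong socket" theory, a descriptor can be probed with fcntl() before it is used. This is a minimal sketch under the assumption that such a check would be added temporarily around the receive path; fd_is_open() is a hypothetical helper and does not exist in sfcb.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd refers to an open descriptor,
 * 0 if it is closed (fcntl() fails with EBADF), -1 on any other error. */
static int fd_is_open(int fd)
{
    if (fcntl(fd, F_GETFD) != -1)
        return 1;
    return errno == EBADF ? 0 : -1;
}
```

In a multi-threaded process the descriptor number can be reused by another open() or socket() call right after it is closed, so a result of 1 does not prove that the descriptor is still the same socket. The check is only useful for catching the case where the descriptor is gone entirely.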

The crash is the result of an abort() call in a case statement at the end of spRcvMsg(). If EINTR was detected correctly previously, this case statement is not reached; instead sfcb retries until a successful return from recvmsg().

Nobody/Anonymous - 2008-10-02

2049872-fd_is_closed.patch

Chris Buccella - 2008-10-02

2049872-fd_is_closed.patch

Chris Buccella - 2008-10-02

committed to HEAD.

This is LTC bug #47412.
File Added: 2049872-fd_is_closed.patch

Chris Buccella - 2008-10-02

status: open-accepted --> pending-fixed

SourceForge Robot - 2008-12-02

status: pending-fixed --> closed-fixed

SourceForge Robot - 2008-12-02

This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 60 days (the time period specified by
the administrator of this Tracker).
