
#1828 openhpid refuses new connections after some time

Future
open
nobody
IPMI Direct plugin
2
2014-10-08
2014-03-04
Alex Jones
No

I am experiencing a problem where openhpid becomes unresponsive to all client connections and returns SA_ERR_HPI_NO_RESPONSE for all API connections. This happens in both 3.2.1 and 3.4.0, using the IPMI Direct plugin.

Here is what happens:

We have a rogue process that connects to openhpid via the C API, crashes, and starts up again 1 second later. It then crashes again, starts up again, and so on.

This is causing openhpid to not release socket descriptors.
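For context, a TCP socket sits in CLOSE_WAIT when the peer has closed its end but the local process has not yet closed its own descriptor. A minimal Python sketch of the end-of-stream handling involved, using a local socket pair instead of openhpid itself. The name `serve_until_peer_closes` is hypothetical, and this is not how openhpid's service thread is written:

```python
import socket


def serve_until_peer_closes(conn):
    """Read from conn until the peer closes, then release the descriptor.

    Returns the total number of bytes received.
    """
    received = 0
    try:
        while True:
            data = conn.recv(4096)
            if not data:
                # An empty read means the peer closed its end (or its process died).
                break
            received += len(data)
    finally:
        # Without this close(), the descriptor stays allocated to the server.
        conn.close()
    return received


if __name__ == "__main__":
    server_end, client_end = socket.socketpair()
    client_end.sendall(b"hello")
    client_end.close()  # simulates the client process going away
    print(serve_until_peer_closes(server_end))
```

A server thread that never reaches the empty read, or never calls close() afterwards, keeps one descriptor per dead client.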

An "lsof -p" shows 1024 socket descriptors stuck in CLOSE_WAIT.  (1024 is the max file descriptor limit for this user on this machine.)
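To watch how close the daemon is to that limit, the `lsof -p` output can be tallied by TCP state. A minimal sketch in Python, assuming lsof's usual layout in which the state appears in parentheses at the end of each TCP line. The sample lines, PID, port and addresses below are made up for illustration and are not taken from the affected machine:

```python
import re
from collections import Counter

# Matches a trailing TCP state such as "(CLOSE_WAIT)" at the end of an lsof line.
STATE_RE = re.compile(r"\(([A-Z_0-9]+)\)\s*$")


def count_tcp_states(lsof_output):
    """Count sockets per TCP state in the text printed by `lsof -p <pid>`."""
    states = Counter()
    for line in lsof_output.splitlines():
        match = STATE_RE.search(line)
        if match:
            states[match.group(1)] += 1
    return states


# Hypothetical sample shaped like lsof output (not real data).
SAMPLE = """\
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
openhpid 4242 root    3u  IPv4  11111      0t0  TCP *:4743 (LISTEN)
openhpid 4242 root    4u  IPv4  11112      0t0  TCP localhost:4743->localhost:50001 (CLOSE_WAIT)
openhpid 4242 root    5u  IPv4  11113      0t0  TCP localhost:4743->localhost:50002 (CLOSE_WAIT)
"""

if __name__ == "__main__":
    print(count_tcp_states(SAMPLE))
```

Running this periodically against the real `lsof -p` output would show whether the CLOSE_WAIT count grows by one for each crash of the client.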

When we got into this situation, I sent openhpid an ABRT signal. There were 1024 threads, almost all of which were blocked on:

(gdb) bt
#0  0x00007f13236b41eb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f1323dc74c5 in ?? () from /usr/lib64/libgthread-2.0.so.0
#2  0x00007f13238e0ebf in ?? () from /usr/lib64/libglib-2.0.so.0
#3  0x00007f13238e1711 in g_async_queue_timed_pop () from /usr/lib64/libglib-2.0.so.0
#4  0x0000000000424c0f in oh_dequeue_session_event ()
#5  0x00000000004197a4 in saHpiEventGet ()
#6  0x000000000040b820 in service_thread(void*, void*) ()
#7  0x00007f13239342d8 in ?? () from /usr/lib64/libglib-2.0.so.0
#8  0x00007f1323931db6 in ?? () from /usr/lib64/libglib-2.0.so.0
#9  0x00007f13236aff05 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f1322d8210d in clone () from /lib64/libc.so.6
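The backtrace suggests that each service thread is parked in a timed event wait on behalf of a client that may no longer exist. One possible pattern for avoiding that is to wait in short slices and check the client socket between slices. This is a sketch in Python with hypothetical names (`peer_has_closed`, `wait_for_event`, `slice_seconds`), not OpenHPI code, and whether it matches the actual cause has not been confirmed:

```python
import queue
import select
import socket
import time


def peer_has_closed(conn):
    """Return True if the peer has closed the connection (non-blocking check)."""
    readable, _, _ = select.select([conn], [], [], 0)
    if not readable:
        return False
    try:
        # Peek so that any pending request data is left in the buffer.
        return conn.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True


def wait_for_event(events, conn, timeout, slice_seconds=0.1):
    """Wait up to `timeout` seconds for an event, giving up early if the peer is gone.

    Returns the event, or None on timeout or peer disconnect.
    """
    deadline = time.monotonic() + timeout
    while True:
        if peer_has_closed(conn):
            return None
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        try:
            return events.get(timeout=min(slice_seconds, remaining))
        except queue.Empty:
            continue
```

With this shape, a thread serving a crashed client returns within one slice instead of holding its socket for the full event timeout.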

Attached to this bug is /var/log/messages with "openhpid -v". The problem starts to happen at 18:43:08.

1 Attachments

Discussion

  • dr_mohan
    2014-10-08

    • labels: --> IPMI Direct plugin
    • Subsystem: --> IPMI Direct plugin
    • 3.5.0: 3.5.0 --> Future