#1617 segmentation fault from oh_add_rdr..

Status: closed-works-for-me
Owner: nobody
Labels: None
Priority: 5
Updated: 2011-05-09
Created: 2011-04-06
Private: No

Hi,
I observed the segmentation fault below; the event logs captured when it happened are also pasted underneath. (I was running an OpenHPI client and daemon from a separate Linux machine; the client does discovery in a loop at a fixed interval and checks the components' power state and event logs.)

Starting program: /usr/sbin/openhpid -n -c /etc/openhpi/orig/openhpi.conf
[Thread debugging using libthread_db enabled]
[New Thread 0x2b253cda4fa0 (LWP 7334)]
[New Thread 0x41ef4940 (LWP 7335)]
[New Thread 0x428f5940 (LWP 7336)]
[New Thread 0x432f6940 (LWP 7337)]
[New Thread 0x410fd940 (LWP 7338)]
[New Thread 0x43cf7940 (LWP 7339)]
[New Thread 0x446f8940 (LWP 7362)]
[Thread 0x446f8940 (LWP 7362) exited]
[New Thread 0x446f8940 (LWP 7385)]
[New Thread 0x450f9940 (LWP 7445)]
[New Thread 0x45afa940 (LWP 7447)]
[Thread 0x446f8940 (LWP 7385) exited]
[New Thread 0x446f8940 (LWP 7476)]
[Thread 0x446f8940 (LWP 7476) exited]
[New Thread 0x446f8940 (LWP 7492)]
[Thread 0x446f8940 (LWP 7492) exited]
[New Thread 0x446f8940 (LWP 7508)]
[Thread 0x446f8940 (LWP 7508) exited]
[Thread 0x450f9940 (LWP 7445) exited]
[Thread 0x45afa940 (LWP 7447) exited]
[New Thread 0x45afa940 (LWP 7536)]
[New Thread 0x450f9940 (LWP 7538)]
[Thread 0x450f9940 (LWP 7538) exited]
[Thread 0x45afa940 (LWP 7536) exited]
[New Thread 0x45afa940 (LWP 7646)]
[New Thread 0x450f9940 (LWP 7662)]
[Thread 0x450f9940 (LWP 7662) exited]
[New Thread 0x450f9940 (LWP 7723)]
[New Thread 0x446f8940 (LWP 7725)]
[New Thread 0x464fb940 (LWP 7741)]
[Thread 0x464fb940 (LWP 7741) exited]
[Thread 0x45afa940 (LWP 7646) exited]
[New Thread 0x45afa940 (LWP 7792)]
[New Thread 0x464fb940 (LWP 7823)]
[New Thread 0x46efc940 (LWP 7880)]
[Thread 0x46efc940 (LWP 7880) exited]
[Thread 0x45afa940 (LWP 7792) exited]
[Thread 0x450f9940 (LWP 7723) exited]
[Thread 0x446f8940 (LWP 7725) exited]
[New Thread 0x446f8940 (LWP 8220)]
[New Thread 0x450f9940 (LWP 8222)]
[New Thread 0x45afa940 (LWP 8238)]
[Thread 0x45afa940 (LWP 8238) exited]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x432f6940 (LWP 7337)]
0x000000347da70454 in malloc_consolidate () from /lib64/libc.so.6
(gdb) bt
#0 0x000000347da70454 in malloc_consolidate () from /lib64/libc.so.6
#1 0x000000347da72a1a in _int_malloc () from /lib64/libc.so.6
#2 0x000000347da7486d in calloc () from /lib64/libc.so.6
#3 0x000000347fa33b82 in g_malloc0 () from /lib64/libglib-2.0.so.0
#4 0x00002b253bd62a8a in oh_add_rdr (table=0x95b4010, rid=238,
rdr=0x2aaaac0bc5b0, data=0x0, owndata=0) at rpt_utils.c:728
#5 0x000000000044f3b5 in process_hs_event (d=0x95b3ee0, e=0x2aaaac0c3010)
at event.c:371
#6 0x000000000044f5d1 in process_event (did=0, e=0x2aaaac0c3010)
at event.c:444
#7 0x000000000044f691 in oh_process_events () at event.c:495
#8 0x000000000044f885 in oh_evtpop_thread_loop (data=0x0) at threaded.c:104
#9 0x000000347fa48e04 in ?? () from /lib64/libglib-2.0.so.0
#10 0x000000347e6064a7 in start_thread () from /lib64/libpthread.so.0
#11 0x000000347dad3c2d in clone () from /lib64/libc.so.6
(gdb)

Event logs observed when the segmentation fault occurred:

callbackForBlade- {SYSTEM_CHASSIS,1}{SYSTEM_BLADE,9}
, element Failed Status = 0, element Powered State = 0, at time = 2011-04-05-18:36:22.
Event Type: HOTSWAP
From Resource: {SYSTEM_CHASSIS,1}{SYSTEM_BLADE,9}
Event Resource ID: 233
Event Timestamp: 2011-04-05 18:36:13
Event Severity: CRITICAL
HotswapEvent:
HotSwapState: EXTRACTION_PENDING
PreviousHotSwapState: ACTIVE
CauseOfStateChange: CAUSE_UNEXPECTED_DEACTIVATION

Event Type: HOTSWAP
From Resource: {SYSTEM_CHASSIS,1}{SYSTEM_BLADE,9}
Event Resource ID: 233
Event Timestamp: 2011-04-05 18:36:13
Event Severity: CRITICAL
HotswapEvent:
HotSwapState: INACTIVE
PreviousHotSwapState: EXTRACTION_PENDING
CauseOfStateChange: CAUSE_AUTO_POLICY

Event Type: HOTSWAP
From Resource: {SYSTEM_CHASSIS,1}{SYSTEM_BLADE,1}
Event Resource ID: 225
Event Timestamp: 2011-04-05 19:48:37
Event Severity: OK
HotswapEvent:
HotSwapState: NOT_PRESENT
PreviousHotSwapState: ACTIVE
CauseOfStateChange: CAUSE_SURPRISE_EXTRACTION

Event Type: HOTSWAP
From Resource: {SYSTEM_CHASSIS,1}{SYSTEM_BLADE,1}
Event Resource ID: 225
Event Timestamp: 2011-04-05 19:48:44
Event Severity: OK
HotswapEvent:
HotSwapState: INSERTION_PENDING
PreviousHotSwapState: NOT_PRESENT
CauseOfStateChange: CAUSE_OPERATOR_INIT

Event Type: HOTSWAP
From Resource: {SYSTEM_CHASSIS,1}{SYSTEM_BLADE,1}
Event Resource ID: 225
Event Timestamp: 2011-04-05 19:48:44
Event Severity: OK
HotswapEvent:
HotSwapState: ACTIVE
PreviousHotSwapState: INSERTION_PENDING
CauseOfStateChange: CAUSE_AUTO_POLICY

Regards,
Preeti

Discussion

  • Anton Pak

    Anton Pak - 2011-04-06

    2.15/2.16/trunk?

    I suspect the plug-in didn't correctly set the rdrs or rdrs_to_remove lists in the oh_event structure.
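    This suspicion can be illustrated with a simplified stand-in (the real struct oh_event in OpenHPI uses GSList * fields; the list types and walker here are invented for the sketch, only the field names rdrs and rdrs_to_remove come from the comment above). If a plug-in allocates the event without zeroing it and never sets the lists, the event thread ends up walking an uninitialized pointer:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    /* Simplified stand-ins; the real OpenHPI oh_event uses GSList *. */
    struct sim_rdr   { int record_id; struct sim_rdr *next; };
    struct sim_event {
        int resource_id;
        struct sim_rdr *rdrs;            /* must start out NULL */
        struct sim_rdr *rdrs_to_remove;  /* must start out NULL */
    };

    /* Consumer, analogous to the event thread: walks e->rdrs.
     * If the plug-in left the list uninitialized, this chases garbage. */
    static int count_rdrs(const struct sim_event *e) {
        int n = 0;
        for (const struct sim_rdr *r = e->rdrs; r; r = r->next) n++;
        return n;
    }

    int main(void) {
        /* Correct pattern: zero the event so both lists begin NULL
         * (calloc here has the same effect as GLib's g_malloc0). */
        struct sim_event *e = calloc(1, sizeof *e);
        assert(e->rdrs == NULL && e->rdrs_to_remove == NULL);
        assert(count_rdrs(e) == 0);
        /* Buggy pattern would be plain malloc with the list fields
         * never assigned: rdrs then holds an indeterminate pointer. */
        free(e);
        return 0;
    }
    ```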

     
  • preeti sharma

    preeti sharma - 2011-04-06

    The fault happened at Apr 5 19:50:04. Bad host name lookups are not an issue on my system, as they come from IO_BLADES, which are defined as partner devices of SYSTEM_BLADES; IO_BLADES do not have any IP address assigned.

     
  • Shyamala

    Shyamala - 2011-04-14

    Hi Preeti,

    We are not able to reproduce the issue with our setup (configuration details: c7000 enclosure with BL465c G6, BL465c G5, BL685c G6, BL685c G7 & BL490c G6 servers, running OpenHPI 2.17.1). It looks like the issue is specific to the configuration. Please share the configuration details of the setup in which the issue is seen (resources present in the enclosure, OpenHPI version, etc.).

     
  • preeti sharma

    preeti sharma - 2011-04-14

    Generated using the OpenHPI client hpigensimdata.

     
  • preeti sharma

    preeti sharma - 2011-04-14

    Hi Shyamala,
    Configuration details are:

    OpenHPI version: 2.15.1
    Chassis detail: C7000
    Plugin: OA_SOAP
    Chassis configuration pasted below (also attaching simulationC7000.data):
    0 RPT: id = 1 ResourceId = 1 Tag = BladeSystem c7000 Enclosure G2
    1 RPT: id = 225 ResourceId = 225 Tag = ProLiant BL460c G7
    2 RPT: id = 226 ResourceId = 226 Tag = BLc-Class PCI Expansion Blade
    3 RPT: id = 227 ResourceId = 227 Tag = ProLiant BL460c G7
    4 RPT: id = 228 ResourceId = 228 Tag = BLc-Class PCI Expansion Blade
    5 RPT: id = 229 ResourceId = 229 Tag = ProLiant BL460c G7
    6 RPT: id = 230 ResourceId = 230 Tag = BLc-Class PCI Expansion Blade
    7 RPT: id = 231 ResourceId = 231 Tag = ProLiant BL460c G7
    8 RPT: id = 232 ResourceId = 232 Tag = ProLiant BL460c G7
    9 RPT: id = 233 ResourceId = 233 Tag = ProLiant BL460c G7
    10 RPT: id = 234 ResourceId = 234 Tag = BLc-Class PCI Expansion Blade
    11 RPT: id = 235 ResourceId = 235 Tag = ProLiant BL460c G7
    12 RPT: id = 236 ResourceId = 236 Tag = BLc-Class PCI Expansion Blade
    13 RPT: id = 237 ResourceId = 237 Tag = ProLiant BL460c G7
    14 RPT: id = 238 ResourceId = 238 Tag = ProLiant BL460c G7
    15 RPT: id = 239 ResourceId = 239 Tag = ProLiant BL460c G7
    16 RPT: id = 240 ResourceId = 240 Tag = ProLiant BL460c G7
    17 RPT: id = 241 ResourceId = 241 Tag = HP VC Flex-10 Enet Module
    18 RPT: id = 5 ResourceId = 5 Tag = Thermal Subsystem
    19 RPT: id = 6 ResourceId = 6 Tag = Fan Zone
    20 RPT: id = 242 ResourceId = 242 Tag = Fan Zone
    21 RPT: id = 243 ResourceId = 243 Tag = Fan Zone
    22 RPT: id = 244 ResourceId = 244 Tag = Fan Zone
    23 RPT: id = 245 ResourceId = 245 Tag = Fan
    24 RPT: id = 246 ResourceId = 246 Tag = Fan
    25 RPT: id = 247 ResourceId = 247 Tag = Fan
    26 RPT: id = 248 ResourceId = 248 Tag = Fan
    27 RPT: id = 249 ResourceId = 249 Tag = Fan
    28 RPT: id = 250 ResourceId = 250 Tag = Fan
    29 RPT: id = 251 ResourceId = 251 Tag = Fan
    30 RPT: id = 252 ResourceId = 252 Tag = Fan
    31 RPT: id = 253 ResourceId = 253 Tag = Fan
    32 RPT: id = 254 ResourceId = 254 Tag = Fan
    33 RPT: id = 255 ResourceId = 255 Tag = Power Subsystem
    34 RPT: id = 256 ResourceId = 256 Tag = Power Supply Unit
    35 RPT: id = 257 ResourceId = 257 Tag = Power Supply Unit
    36 RPT: id = 258 ResourceId = 258 Tag = Power Supply Unit
    37 RPT: id = 259 ResourceId = 259 Tag = Power Supply Unit
    38 RPT: id = 260 ResourceId = 260 Tag = Power Supply Unit
    39 RPT: id = 261 ResourceId = 261 Tag = Power Supply Unit
    40 RPT: id = 2 ResourceId = 2 Tag = Onboard Administrator
    41 RPT: id = 262 ResourceId = 262 Tag = Onboard Administrator
    42 RPT: id = 263 ResourceId = 263 Tag = LCD

    Regards,
    Preeti

     
  • Shyamala

    Shyamala - 2011-04-27

    We have tried the steps mentioned in the bug but are unable to reproduce the issue, and there is no further information.

     
  • Anton Pak

    Anton Pak - 2011-04-27

    Preeti,

    could you run openhpid under gdb and show the backtrace?

     
  • preeti sharma

    preeti sharma - 2011-04-27

    Hi Shyamala/Anton,
    I haven't seen the issue again, and I am not sure of the steps to reproduce it. This issue can be closed; if I see it again I will raise another bug.

    Regards,
    Preeti

     
  • dr_mohan

    dr_mohan - 2011-05-09
    • status: open --> closed-works-for-me
     
  • dr_mohan

    dr_mohan - 2011-05-09

    We tried to reproduce this bug many times but failed, so we are closing it. There may be more variables involved. Once this bug is reproducible and the configuration is known, please file a new bug.