
#108 Some associatorNames CIM operations hang

1.4.9
fixed
None
None
Function
2014-11-26
2014-09-04
Dave Heller
No

In some cases an associatorNames query can hang for no apparent reason, even though associators, references and referenceNames queries to the same provider work. This is due to an uninitialized variable in the CIM-XML parser.

The symptom is something like:

$ wbemcli ain http://localhost/root/cimv2:CMPI_TEST_Person.name="Harry"
*
* wbemcli: Cim: (1) CIM_ERR_FAILED: Req handler timed out waiting for provider response
*

$ ps -ef | grep sfcbd
8147 root  S  sfcbd -proc:Main
8148 root  S  sfcbd -proc:Logger
8150 root  S  sfcbd -proc:HttpDaemon -ip:[::]
8151 root  S  sfcbd -proc:ClassProvider -class:$ClassProvider$ -location:sfcClassProviderSf
8153 root  S  sfcbd -proc:InteropProvider -class:$InterOpProvider$ -location:sfcInteropProvider
8154 root  S  sfcbd -proc:ProfileProvider -class:$ProfileProvider$ -location:sfcProfileProvider
8158 root  S  sfcbd -proc:InternalProvider -class:$DefaultProvider$ -location:sfcInternalProvider
8162 root  S  sfcbd -proc:ServerProvider -class:CIM_ObjectManager -location:sfcInteropServerProvider
8241 root  S  sfcbd -proc:TestAssociationProvider -class:CMPI_TEST_Racing -location:TestAssociationProvider
9709 root  S  sfcbd -proc:HttpDaemon -ip:[::] -reqhandler: 1

The call trace of the hung provider looks like this:

(gdb) bt
#0  0x00000030fdc0e87d in read () from /lib64/libpthread.so.0
#1  0x00007fb176d9d99e in spRcvAck (from=137) at msgqueue.c:605
#2  0x00007fb176d85f62 in xferResultBuffer (nr=nr@entry=0x7fb1700011c0, to=137, more=more@entry=1, rc=0, rc@entry=1, length=length@entry=256) at result.c:162
#3  0x00007fb176d865b9 in nextResultBufferPos (nr=nr@entry=0x7fb1700011c0, type=type@entry=2, length=256) at result.c:269
#4  0x00007fb176d8663b in __rft_returnObjectPath (result=0x7fb1700011c0, cop=0x7fb170001fd0) at result.c:480
#5  0x00007fb175d01de0 in TestAssociationProviderAssociatorNames (mi=<optimized out>, ctx=0x7fb1700008c0, rslt=0x7fb1700011c0, ref=<optimized out>, _RefLeftClass=<optimized out>,_RefRightClass=<optimized out>, role=0x0, resultRole=0x0) at cmpiTestAssociationProvider.c:244
#6  0x00007fb176db0f6f in associatorNames (hdr=0x25c6c30, info=0x25c5a20, requestor=<optimized out>) at providerDrv.c:2598
#7  0x00007fb176db8007 in processProviderInvocationRequestsThread (prms=0x25c4b20) at providerDrv.c:3522
#8  0x00000030fdc07f33 in start_thread () from /lib64/libpthread.so.0
#9  0x00000030fd4f4ded in clone () from /lib64/libc.so.6

The anomaly is that the provider thread is following a code path it should only take when HTTP chunking is in effect, which should never be the case for an associatorNames query. That is, this thread should not be in xferResultBuffer(), called from nextResultBufferPos(), while executing an associatorNames query.

The explanation is: SFCB has a number of flags that control chunking behavior for both the provider and HTTP request handler processes. These two processes need to talk to each other when fulfilling a request. If there is a mismatch in the state of these flags, the two sides can get out of sync. In this case, the provider is waiting for a response from the request handler that will never come.
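
The mismatch described above can be illustrated with a small stand-alone sketch. This is not SFCB code: the function exchange(), its parameters, and the use of a socketpair are assumptions made for illustration only. One side (the "provider") waits for an ack only if it believes chunking is on, and the other side (the "request handler") sends an ack only if it believes chunking is on. A poll() timeout stands in for the blocking read() seen in spRcvAck, so the sketch reports the hang instead of hanging.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical model of the provider / request handler handshake.
 * Returns 1 if the exchange completes, 0 if the provider side would
 * be left waiting for an ack that never arrives, -1 on a system error.
 */
static int exchange(int provider_chunking, int handler_chunking, int timeout_ms)
{
    int sv[2];
    int result = 1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Request handler side: acks only when it thinks chunking is on. */
    if (handler_chunking) {
        char ack = 'A';
        if (write(sv[1], &ack, 1) != 1)
            result = -1;
    }

    /* Provider side: waits for the ack only when its own flag is set. */
    if (provider_chunking && result == 1) {
        struct pollfd pfd = { .fd = sv[0], .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);

        if (ready < 0) {
            result = -1;
        } else if (ready == 0) {
            result = 0;            /* a plain read() would block here */
        } else {
            char ack;
            if (read(sv[0], &ack, 1) != 1)
                result = -1;
        }
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

With matching flags, exchange(1, 1, 100) and exchange(0, 0, 100) both complete. With the provider flag set and the handler flag clear, exchange(1, 0, 100) times out, which corresponds to the hang in the backtrace.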

The flag that controls the provider side gets attached to an existing data field read in by the parser. This data field may be uninitialized. Because the data is passed in from the parser buried under a void*, it is hard for the compiler to catch this condition.
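
A minimal sketch of this kind of bug, under stated assumptions: ParsedRequest, FLAG_CHUNKED, and all function names below are hypothetical and are not the actual SFCB types or parser code. Reading a truly uninitialized field is undefined behavior in C, so the sketch fills the struct with 0xFF bytes as a stand-in for leftover memory contents.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a parser result; not an SFCB type. */
typedef struct {
    const char  *class_name;
    unsigned int flags;        /* the provider side reads this later */
} ParsedRequest;

#define FLAG_CHUNKED 0x1u      /* hypothetical flag bit */

/* Parser path that sets the name but never touches flags. */
static void parse_without_init(ParsedRequest *req, const char *name)
{
    req->class_name = name;
}

/* Parser path with the fix: flags gets a defined starting value. */
static void parse_with_init(ParsedRequest *req, const char *name)
{
    req->class_name = name;
    req->flags = 0;
}

/*
 * The consumer only sees a void*, so the compiler cannot relate this
 * read back to the parser path that skipped the assignment.
 */
static int provider_uses_chunking(void *opaque)
{
    assert(opaque != NULL);
    const ParsedRequest *req = (const ParsedRequest *)opaque;
    return (req->flags & FLAG_CHUNKED) != 0;
}

/* Returns 1 if the provider side ends up on the chunking path. */
static int demo(int initialize)
{
    ParsedRequest req;

    /* Stand-in for whatever bytes happened to be in memory before. */
    memset(&req, 0xFF, sizeof req);

    if (initialize)
        parse_with_init(&req, "CMPI_TEST_Person");
    else
        parse_without_init(&req, "CMPI_TEST_Person");

    return provider_uses_chunking(&req);
}
```

Here demo(0) returns 1 (the stale bits select the chunking path) and demo(1) returns 0. In a real build the stale contents vary from run to run, which fits the "hangs in some cases" symptom.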

The fix is to initialize the variable. This actually makes the associatorNames (AN) call behave the same as the referenceNames (RN) call, fixing a minor inconsistency in the code.

This is LTC 114042.

Discussion

  • Dave Heller

    Dave Heller - 2014-09-05

    Commit [334042] for v1.4

    This bug is not applicable to v1.3

     

    Related

    Commit: [334042]

  • Dave Heller

    Dave Heller - 2014-09-05
    • Status: open --> pending
     
  • Dave Heller

    Dave Heller - 2014-11-26
    • Status: pending --> fixed
     
