
#2741 Core dump due to datatype mismatch between provider MOF and base MOF (CIM_Job.mof) for property "PercentComplete"

Function
closed-wont-fix
sfcb
2
2014-11-26
2014-10-12
No

Hi Chris,

As mentioned in bug #2740, we found that the core dump is caused by a datatype mismatch for the property "PercentComplete" between the provider MOF and the base MOF, i.e., the CIM_Job.mof file.

So we tried to use Override in the derived class MOF file so that our definition would take precedence over the base class, but it didn't solve the problem.

Kindly suggest a way to overcome this problem. The only thing that worked was giving PercentComplete in the derived MOF the same type as in the base class.

Derived Class Mof File:

[Override ( "PercentComplete" ), 
 Description (
      "A free-form string that represents percentage complete of the job "
      "additional, implementation-specific details." ),
   ModelCorrespondence {
      "CIM_ManagedSystemElement.OperationalStatus" }]
string PercentComplete;

Related

Bugs: #2740

Discussion

  • P.Raveendra Reddy

    While debugging with GDB we found type = 144, i.e., CMPI_uint16 ((8+1)<<4), instead of CMPI_string.

    Breakpoint 16, __ift_setProperty (instance=instance@entry=0x29a1f7d0,
    name=0x2a30d300 "PercentComplete", value=0x0, type=144) at instance.c:350
    350 CMPIData data = { type, CMPI_goodValue, {0LL} };
    (gdb)
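
The raw type codes that show up in these gdb sessions can be decoded with a small helper. The following is a minimal sketch, not SFCB code: the constants are reproduced from memory of the CMPI header cmpidt.h and should be checked against the header in your build, and the helper names `cmpi_type_name`, `cmpi_is_encapsulated` and `check_gdb_values` are hypothetical.

```c
#include <assert.h>

/* CMPI type codes, reproduced from memory of cmpidt.h so that the sketch is
 * self-contained. Treat the values as an assumption and verify them against
 * the header shipped with your SFCB build. */
#define CMPI_ENC     ((16) << 8)       /* bit shared by encapsulated types */
#define CMPI_uint16  ((8 + 1) << 4)    /* 144  */
#define CMPI_string  ((16 + 6) << 8)   /* 5632 */
#define CMPI_chars   ((16 + 7) << 8)   /* 5888 */

/* Hypothetical helper: map a raw type code seen in gdb to a readable name. */
static const char *cmpi_type_name(unsigned int type)
{
    switch (type) {
    case CMPI_uint16: return "CMPI_uint16";
    case CMPI_string: return "CMPI_string";
    case CMPI_chars:  return "CMPI_chars";
    default:          return "unknown";
    }
}

/* Hypothetical helper: nonzero when the value slot holds a pointer to an
 * encapsulated object rather than an inline number. */
static int cmpi_is_encapsulated(unsigned int type)
{
    return (type & CMPI_ENC) != 0;
}

/* Sanity checks against the two type values reported in this ticket. */
static void check_gdb_values(void)
{
    assert(CMPI_uint16 == 144);   /* type passed to __ift_setProperty */
    assert(CMPI_chars == 5888);   /* type seen in ClInstanceGetPropertyAt */
}
```

Under these assumptions, 144 is an inline 16-bit integer while 5888 is a pointer-valued type, which is why the two cannot be interchanged safely in the same value slot.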

     
  • Dave Heller

    Dave Heller - 2014-10-20

    Hi Chitrak,

    Is this the same as [bugs:#2740]? I'm unclear why you opened two bugs here.

    So, the stack trace from 2740 is not from an sfcbd process, but rather a wsman process, correct? I am trying to understand this trace. Even though there are some XML operations done in the wsman part of the code, wsman itself is calling SFCB via its localconnect interface? But it looks like it is not going through the SFCC client interface, which dlopen()s an SFCB library, but rather calling into SFCB code directly, as if the SFCB code is statically, rather than dynamically, linked?

    Regardless: it seems like you have determined the root cause of the crash to be a type mismatch in the data passed to SFCB. From the small bit of SFCB code being executed here, that seems to be the case. The application is passing a property reference of type CMPI_chars, so that is how the SFCB code is interpreting it. And yes, based on what can be seen in the stack frame for ClInstanceGetPropertyAt(), it looks like the data->value (which is a union of many possible types) does not contain valid data.

    There are some routines in SFCB that check the passed data type against that defined in the repository (and hence, the .mof file), but those routines are at an earlier point in the SFCB code, compared to the apparent entry point as it is being called here by wsman. So the SFCB code has no choice here but to accept what it is being told.

    If you are using localconnect, I would be curious to know if the problem occurs if you set wsman to connect over XML instead (cim_client_frontend=XML in openwsman.conf).

    If you think there is a bug in SFCB it would be helpful if you could reproduce this independent of wsman, i.e. using wbemcli, cimcli or SFCC.

     

    Related

    Bugs: #2740

    • Chitrak Gupta

      Chitrak Gupta - 2014-10-24

      Hi Dave,

      We are upgrading SFCB from 1.3.10 to 1.4.8. After the upgrade we are facing the issues below. None of them were observed in 1.3.10, and there is no change in our provider or MOF.

      We have overridden certain properties. For example, “PercentComplete” is defined as uint16 in the base CIM_LifecycleJob and as “string” in our derived class. It seems the latest version of sfcb does not allow this override of properties, and the result is a SIGBUS on enumeration.

      We tried the following options:
      1. Modified the implementation in the provider to define the “PercentComplete” property as UINT16. This resolved the issue.
      2. Modified the data type in the base CIM class from UINT16 to String. This also resolved the issue.
      3. Specified “Override” in the derived MOF. This did not change the behavior.

      We are also seeing similar issues on certain methods and associations. All point to the same root cause: a mismatch between base and derived class data types.

      We also removed the Openwsman interface and tested with a simple CIM client that performs an enumeration. Both the XML and the binary interface result in the same failure.

      This makes us believe that there is some issue in merging the parent and child classes, where overrides may not be handled correctly. We have also observed code changes in mofc and in hashtable creation.

      Following is the stack trace from when we tried to read the structure that contains the data for a property:

      #1 0x29f87548 in ClInstanceGetPropertyAt (inst=0x41ef48, id=4, data=0x2a0b1bd0, name=0x2a0b1c28, quals=0x0) at objectImpl.c:2376
      2376 const char *str = ClObjectGetClString(&inst->hdr,
      (gdb) print *data
      $14 = {type = 5888, state = 0, value = {uint64 = 698606012, uint32 = 698606012, uint16 = 57788, uint8 = 188 '\274', sint64 = 698606012, sint32 = 698606012, sint16 = -7748, sint8 = -68 '\274',
      real64 = 3.4515723050735763e-315, real32 = 7.27781279e-14, boolean = 188 '\274', char16 = 57788, inst = 0x29a3e1bc <__xmlIOErr+1564>, ref = 0x29a3e1bc <__xmlIOErr+1564>,
      args = 0x29a3e1bc <__xmlIOErr+1564>, filter = 0x29a3e1bc <__xmlIOErr+1564>, Enum = 0x29a3e1bc <__xmlIOErr+1564>, array = 0x29a3e1bc <__xmlIOErr+1564>, string = 0x29a3e1bc <__xmlIOErr+1564>,
      chars = 0x29a3e1bc <__xmlIOErr+1564> "\t", dateTime = 0x29a3e1bc <__xmlIOErr+1564>, dataPtr = {ptr = 0x29a3e1bc <__xmlIOErr+1564>, length = 0}, Byte = -68 '\274', Short = -7748, Int = 698606012,
      Long = 698606012, Float = 7.27781279e-14, Double = 3.4515723050735763e-315}}
      (gdb) frame 0
      #0 0x29f827aa in ClObjectGetClString (hdr=hdr@entry=0x41ef48, id=id@entry=0x2a0b1bd4) at objectImpl.c:145
      145 return &(buf->buf[buf->indexPtr[id->id - 1]]);
      (gdb) print id->id
      $15 = 698606012
      (gdb) print buf->indexPtr[id->id - 1]
      Cannot access memory at address 0xa6d17894
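
One detail in this trace is worth spelling out: the index printed for `id->id`, 698606012, is the same number as the pointer 0x29a3e1bc stored in the value union, so a `chars` pointer is being used as a string-table index. The sketch below illustrates that reinterpretation and a bounds-checked lookup that would refuse it. It is an illustration only: `StrBufSketch` and `get_string_checked` are hypothetical, simplified stand-ins, not the real ClStrBuf or ClObjectGetClString from objectImpl.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an SFCB string buffer: a block of
 * NUL-terminated strings plus a table of offsets into that block. */
typedef struct {
    uint32_t        used;      /* number of valid entries in indexPtr */
    const uint32_t *indexPtr;  /* offset of each stored string in buf */
    const char     *buf;       /* concatenated NUL-terminated strings */
} StrBufSketch;

/* The value from the gdb session: the pointer and the index are one number. */
static const uint32_t POINTER_SEEN_IN_GDB = 0x29a3e1bcu;
static const uint32_t INDEX_SEEN_IN_GDB   = 698606012u;

/* Bounds-checked variant of buf->buf[buf->indexPtr[id - 1]].
 * Ids are 1-based; 0 and anything above 'used' are rejected with NULL
 * instead of being dereferenced. */
static const char *get_string_checked(const StrBufSketch *sb, uint32_t id)
{
    assert(sb != NULL);
    if (id == 0 || id > sb->used)
        return NULL;
    return &sb->buf[sb->indexPtr[id - 1]];
}
```

With a check of this kind, a value that is really a pointer would fail the range test and could be reported as an error, whereas the unchecked expression at objectImpl.c:145 indexes far outside the table and faults.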

      When we checked how the default properties were being filled in, we found that this property was given an integer data type. That matches the parent class, but not our child class, which overrides it to a string.

      Breakpoint 16, __ift_setProperty (instance=instance@entry=0x29a1f7d0,
      name=0x2a30d300 "PercentComplete", value=0x0, type=144) at instance.c:350
      350 CMPIData data = { type, CMPI_goodValue, {0LL} };
      Later in the code flow, our provider sets this integer property with a character value. This seems to be the root cause of the issue.

       

      Last edit: Chitrak Gupta 2014-10-24
      • P.Raveendra Reddy

        As discussed, I am sharing the stack trace using the SFCC client.

        Here we are using a stub (sysinfoapp) that uses SFCC to get data from SFCB.

        Core was generated by `sysinfoapp -c DCIM_LifecycleJob'.
        Program terminated with signal 7, Bus error.
        #0 0x295807aa in ClObjectGetClString (hdr=hdr@entry=0x417ce0, id=id@entry=0x7bb7e9a4) at objectImpl.c:145
        145 return &(buf->buf[buf->indexPtr[id->id - 1]]);
        (gdb) where
        #0 0x295807aa in ClObjectGetClString (hdr=hdr@entry=0x417ce0, id=id@entry=0x7bb7e9a4) at objectImpl.c:145
        #1 0x29585548 in ClInstanceGetPropertyAt (inst=0x417ce0, id=4, data=0x7bb7e9a0, name=0x7bb7e9f8, quals=0x0) at objectImpl.c:2376
        #2 0x29575706 in __ift_internal_getPropertyAt (ci=ci@entry=0x417408, i=i@entry=4, name=name@entry=0x7bb7e9f8, rc=rc@entry=0x7bb7e9fc, readonly=readonly@entry=1, quals=quals@entry=0x0) at instance.c:229
        #3 0x29575d52 in __iftLocal_getObjectPath (instance=0x417408, rc=0x0) at instance.c:611
        #4 0x295aa95a in enum2xml (enm=0x416858, sb=0x416a00, type=4096, xmlAs=4, flags=1, httpHost=0x37a5e000 "H7(\b\243\067") at cimXmlGen.c:1249
        #5 0x0040244e in EnumerateClass (tempfd=3, classnameBuf=0x7bb7ef02 "DCIM_LifecycleJob", cc=<optimized out>) at /usr/src/debug/sysinfoapp-1.0+git17+46acfa12227f9dce2b6bfe75c225f05cd353ffeb-r5.3/source/src/sysinfoapp.c:115
        #6 Enum (cc=cc@entry=0x416890, classnameBuf=classnameBuf@entry=0x7bb7ef02 "DCIM_LifecycleJob") at /usr/src/debug/sysinfoapp-1.0+git17+46acfa12227f9dce2b6bfe75c225f05cd353ffeb-r5.3/source/src/sysinfoapp.c:350
        #7 0x0040143c in main (argc=3, argv=0x7bb7ede4) at /usr/src/debug/sysinfoapp-1.0+git17+46acfa12227f9dce2b6bfe75c225f05cd353ffeb-r5.3/source/src/sysinfoapp.c:498
        (gdb) print id->id
        $1 = 698606012
        (gdb) print buf->indexPtr[id->id - 1]
        Cannot access memory at address 0xa6d1062c

        (gdb)

         
  • Dave Heller

    Dave Heller - 2014-10-20
    • assigned_to: Chris Buccella --> Dave Heller
     
  • Dave Heller

    Dave Heller - 2014-11-03

    I have investigated this sufficiently to determine the root cause, but I don't think what you are doing is supported by the CIM specification, so I probably won't investigate any further.

    I have reproduced your crash by doing what it looks like you are doing: attempting to override a property of type "integer" with a property of type "string", by defining the override in the provider MOF, and coding the provider to return a value of type CMPI_chars (or CMPI_string) instead of CMPI_uint16 for that property.

    It is true that at this point in the SFCB code, where it is manipulating the instance just returned by the provider (in objectImpl.c), we see the type that is actually set by the provider: in this case, CMPI_chars, rather than CMPI_uint16. But the data in fact is not in ClString format at that point, as it needs to be for proper handling of the data, and so we crash.

    So it is true that your MOF override is not working as you expect, and the property type defined in the base class is taking precedence.

    The type handling in question is governed by the classProvider, which is the entity reading the class definitions from the repository; and, as you know, the repository data comes from compiling the MOFs.

    The default class provider for SFCB 1.4 is classProviderSf. In this module there is a function called mergeParents(), and this is where the class hierarchy is reconciled for an EI query. This function calls cpyClass(), which handles the merging of properties and qualifiers from the base class all the way down to the class being queried. This function does not handle the type override, and this is the source of the issue. I was able to prove this by hacking the block of code in cpyClass() that does the "copy properties", to force a type override for a specific property I was using in my test. This prevented the crash and allowed the query to complete normally.

    If this was working in v1.3, it may be because it uses a different class provider by default: classProviderGz. Looking at that, the code that does the hierarchy reconciliation is sufficiently different from classProviderSf, that it's possible it is handling the type override as you are expecting. I did not look at this in detail, nor did I test it.

    I don't think this kind of override is supported by the spec. Per DMTF DSP0004, CIM Infrastructure Specification, section 2.1 Definition of the Meta Schema:

    A Property can have an Override relationship with another Property from a different class. The Domain of the overridden Property must be a supertype of the Domain of the overriding Property. For non-Reference Properties, the type associated with the overriding Property MUST agree with (be the same as) the type of the overridden Property.
    

    This is basically saying: you can't override the type of a non-reference property. This makes sense, as the ramifications of supporting type overrides are far-reaching. It means every client would need to be coded to handle any possible return type for every property ("handle" at least in the sense of not crashing, if not handling in the direct sense). This would be overly burdensome, and is not required today.

    Reference properties (i.e., for associations) are an exception, since these point to generic object paths and need to be "cast" to the specific class they refer to.

    If you'd like to try switching back to the old class provider, you can do this by simply s/ClassProviderSf/ClassProviderGz/ in your "providerRegister" file. But you will also need to make sure the repository is compiled in Gz format, which is no longer the default. There are some instructions how to do that here: http://sblim.sourceforge.net/wiki/index.php/SfcbClassProviders

    But even if that works, it won't be supported going forward. I suggest instead you create a new property to handle your string value, and name it like "PercentCompleteString", rather than attempting to override the property of the base class.

     
  • Chitrak Gupta

    Chitrak Gupta - 2014-11-04

    Thanks Dave for the detailed explanation. I see your point and agree that our property overrides do not follow the CIM specification (they follow the notion of C++ polymorphism instead). I can see that the schema handling in 1.4.8 has changed considerably from 1.3.10, and that caused us to unearth the issue in our MOF definitions. Thanks a bunch for your help. It confirms our suspicion; we were thinking along the same lines of redefining our MOFs.

     
    • Dave Heller

      Dave Heller - 2014-11-05

      np. glad to help. Thanks for bringing it forward

       
  • Chitrak Gupta

    Chitrak Gupta - 2014-11-04

    Dave,

    One thing that we would like to see from SFCB is better handling of badly written MOFs. It would help if mofc could point out such flaws at compile time. At run time, too, SFCB should handle them gracefully.

     
    • Dave Heller

      Dave Heller - 2014-11-05

      Agreed, ideally mofc should catch this sort of thing. I don't know if I will be able to get to it any time soon, though.

      As for SFCB handling it: ideally, it should too, but SFCB is light on this sort of check due to the small-footprint nature of the program. (It's true embedded environments have more cpu+mem than they used to, but they have more stuff running so it's always constrained!) That said, it probably wouldn't be too hard to patch ClassProvider to catch this. I'll put it on my todo list. Contributed patches welcome. :-)
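
As a rough idea of what such a check could look like, here is a minimal sketch of validating overrides while merging a class with its parent: any non-reference property that reappears in the child with a different type is flagged. This is not SFCB or mofc code. The `PropSketch` struct and `find_override_type_mismatch` are hypothetical, and the name comparison uses plain `strcmp` for brevity even though CIM names are case-insensitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified property descriptor (not an SFCB structure). */
typedef struct {
    const char  *name;
    unsigned int type;          /* CMPI type code, e.g. 144 for uint16 */
    int          is_reference;  /* nonzero for reference properties */
} PropSketch;

/* Return the index of the first child property that redefines a
 * non-reference parent property with a different type, or -1 if every
 * override agrees with its parent. Reference properties are skipped,
 * since the rule quoted from DSP0004 only constrains non-reference ones. */
static int find_override_type_mismatch(const PropSketch *parent, size_t np,
                                       const PropSketch *child, size_t nc)
{
    assert(parent != NULL && child != NULL);
    for (size_t c = 0; c < nc; c++) {
        for (size_t p = 0; p < np; p++) {
            if (strcmp(child[c].name, parent[p].name) != 0)
                continue;               /* different property */
            if (parent[p].is_reference || child[c].is_reference)
                continue;               /* references may be narrowed */
            if (child[c].type != parent[p].type)
                return (int)c;          /* illegal type override */
        }
    }
    return -1;
}
```

A compiler or class provider could run a check like this once per class and reject the class with a clear message, instead of letting the mismatch surface later as a crash during enumeration.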

       
  • Dave Heller

    Dave Heller - 2014-11-26
    • status: open --> closed-wont-fix
     
