Menu

Entering infinite loop when parsing recursive grammar

Get Help
2013-02-11
2013-03-20
  • Denis Froschauer

    I observed that EXIP loops infinitely on the declaration of a recursive grammar.

    This grammar has a recursive element 'obj', aka an obj can be embedded in an obj.

      <xs:complexType name="Obj">
        <xs:sequence>
          <xs:element ref="obj" maxOccurs="unbounded" minOccurs="0"/>
        </xs:sequence>
      </xs:complexType>
    
      <xs:element name="obj"     type="Obj"/>
    

    I added specific traces in EXIP :

    ../../src/grammarGen/src/treeTableToGrammars.c:1186 ENTER handleComplexTypeEl
    ../../src/grammarGen/src/treeTableToGrammars.c:1047 ENTER getComplexTypeProtoGrammar 0x8ffe3b8
    ../../src/grammarGen/src/treeTableToGrammars.c:856 ENTER getContentTypeProtoGrammar
    ../../src/grammarGen/src/treeTableToGrammars.c:1288 ENTER getSequenceProtoGrammar
    ../../src/grammarGen/src/treeTableToGrammars.c:1306 iterator
    ../../src/grammarGen/src/treeTableToGrammars.c:456 ENTER handleElementEl 0x902a4b0
    ../../src/grammarGen/src/treeTableToGrammars.c:556 case ref
    ../../src/grammarGen/src/treeTableToGrammars.c:456 ENTER handleElementEl 0x8ffeb08
    ../../src/grammarGen/src/treeTableManipulate.c:326 ENTER getTypeQName typeLiteral=Obj
    ../../src/grammarGen/src/treeTableManipulate.c:409 search keys are http://obix.org/ns/schema/1.0 Obj
    ../../src/grammarGen/src/treeTableToGrammars.c:1047 ENTER getComplexTypeProtoGrammar 0x8ffe3b8
    ../../src/grammarGen/src/treeTableToGrammars.c:856 ENTER getContentTypeProtoGrammar
    ../../src/grammarGen/src/treeTableToGrammars.c:1288 ENTER getSequenceProtoGrammar
    ../../src/grammarGen/src/treeTableToGrammars.c:1306 iterator
    ../../src/grammarGen/src/treeTableToGrammars.c:456 ENTER handleElementEl 0x902a4b0
    ../../src/grammarGen/src/treeTableToGrammars.c:556 case ref
    ../../src/grammarGen/src/treeTableToGrammars.c:456 ENTER handleElementEl 0x8ffeb08
    ../../src/grammarGen/src/treeTableManipulate.c:326 ENTER getTypeQName typeLiteral=Obj
    ../../src/grammarGen/src/treeTableManipulate.c:409 search keys are http://obix.org/ns/schema/1.0 Obj
    ../../src/grammarGen/src/treeTableToGrammars.c:1047 ENTER getComplexTypeProtoGrammar 0x8ffe3b8

    ... infinite loop

    So, I try to understand the internal structures of EXIP, but its really not easy. May be introducing an 'anti-loop' mechanism ?
    Thx

     
  • Rumen Kyusakov

    Rumen Kyusakov - 2013-02-11

    Hi Denis,

    I am aware of this bug but still don't have a good solution to it. I mean I have something in mind that I really don't like - adding one more stage to the grammar generation process specifically for late resolution of the problematic
    <xs:element ref="element-containing-recursive-declaration" ...>

    type of elements.

    In the rare cases that I need to deal with this kind of elements I use a manual fix for now but is very troublesome.

    But I agree, that needs to be solved in a nice way - any suggestions are very much appreciated.

    Best regards,
    Rumen

     

    Last edit: Rumen Kyusakov 2013-02-11
  • Denis Froschauer

    Hi Rumen,
    (This is just a try. Say me if its enough or I'm in the wrong way, because I don't really understand all the EXIP code)
    To avoid loops, I have added a work flag in
    struct TreeTableEntry {
    ...
    unsigned char handling;
    }

    Then, I added a test in printTableEntry and handleElementEl :


    void printTreeTableEntry(TreeTableEntry treeTableEntryIn, int indentIdx, char prefix) {
    if (treeTableEntryIn->handling)
    return;
    treeTableEntryIn->handling = 1;
    ... legacy code
    treeTableEntryIn->handling = 0;
    }

    static errorCode handleElementEl(BuildContext* ctx, TreeTable* treeT, TreeTableEntry* entry, unsigned char isGlobal, Index* grIndex) {
            if      (treeTableEntryIn->handling)
                    return;
            treeTableEntryIn->handling = 1;
    ... legacy code
            treeTableEntryIn->handling = 0;
    }
    

    With that, there is no more loop.
    But I can't be sure the generated EXIPSchema is ok. Did it ?

    Certainly not, because exipd crashes. gdb :

    Program terminated with signal 11, Segmentation fault.
    #0  0x40031df4 in processNextProduction (strm=0xbf940c38, nonTermID_out=0xbf940be4, handler=0xbf940cc4, app_data=0xbf940d9c)
        at ../../src/grammar/src/grammars.c:495
    495     if(strm->context.currNonTermID >=  strm->gStack->grammar->count)
    (gdb) where
    #0  0x40031df4 in processNextProduction (strm=0xbf940c38, nonTermID_out=0xbf940be4, handler=0xbf940cc4, app_data=0xbf940d9c)
        at ../../src/grammar/src/grammars.c:495
    #1  0x40039193 in parseNext (parser=0xbf940c38) at ../../src/contentIO/src/EXIParser.c:138
    #2  0x08048f09 in main (argc=5, argv=0xbf941814) at ../../examples/simpleDecoding/decodeTestEXI.c:200
    (gdb) p *strm->gStack
    $1 = {grammar = 0x8b4a658, lastNonTermID = 4294967295, nextInStack = 0x873a380}
    (gdb) p *strm->gStack->grammar
    Cannot access memory at address 0x8b4a658
    

    Can you send me your feedback about my researches.
    Thank you

    Denis

     
  • Rumen Kyusakov

    Rumen Kyusakov - 2013-02-15

    Hi Denis,

    Your approach is very similar to how I fix this problem manually but is very ingenious in detecting only the cases that causes the loops. Let me first describe the issue a bit:
    It all starts with <xs:element ref="element-containing-recursive-declaration" ...>
    When the code in treeTableToGrammars.c handleElementEl() starts processing such element it has to perform two things:
    1) create a grammar production SE(qname)
    2) link the grammar that describes this element to that production. When such grammar does not exists it tries to build it. But in order to build it it ends up at the same <xs:element ref="element-containing-recursive-declaration" ...>
    due to the recursive declaration.
    Note that the recursive loops might be way more complex than your example - the referenced element might be of different type that is somehow related to the initial type using for e.g. extension, restriction etc. That is why I think your approach is very good at detecting all kinds of loops not only simple ones.

    What you do is to break the second try to process <xs:element ref="element-containing-recursive-declaration" ...> which is a way to detect that the program is in a loop.

    When you break the loop with:

    if(treeTableEntryIn->handling)
    return;

    you leave the SE(qname) unlinked to the grammar. That is, when the state machine gets to SE(qname) it does not not what is the next state.
    When breaking the loop, the SE(qname) production must be stored, and at the end of processing of the schema the grammar for it linked.
    This second part I do manually - using exipg text output you can find out the grammarId for the <xs:element ref="..." and manually assign it.

    You can have a look at the latest code in the SVN:
    https://exip.svn.sourceforge.net/svnroot/exip

    Search for COMMENT #SCHEMA# in treeTableToGrammars.c

    I will look at that issue again but don't have much time at the moment.

    // Rumen

     
  • Rumen Kyusakov

    Rumen Kyusakov - 2013-03-15

    Hi Denis,

    Can you confirm that this issue is solved with the latest code from the SVN so that I can close Bug #6?

    It works for my use cases but just want to check that it pass your tests too.

    Note that the debug outputs are still not fixed so when in DEBUG_GRAMMAR_GEN == ON && EXIP_DEBUG_LEVEL == INFO you still get infinite loop but I don't plan to fix that now.

    // Rumen

     
  • Denis Froschauer

    Retrieved svn revision 276.
    loopDetection works, EXIP encodes the document in schema informed.
    But then the exi document can't be decoded. (details offline)

     

    Last edit: Denis Froschauer 2013-03-19
  • Denis Froschauer

    Bug #6 can be closed. Congratulations !

     

Anonymous
Anonymous

Add attachments
Cancel