
Parsing very large tokens

Zemantics - 2013-09-15 (updated 2013-09-16)
  • Zemantics - 2013-09-15

    First of all, thanks for a great project. I've written a Turtle file parser using QUEX. In an edge case, the tokens in a Turtle file could become very large, even gigabytes, if someone wanted to store a BLOB or something like that. Is there a way of parsing such a token in smaller pieces, e.g. 4 KB at a time, like a stream, to avoid exhausting internal memory?


    Last edit: Zemantics 2013-09-15
  • Frank-Rene Schäfer

    Can you try to use the 'accumulator' in a dedicated mode?
    You might have to break your token pattern into three:

      BEGIN, REPEAT and/or END
    

    Then, for example:

      mode GENERAL : {
          {BEGIN}        { GOSUB(EAT); }
      }
      mode EAT : {
          {REPEAT}       { self_accumulator_add(Lexeme, LexemeEnd); }
          \Not{{REPEAT}} { GOUP(); }
      }
    

    or, replace REPEAT with \Not{END}.
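
    To keep memory bounded, which was the original concern, the {REPEAT} action can flush the accumulator as a partial token once roughly 4 KB have been collected, so the BLOB reaches the consumer as a stream of chunk tokens rather than one giant lexeme. Below is a minimal sketch for a Turtle long string ("""..."""), not from the thread itself: it assumes quex's accumulator calls self_accumulator_add()/self_accumulator_flush(), the LexemeL length macro, and the body/init class-extension sections; the token ids BLOB_CHUNK and BLOB_END are made up for the example.

      token {
          BLOB_CHUNK;    /* hypothetical: one ~4 KB piece of the large token */
          BLOB_END;      /* hypothetical: final piece, closes the large token */
      }
      body {
          size_t  blob_byte_n;   /* bytes accumulated since the last flush */
      }
      init {
          self.blob_byte_n = 0;
      }
      define {
          TRIPLE_QUOTE   "\"\"\""
      }
      mode GENERAL : {
          {TRIPLE_QUOTE}  { GOSUB(EAT); }   /* BEGIN of a long string literal */
      }
      mode EAT : {
          {TRIPLE_QUOTE} {                  /* END: flush the remainder */
              self_accumulator_flush(QUEX_TKN_BLOB_END);
              self.blob_byte_n = 0;
              GOUP();
          }
          [^"]+|"\"" {                      /* REPEAT: content, incl. lone quotes */
              self_accumulator_add(Lexeme, LexemeEnd);
              self.blob_byte_n += LexemeL;
              if( self.blob_byte_n >= 4096 ) {   /* ~4 KB: emit a partial chunk */
                  self_accumulator_flush(QUEX_TKN_BLOB_CHUNK);
                  self.blob_byte_n = 0;
              }
          }
      }

    On the receiving side the chunks can then be written out as they arrive instead of being concatenated, along the lines of the usual quex receive loop (qlex being the generated analyzer instance, write_chunk() a placeholder for whatever sink you use):

      quex::Token*  token_p = 0x0;
      do {
          qlex.receive(&token_p);
          switch( token_p->type_id() ) {
          case QUEX_TKN_BLOB_CHUNK:             /* partial content: stream it */
          case QUEX_TKN_BLOB_END:               /* last piece of the BLOB */
              write_chunk(token_p->get_text()); /* placeholder sink */
              break;
          default: break;
          }
      } while( token_p->type_id() != QUEX_TKN_TERMINATION );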

