
#230 Documentation: for quex constructor for string

open
5
2012-10-02
2012-01-27
Clem Wang
No

You can find this declaration in the generated Lexer include file:

MyLexer(QUEX_TYPE_CHARACTER* BufferMemoryBegin, size_t  BufferMemorySize,
          QUEX_TYPE_CHARACTER* BufferEndOfContentP   = 0x0,
          const char*          CharacterEncodingName = 0x0,
          bool                 ByteOrderReversionF   = false);

but it's not described on page 17 of the PDF documentation (0.50.1) nor the online version:

http://quex.sourceforge.net/doc/html/basics/usage-scenarios.html

Also

ByteOrderReversionF parameter is not described for all the constructors (I really don't know what this is.)

InputCodingName parameter is not described for all the constructors. (I presume it's something like "utf-8")

I presume that BufferMemoryBegin points to the first byte of the string being passed to the Lexer (although for me, it seems like I should make the pointer be one character BEFORE the start, so maybe I'm doing something wrong or maybe the usage isn't what I'm expecting.)

It's not clear whether BufferEndOfContentP points to the last character of the buffer or one past the last character of the buffer.

Discussion

  • Clem Wang

    Clem Wang - 2012-01-30

    New information about using the Quex string constructor which is not yet documented in 0.50.1 but I believe to be true. (Frank, feel free to correct my misunderstandings...)

    A.
    After discussing this with Frank, I discovered that if you use the string constructor, you need to put a newline ('\n') immediately before the beginning of your string so that Quex knows that the start of your string matches the beginning of a line.

    B. You need to set BufferEndOfContentP to point one character past the last character of your string. If your string is plain ASCII and null-terminated, the start of the string plus strlen() gives that position (the terminating '\0'). If you have Unicode, you'll need a different length/size function, since strlen() may stop at an incidental '\0' byte that is part of a wider character (e.g. in UTF-16). Using the default NULL pointer is a bad idea, as Quex will run off the end of the string and continue indefinitely.

    C. BufferMemorySize appears to be ignored, which seems like a bug. That is, if you pass 0 for BufferMemorySize, the Lexer continues on its merry way until caught by BufferEndOfContentP.

    If you are using the file or stream based Quex constructors, these measures aren't needed because you can use the various stream functions to tell if you are at the beginning or end of the stream, which you can't tell with just a Unicode string buffer.

    This makes sense to me, but it's just not well documented (or maybe I missed the documentation).
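
    To make points A to C concrete, here is a minimal sketch of how the memory could be prepared before calling the string constructor. It is only my reading of the points above, not something taken from the Quex documentation. LexerInput, make_lexer_input and CharacterT are hypothetical names invented for this example; CharacterT stands in for QUEX_TYPE_CHARACTER and is assumed to be one byte wide. The exact memory layout Quex expects may differ, so check it against the buffer access chapter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: a one-byte character type. In a real project, use the
// QUEX_TYPE_CHARACTER defined by the generated lexer header instead.
typedef unsigned char CharacterT;

// Hypothetical helper (not part of Quex): owns the memory handed to the lexer.
// Layout: [ '\n' | text ... | 0 ]
struct LexerInput {
    std::vector<CharacterT> memory;
    std::size_t             text_length;

    // Candidate for BufferMemoryBegin: points at the leading '\n' (point A).
    CharacterT* memory_begin() { return memory.data(); }

    // Candidate for BufferMemorySize. Point C says it seemed to be ignored,
    // but passing the real size is the safer choice.
    std::size_t memory_size() const { return memory.size(); }

    // Candidate for BufferEndOfContentP: one past the last text character (point B).
    CharacterT* end_of_content() { return memory.data() + 1 + text_length; }
};

inline LexerInput make_lexer_input(const std::string& text) {
    LexerInput input;
    // std::string::size() counts every byte, including embedded '\0' bytes,
    // so it does not stop early the way strlen() would.
    input.text_length = text.size();
    input.memory.reserve(text.size() + 2);
    input.memory.push_back('\n');   // leading newline so the text starts a line
    input.memory.insert(input.memory.end(), text.begin(), text.end());
    input.memory.push_back(0);      // spare cell that end_of_content() points at
    return input;
}

// Intended use with the constructor quoted in the ticket (not compiled here,
// because it needs the generated lexer header):
//
//   LexerInput in = make_lexer_input("int x = 1;");
//   MyLexer    qlex(in.memory_begin(), in.memory_size(), in.end_of_content());
```

    The LexerInput object has to stay alive and unmodified for as long as the lexer uses the memory, since the pointers refer to its internal vector.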

     
  • Frank-Rene Schäfer

    Did you read through http://quex.sourceforge.net/doc/html/buffer-access/intro.html ?
    Since I do not know the things I wrote by heart, it would take me
    some time to correlate it to your inputs. If not, you could do me a favor
    and relate your comments to the chapters in the documentation. This
    would make things easier for me.

    Anyway, thanks for reporting your problems. I will take it into consideration.

     
  • Clem Wang

    Clem Wang - 2012-01-31

    A problem (and what confused me) is the page describing the other Lexer constructors lives here:
    http://quex.sourceforge.net/doc/html/basics/usage-scenarios.html

    but unfortunately, this page on Usage Scenarios omits the string constructor. This is where I expected the string constructor to be found. I suggest there should be a cross-reference between these two pages.

    I will carefully read the buffer access page and add further comments once I digest this new information. I didn't realize that direct buffer access implied the string constructor, so I had only skimmed this chapter.

    There's a lot to learn!

    Thanks.

     
