#28 Failed to parse file with length 8193 bytes

Status: open
Owner: nobody
Labels: Other (7)
Priority: 5
Updated: 2006-04-13
Created: 2006-04-13
Creator: Anonymous
Private: No

When parsing a file that is exactly 8193 bytes long, I get this
error:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.bluecast.xml.PiccoloLexer.yy_refill(Unknown Source)
    at com.bluecast.xml.PiccoloLexer.yylex(Unknown Source)
    at com.bluecast.xml.Piccolo.yylex(Unknown Source)
    at com.bluecast.xml.Piccolo.yyparse(Unknown Source)
    at com.bluecast.xml.Piccolo.parse(Unknown Source)

I reproduced this error with two different XML files
of the same length. If I add or remove a single byte
(without invalidating the XML), parsing works.
The standard Java parser reports no error on the same files.
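
A reproduction along these lines might look like the sketch below. It builds a well-formed document of exactly 8193 bytes that ends in CRLF and parses it with the JDK's default SAX parser. The class and method names (Repro8193, buildXml, parse) are hypothetical, and the commented-out Piccolo lines assume the Piccolo jar is on the classpath. Whether this particular padding triggers the exception has not been verified here; the report only says that files of this length failed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class Repro8193 {
    /** Builds a well-formed, ASCII-only XML document of exactly totalBytes bytes, ending in CRLF. */
    static byte[] buildXml(int totalBytes) {
        String head = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<root>";
        String tail = "</root>\r\n";
        int padding = totalBytes - head.length() - tail.length();
        if (padding < 0) {
            throw new IllegalArgumentException("totalBytes too small");
        }
        StringBuilder sb = new StringBuilder(totalBytes);
        sb.append(head);
        for (int i = 0; i < padding; i++) {
            sb.append('x'); // filler text inside <root>
        }
        sb.append(tail);
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Parses the bytes with the given reader; any parse failure propagates as an exception. */
    static void parse(XMLReader reader, byte[] xml) throws Exception {
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new ByteArrayInputStream(xml)));
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = buildXml(8193);

        // JDK default parser; according to the report it handles such files.
        XMLReader jdk = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        parse(jdk, xml);

        // Assumption: with the Piccolo jar on the classpath, swap in its reader:
        // XMLReader piccolo = new com.bluecast.xml.Piccolo();
        // parse(piccolo, xml);

        System.out.println("parsed " + xml.length + " bytes");
    }
}
```

Changing the argument of buildXml to 8192 or 8194 gives the "add or remove a byte" comparison described above.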

Discussion

  • Nobody/Anonymous

    Logged In: NO

    I am also encountering this issue. Were you able to find a
    solution?

     
  • Nobody/Anonymous

    Logged In: NO

    The issue appears to be that yy_refill() can return
    with 0 characters read.

    For instance, the file might end with CRLF, with the
    LF at byte 8193. In that case the read first returns 0
    characters, then EOF, because the internal InputStream
    has a buffer of 8192 bytes.

    The fix would consist of a loop that keeps reading
    while (numRead == 0).
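
The zero-length read described in this comment can be modelled without Piccolo. The sketch below is not Piccolo's code: StallingReader is a hypothetical Reader that reports one 0-character read once its data is exhausted and only then EOF, and readWithRetry is one possible shape for the suggested loop. The retry count is capped here as a precaution, so the loop cannot spin forever if a source keeps returning 0.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ZeroReadDemo {
    /** Hypothetical reader: returns one zero-length read after the data runs out, then EOF (-1). */
    static class StallingReader extends Reader {
        private final Reader in;
        private boolean stalled = false;

        StallingReader(String data) {
            this.in = new StringReader(data);
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = in.read(buf, off, len);
            if (n < 0 && !stalled) {
                stalled = true;
                return 0; // simulate the "0 characters read" case
            }
            return n;
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    /**
     * Reads into buf, retrying zero-length reads at most maxRetries times.
     * Returns the number of characters read, or -1 at EOF or when retries run out.
     */
    static int readWithRetry(Reader r, char[] buf, int maxRetries) throws IOException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            int n = r.read(buf, 0, buf.length);
            if (n != 0) {
                return n; // either data (n > 0) or EOF (n == -1)
            }
        }
        return -1; // persistent zero-length reads are treated as EOF
    }

    public static void main(String[] args) throws IOException {
        Reader r = new StallingReader("ab");
        char[] buf = new char[2];
        System.out.println(readWithRetry(r, buf, 3)); // reads the two data characters
        System.out.println(readWithRetry(r, buf, 3)); // skips the 0-length read, then hits EOF
    }
}
```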

     
  • Nobody/Anonymous

    Logged In: NO

    Poster of the bug here:

    No, I was not able to find a solution. The PiccoloLexer
    code is automatically generated at build time (at least as
    I understand it), and since I lacked the time to dive into
    its generation and find a real solution, I had to remove
    the Piccolo parser from my project.

    In general, though, you could get the sources and build the
    parser yourself. Then look at the generated sources, find
    the place described above, change it, and rebuild without
    regenerating the sources. After some debugging you will
    end up with a private Piccolo parser.

     
  • Wolfgang Rether

    Wolfgang Rether - 2011-04-14

    The proposed solution (while (numRead == 0)) would probably lead to an infinite loop with a Windows CRLF on the buffer boundary. A better solution (currently working fine for me) would be:

    if (numRead <= 0) {
        return true;
    }

    instead of

    if (numRead < 0) {
        return true;
    }

    Note the '<=' replacing the '<'.
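
The difference between the two checks can be seen in isolation with the sketch below. It is a simplified model, not the generated PiccoloLexer code: ScriptedReader is a hypothetical source that replays a fixed sequence of read results, and atEofStrict and atEofPatched wrap the two conditions quoted above. One trade-off worth keeping in mind is that the patched check would also stop early if a zero-length read ever occurred in the middle of the input.

```java
import java.io.IOException;
import java.io.Reader;

class RefillCheckDemo {
    /** Hypothetical source: returns the scripted read results in order, then -1. */
    static class ScriptedReader extends Reader {
        private final int[] results;
        private int next = 0;

        ScriptedReader(int... results) {
            this.results = results;
        }

        @Override
        public int read(char[] buf, int off, int len) {
            int n = next < results.length ? results[next++] : -1;
            if (n > len) {
                n = len; // never report more than the caller asked for
            }
            for (int i = 0; i < n; i++) {
                buf[off + i] = 'x'; // dummy payload
            }
            return n;
        }

        @Override
        public void close() {
        }
    }

    /** Original condition: only a negative count signals end of input. */
    static boolean atEofStrict(Reader r, char[] buf) throws IOException {
        int numRead = r.read(buf, 0, buf.length);
        if (numRead < 0) {
            return true;
        }
        return false;
    }

    /** Patched condition: a zero-length read is treated as end of input as well. */
    static boolean atEofPatched(Reader r, char[] buf) throws IOException {
        int numRead = r.read(buf, 0, buf.length);
        if (numRead <= 0) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        char[] buf = new char[16];
        // Script: a single zero-length read, followed by EOF.
        System.out.println(atEofStrict(new ScriptedReader(0), buf));
        System.out.println(atEofPatched(new ScriptedReader(0), buf));
    }
}
```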

     
