#49 quote char is ignored when quote char=escape char

closed-rejected
None
5
2010-12-21
2010-07-16
No

To recreate:

Create a parser with delimiter = ',', quoteChar = '"', and escape = '"'.

Consider the line:
a,"b1,b2",c

This parses as the 4-element array {"a", "b1", "b2", "c"} but should parse as the 3-element array {"a", "b1,b2", "c"}.

Looking at CSVParser$parseLine, the issue is that each character is tested against the escape character before anything else. A character that matches is thrown out, whether or not it actually escapes anything, so a quote character that equals the escape character never reaches the quote handling.
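
The escape-first ordering described above can be reproduced with a small stand-alone sketch. This is not the library's code: the class name EscapeFirstSketch and the loop body are assumptions written only to mirror the behavior reported here.

```java
import java.util.ArrayList;
import java.util.List;

class EscapeFirstSketch {
    // Hypothetical, simplified parse loop: the escape test comes before the quote test.
    static List<String> parseLine(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape) {
                // The escape character is always consumed; the next character
                // is kept only when it is escapable inside a quoted field.
                if (inQuotes && i + 1 < line.length()
                        && (line.charAt(i + 1) == quote || line.charAt(i + 1) == escape)) {
                    current.append(line.charAt(i + 1));
                    i++;
                }
            } else if (c == quote) {
                // Never reached when quote == escape, so inQuotes stays false.
                inQuotes = !inQuotes;
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // quote == escape: the quotes are dropped and the field is split at the inner comma.
        System.out.println(parseLine("a,\"b1,b2\",c", ',', '"', '"'));
        // Distinct escape character: the quoted field survives intact.
        System.out.println(parseLine("a,\"b1,b2\",c", ',', '"', '\\'));
    }
}
```

With the quote and escape characters equal, the sketch returns four fields for the sample line; with a backslash escape it returns three.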

Discussion

  • Anonymous

    Anonymous - 2010-08-03

    I just ran into this and find it crippling, since the standard CSV escape character is in fact the double quote. See http://tools.ietf.org/html/rfc4180

     
  • Campbell Moss

    Campbell Moss - 2010-08-18

    Change CSVParser.java, line 206 from:
    if (c == this.escape) {
    to:
    if (c == this.escape && c != quotechar) {

    I've created a patch for this fix and a test that tests it, but couldn't figure out how to submit the patch.
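
A minimal sketch of how the guarded condition changes the outcome, assuming a simplified parse loop rather than the real CSVParser.java. The class name QuoteGuardSketch is hypothetical, and the doubled-quote handling in the quote branch is an assumption modeled on RFC 4180, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteGuardSketch {
    // Hypothetical loop with the guard "c == escape && c != quote" applied.
    static List<String> parseLine(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean hasNext = i + 1 < line.length();
            if (c == escape && c != quote) {
                // Escape branch is skipped entirely when escape == quote.
                if (inQuotes && hasNext
                        && (line.charAt(i + 1) == quote || line.charAt(i + 1) == escape)) {
                    current.append(line.charAt(i + 1));
                    i++;
                }
            } else if (c == quote) {
                if (inQuotes && hasNext && line.charAt(i + 1) == quote) {
                    // Assumed RFC 4180 rule: a doubled quote inside a quoted field is a literal quote.
                    current.append(quote);
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("a,\"b1,b2\",c", ',', '"', '"'));
        System.out.println(parseLine("a,\"say \"\"hi\"\"\",c", ',', '"', '"'));
    }
}
```

Under these assumptions the sample line from the report yields three fields, and a doubled quote inside a quoted field is kept as a single literal quote.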

     
  • Geoffrey Cooney

    Geoffrey Cooney - 2010-11-24
    • summary: escape char is ignored when quote char=escape char --> quote char is ignored when quote char=escape char
     
  • Geoffrey Cooney

    Geoffrey Cooney - 2010-11-24

    I've uploaded a patch for this that I believe fixes the issue without introducing any new ones.

     
  • Scott Conway

    Scott Conway - 2010-12-21

    Hello Geoffrey.

    Sorry to be heavy-handed about this, but the delimiter, quote, and escape characters should never be the same. Following the principle of least astonishment, I think it's better to disallow it than to worry about whether escape has precedence over quote.

    As such, I modified the CSVParser constructor to throw an unsupported operation exception if they are the same. Since it is a runtime exception, there is no change to the interface.

    :)
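
The constructor check described in this comment might look roughly like the sketch below. The class name StrictParserConfig, its fields, and the exception message are assumptions made for illustration; the actual change to CSVParser is not shown in this thread.

```java
class StrictParserConfig {
    final char separator;
    final char quote;
    final char escape;

    // Hypothetical constructor: rejects any configuration in which
    // two of the three special characters coincide.
    StrictParserConfig(char separator, char quote, char escape) {
        if (separator == quote || separator == escape || quote == escape) {
            // Unchecked exception, so callers need no new throws clause.
            throw new UnsupportedOperationException(
                    "separator, quote, and escape characters must be different");
        }
        this.separator = separator;
        this.quote = quote;
        this.escape = escape;
    }

    public static void main(String[] args) {
        new StrictParserConfig(',', '"', '\\'); // accepted
        try {
            new StrictParserConfig(',', '"', '"'); // the configuration from this report
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because UnsupportedOperationException is unchecked, existing callers compile unchanged, but a configuration such as the one in this report now fails at construction time instead of parsing incorrectly.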

     
  • Scott Conway

    Scott Conway - 2010-12-21
    • assigned_to: nobody --> sconway
    • status: open --> closed-rejected
     
