#40 Problem parsing escape chars

closed-invalid
Scott Conway
None
5
2009-12-10
2009-12-04
Anonymous
No

CSV entries containing escape characters are not parsed correctly.

If you have a csv line like:

"This is a \column","This is another"

the CsvReader will return an array like this:

["This is a column","This is another"]

Which is not the behavior expected. I'd expect this return value:
["This is a \column","This is another"]

This behavior is also inconsistent with that of the CsvWriter.
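
The reported result can be reproduced with a small stand-alone sketch. This is not opencsv's code: the class name `LenientEscapeSketch` and its `parseLine` helper are hypothetical, and the sketch assumes a rule where the escape character only takes effect before a quote or another escape character and is dropped otherwise.

```java
import java.util.ArrayList;
import java.util.List;

public class LenientEscapeSketch {

    // Hypothetical helper, not opencsv's implementation: splits one CSV line and
    // drops an escape character that is not followed by a quote or an escape.
    static List<String> parseLine(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean hasNext = i + 1 < line.length();
            if (c == escape) {
                if (hasNext && (line.charAt(i + 1) == quote || line.charAt(i + 1) == escape)) {
                    current.append(line.charAt(i + 1)); // keep the escaped character
                    i++;
                }
                // otherwise the lone escape character is silently dropped
            } else if (c == quote) {
                if (inQuotes && hasNext && line.charAt(i + 1) == quote) {
                    current.append(quote); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // In Java source, "\\" produces a single backslash in the string.
        String line = "\"This is a \\column\",\"This is another\"";
        // prints [This is a column, This is another]
        System.out.println(parseLine(line, ',', '"', '\\'));
    }
}
```

Under the same assumed rule, passing a different escape character (for example `'|'`) leaves the backslash in the first field untouched.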

I am adding a patch. The patch causes the following tests to fail:
CsvParserTest.testIssue2859181()
CsvParserTest.returnPendingIfNullIsPassedIntoParseLineMulti()

To me, the problem is not the patch but these tests.
For example, the behavior expected by CsvParserTest.testIssue2859181() is not correct.
I don't know the CSV format specifications, but if I import a CSV file containing the following line with OpenOffice
\\=field2

I obtain

\=field2

and not

=field2

as you are expecting.

Regards

Fabrizio

Discussion

  • Patch for trunk

     
  • Scott Conway
    2009-12-10

    This is the same as 290311. I actually thought about the same fix for that one but had the same broken test.

    No, the tests are correct. I put the double slash in there so Java would interpret it as a single slash; otherwise Java would give me an "Illegal escape character in string literal" error. So the actual string being parsed is

    field1;\=field2;"""field3"""

    and the expected outputs are
    field1
    =field2
    "field3"

    Now, in Java, an escape sequence that is not followed by an escapable character is an error, and the whole parse would fail. In opencsv it was decided to take the high road and ignore the escape character.

    If you really want to put the escape character in your string, you need to double-escape it or simply change the escape character when you instantiate the Parser or Reader.

     
  • Scott Conway
    2009-12-10

    • assigned_to: nobody --> sconway
    • status: open --> closed-invalid
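
The point about Java string literals in the discussion above can be checked with a short sketch. The class and constant names are hypothetical; the first literal is the test input as quoted in the discussion, and the second shows how a doubled escape character in CSV data would have to be written in Java source.

```java
public class EscapeLiteralSketch {

    // The test input as written in Java source: "\\" is one backslash
    // and "\"" is one double quote in the resulting string.
    static final String TEST_INPUT = "field1;\\=field2;\"\"\"field3\"\"\"";

    // CSV data containing a doubled escape character; the Java literal
    // needs four backslashes to produce two.
    static final String DOUBLED_ESCAPE = "\"This is a \\\\column\"";

    public static void main(String[] args) {
        // prints field1;\=field2;"""field3"""
        System.out.println(TEST_INPUT);
        // prints "This is a \\column"
        System.out.println(DOUBLED_ESCAPE);
    }
}
```

The string handed to the parser therefore contains a single backslash before `=field2`, even though the test source shows two.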