
#93 Cannot parse quoted empty cells

v1.0 (example)
closed-works-for-me
None
5
2016-01-18
2013-03-13
Béla Boros
No

CSVReader cannot parse a simple CSV file with empty quoted cells properly:
"col1";"col2"
"";1
"";2

The third line is returned as ";2

JUnit test to reproduce the error:

@Test
public void testASingleQuoteAsDataElementWithEmptyField2() throws IOException {

    StringBuilder sb = new StringBuilder(CSVParser.INITIAL_READ_SIZE);

    sb.append("\"\";1").append("\n"); // "";1
    sb.append("\"\";2").append("\n"); // "";2

    // Separator ';' and quote character '"'
    CSVReader c = new CSVReader(new StringReader(sb.toString()), ';', '\"');

    // First record: empty quoted field, then 1
    String[] nextLine = c.readNext();
    assertEquals(2, nextLine.length);

    assertEquals(0, nextLine[0].length());
    assertEquals("1", nextLine[1]);

    // Second record: empty quoted field, then 2
    nextLine = c.readNext();
    assertEquals(2, nextLine.length);

    assertEquals(0, nextLine[0].length());
    assertEquals("2", nextLine[1]);
}

I would like to read CSV files according to RFC4180 (http://tools.ietf.org/html/rfc4180).

My patch is attached:
- three quote modes control the parsing: STRICT_QUOTE, TRICKY_QUOTE and RFC_4180
- the CSV file mentioned above can be parsed in RFC_4180 mode
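
For illustration only (this is not the attached patch and not openCSV code), the RFC 4180 quoting rule the report relies on can be sketched as a small stand-alone splitter. The class name Rfc4180LineSplitter and its split method are hypothetical, and the sketch assumes a single-line record with no embedded line breaks. A quote at the start of a field opens a quoted field, a doubled quote inside a quoted field is a literal quote, and a quote followed by anything else closes the field, so "" followed by the separator yields an empty field.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of openCSV: splits one record using RFC 4180 quoting. */
final class Rfc4180LineSplitter {

    static List<String> split(String line, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (inQuotes) {
                if (ch == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote inside quotes: literal quote
                        i++;               // skip the second quote of the pair
                    } else {
                        inQuotes = false;  // single quote: end of the quoted field
                    }
                } else {
                    field.append(ch);      // separators are plain data inside quotes
                }
            } else if (ch == '"') {
                inQuotes = true;           // opening quote
            } else if (ch == separator) {
                fields.add(field.toString());
                field.setLength(0);        // start the next field
            } else {
                field.append(ch);
            }
        }
        fields.add(field.toString());      // last field has no trailing separator
        return fields;
    }
}
```

Under these assumptions, split("\"\";2", ';') gives two fields, an empty string and 2, which is what the JUnit test above expects from CSVReader.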

Thank you for your great software,
Béla Boros

Discussion

  • Béla Boros

    Béla Boros - 2013-03-13
     
  • Scott Conway

    Scott Conway - 2016-01-18
    • status: open --> closed-works-for-me
    • assigned_to: Scott Conway
    • Group: --> v1.0 (example)
     
  • Scott Conway

    Scott Conway - 2016-01-18

    With its default settings, openCSV treats the empty double-quoted string as an escaped double quote (also per RFC_4180), so it then picks up the separator as part of the field.

    To get the result you want, turn off strict quotes. The following test passes and is, I believe, what you want.

    @Test
    public void issue93ParsingEmptyDoubleQuoteField() throws IOException {
        CSVParserBuilder builder = new CSVParserBuilder();
        CSVParser parser = builder.withStrictQuotes(false).build();
        // "",2
        String[] nextLine = parser.parseLineMulti("\"\",2");

        assertEquals(2, nextLine.length);

        assertTrue(nextLine[0].isEmpty());
        assertEquals("2", nextLine[1]);
    }
    
     
