#48 \n vs " in opencsv

closed-wont-fix
None
5
2010-12-20
2010-04-26
Lenjoy
No

For example, given a CSV file:
--------------
a,b,c,ddd"eee
f,g,h,"iii,jjj"
--------------

CSVReader csvreader = new CSVReader(new FileReader(csvfile));

the lines are parsed into
--------------
a,
b,
c,
ddd"eee
f,g,h,"iii,
jjj"
--------------

but with Python's csv module, the result is
--------------
a, b, c, ddd"eee
and
f, g, h, "iii,jjj"
--------------

This is the result I expected.
That's the \n vs " problem.
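
To make the expected result concrete, here is a minimal Java sketch of one rule that produces it: a quote opens a quoted section only when it is the first character of a field, and is an ordinary character anywhere else. The class `FieldStartQuoteSketch` and its `parse` method are hypothetical names written for illustration; this is not opencsv's code, and it is only an assumption that Python's csv module follows the same rule.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldStartQuoteSketch {

    // Hypothetical helper: a quote is only special at the start of a field.
    static List<List<String>> parse(String input, char separator, char quote) {
        List<List<String>> records = new ArrayList<>();
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (inQuotes) {
                if (ch == quote) {
                    inQuotes = false;          // closing quote
                } else {
                    field.append(ch);          // separators and newlines are literal here
                }
            } else if (ch == quote && field.length() == 0) {
                inQuotes = true;               // opening quote at field start
            } else if (ch == separator) {
                fields.add(field.toString());
                field.setLength(0);
            } else if (ch == '\n') {
                fields.add(field.toString());
                field.setLength(0);
                records.add(fields);
                fields = new ArrayList<>();
            } else {
                field.append(ch);              // a mid-field quote lands here as plain text
            }
        }
        fields.add(field.toString());
        records.add(fields);
        return records;
    }

    public static void main(String[] args) {
        String csv = "a,b,c,ddd\"eee\nf,g,h,\"iii,jjj\"";
        for (List<String> record : parse(csv, ',', '"')) {
            System.out.println(record.size() + " fields: " + record);
        }
    }
}
```

With the sample file from this report, the sketch yields two records of four fields each. The quote in the first record's last field is kept as data, while the enclosing quotes around iii,jjj are stripped.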

Discussion

  • Scott Conway

    Scott Conway - 2010-05-06

    I was able to reproduce this in a unit test and agree this is definitely a bug. What is happening is that we incorrectly pick up the quote character in ddd"eee and thus end the field on the next quote (before the iii).

    Below is the failing test.

    @Test
    public void testIssue2992134OutOfPlaceQuotes() throws IOException {
        StringBuilder sb = new StringBuilder(CSVParser.INITIAL_READ_SIZE);

        sb.append("a,b,c,ddd\"eee\nf,g,h,\"iii,\njjj\"");
        System.out.println(sb);

        CSVReader c = new CSVReader(new StringReader(sb.toString()));

        String[] nextLine = c.readNext();

        // assertEquals(4, nextLine.length);
        assertEquals("a", nextLine[0]);
        assertEquals("b", nextLine[1]);
        assertEquals("c", nextLine[2]);
        assertEquals("ddd\"eee", nextLine[3]);
    }

    As a short-term workaround, you can either change your quote character in the CSVReader or escape the quote character. Here is a passing test where I escaped the quote character.

    @Test
    public void testIssue2992134OutOfPlaceQuotes() throws IOException {
        StringBuilder sb = new StringBuilder(CSVParser.INITIAL_READ_SIZE);

        sb.append("a,b,c,ddd\\\"eee\nf,g,h,\"iii,\njjj\"");
        System.out.println(sb);

        CSVReader c = new CSVReader(new StringReader(sb.toString()));

        String[] nextLine = c.readNext();

        // assertEquals(4, nextLine.length);
        assertEquals("a", nextLine[0]);
        assertEquals("b", nextLine[1]);
        assertEquals("c", nextLine[2]);
        assertEquals("ddd\"eee", nextLine[3]);
    }

     
  • malcolm davis

    malcolm davis - 2010-06-18

    Could there be another issue? I am having a similar problem. The problem persists after changing the default quote char.

    The following is some sample code:

    import java.io.IOException;
    import java.io.StringWriter;

    import au.com.bytecode.opencsv.CSVParser;
    import au.com.bytecode.opencsv.CSVWriter;

    public class TestCsvIO {

        public static void main(final String[] args) {

            String results = "";
            char MY_DEFAULT_QUOTE_CHARACTER = '^';

            try {
                StringWriter writer = new StringWriter();
                CSVWriter csvWriter = new CSVWriter(writer,
                        CSVWriter.DEFAULT_SEPARATOR, MY_DEFAULT_QUOTE_CHARACTER);
                String[] nextLine = { "test", "fakeline", "more" };
                csvWriter.writeNext(nextLine);
                csvWriter.flush();
                csvWriter.close();

                results = writer.toString();
            } catch (IOException excep) {
                excep.printStackTrace();
            }

            System.out.println("writer: " + results);

            try {
                System.out.println("parsed fails:");
                CSVParser parser = new CSVParser(CSVWriter.DEFAULT_SEPARATOR,
                        MY_DEFAULT_QUOTE_CHARACTER);
                String[] items = parser.parseLine(results);
                for (String item : items) {
                    System.out.println(item);
                }
            } catch (IOException excep) {
                excep.printStackTrace();
            }

            try {
                System.out.println("parsed works:");
                CSVParser parser = new CSVParser();
                String[] items = parser.parseLine("\"test\",\"fakeline\",\"more\"");
                for (String item : items) {
                    System.out.println(item);
                }
            } catch (IOException excep) {
                excep.printStackTrace();
            }
        }
    }

     
  • Scott Conway

    Scott Conway - 2010-12-20
    • assigned_to: nobody --> sconway
    • status: open --> closed-wont-fix
     
  • Scott Conway

    Scott Conway - 2010-12-20

    Guys, sorry it's been seven months, but work and family schedules have been hectic. I am closing this as WAD (working as designed). After looking at the code and debugging the situation, what is happening is that when we reach a quote we treat it as the start of a quoted section within the parse string, so everything afterwards, including the newline, is considered part of one field.
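
The behaviour described above can be sketched as a small state machine in which every quote character flips an "in quotes" flag, wherever it occurs in a field. This is a simplified illustration, not opencsv's actual implementation: the class `QuoteToggleSketch` and its `parse` method are hypothetical, and unlike the output shown in the report, the sketch drops the quote characters instead of keeping them.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteToggleSketch {

    // Hypothetical helper: any quote toggles the inQuotes flag, even mid-field.
    static List<List<String>> parse(String input, char separator, char quote) {
        List<List<String>> records = new ArrayList<>();
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == quote) {
                inQuotes = !inQuotes;          // the stray quote in ddd"eee opens a quoted section
            } else if (ch == separator && !inQuotes) {
                fields.add(field.toString());
                field.setLength(0);
            } else if (ch == '\n' && !inQuotes) {
                fields.add(field.toString());
                field.setLength(0);
                records.add(fields);
                fields = new ArrayList<>();
            } else {
                field.append(ch);              // while inQuotes, ',' and '\n' are plain data
            }
        }
        fields.add(field.toString());
        records.add(fields);
        return records;
    }

    public static void main(String[] args) {
        String csv = "a,b,c,ddd\"eee\nf,g,h,\"iii,jjj\"";
        for (List<String> record : parse(csv, ',', '"')) {
            System.out.println(record.size() + " fields: " + record);
        }
    }
}
```

For the sample input, the sketch returns a single record with five fields. The fourth field swallows the newline and the start of the second line, which matches the shape of the output in the original report.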

    When I escaped the quote everything worked as you were expecting.

    So change

    a,b,c,ddd"eee
    f,g,h,"iii,jjj"

    to

    a,b,c,ddd\"eee
    f,g,h,"iii,jjj"
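
Why the escaped form parses as expected can be shown with a small sketch in which a backslash makes the next character literal, so the escaped quote never toggles the quoted state. The class `EscapedQuoteSketch` and its `parse` method are hypothetical and simplified, and opencsv's real escape handling may differ in its details.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapedQuoteSketch {

    // Hypothetical helper: quotes toggle the inQuotes flag unless preceded by the escape character.
    static List<List<String>> parse(String input, char separator, char quote, char escape) {
        List<List<String>> records = new ArrayList<>();
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == escape && i + 1 < input.length()) {
                field.append(input.charAt(++i));   // take the escaped character literally
            } else if (ch == quote) {
                inQuotes = !inQuotes;
            } else if (ch == separator && !inQuotes) {
                fields.add(field.toString());
                field.setLength(0);
            } else if (ch == '\n' && !inQuotes) {
                fields.add(field.toString());
                field.setLength(0);
                records.add(fields);
                fields = new ArrayList<>();
            } else {
                field.append(ch);
            }
        }
        fields.add(field.toString());
        records.add(fields);
        return records;
    }

    public static void main(String[] args) {
        // Same data as the escaped example: a,b,c,ddd\"eee followed by f,g,h,"iii,jjj"
        String csv = "a,b,c,ddd\\\"eee\nf,g,h,\"iii,jjj\"";
        for (List<String> record : parse(csv, ',', '"', '\\')) {
            System.out.println(record.size() + " fields: " + record);
        }
    }
}
```

With the escaped input, the sketch returns two records of four fields each. The quote in the first record's last field is kept as data.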

     
