#97 A cell value ending in \ (backslash) causes issues in the CSVReader

v1.0 (example)
closed-invalid
None
5
2016-01-18
2013-10-09
Matthew
No

If the value of a CSV data cell is a backslash, or if the value ends in a backslash, the CSVReader continues parsing until another backslash is found, which results in multiple cells being treated as one cell.

Unit Test:
import static org.junit.Assert.assertEquals;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.util.ArrayList;
import java.util.List;

import org.junit.Test;

import au.com.bytecode.opencsv.CSVReader;
import au.com.bytecode.opencsv.CSVWriter;

public class DataReaderTest {

@Test //this one does not work with opencsv-2.3
public void defaultWriterDefaultReader() throws Exception {
    File file = new File("./tmp/testing.csv");
    file.getParentFile().mkdirs();
    CSVWriter writer = new CSVWriter(new BufferedWriter(new FileWriter(file)));
    writer.writeAll(getTestData());
    writer.close();
    CSVReader reader = new CSVReader(new FileReader(file));
    List<String[]> list = reader.readAll();
    reader.close();
    assertEquals(4, list.size());
}

@Test
public void customWriterDefaultReader() throws Exception {
    File file = new File("./tmp/testing.csv");
    file.getParentFile().mkdirs();
    CSVWriter writer = new CSVWriter(new BufferedWriter(new FileWriter(file)), ',', '"', '\\');
    writer.writeAll(getTestData());
    writer.close();
    CSVReader reader = new CSVReader(new FileReader(file));
    List<String[]> list = reader.readAll();
    reader.close();
    assertEquals(4, list.size());
}

@Test
public void defaultWriterCustomReader() throws Exception {
    File file = new File("./tmp/testing.csv");
    file.getParentFile().mkdirs();
    CSVWriter writer = new CSVWriter(new BufferedWriter(new FileWriter(file)));
    writer.writeAll(getTestData());
    writer.close();
    CSVReader reader = new CSVReader(new FileReader(file), ',', '"', '\0');
    List<String[]> list = reader.readAll();
    reader.close();
    assertEquals(4, list.size());
}

@Test
public void CustomWriterCustomReader() throws Exception {
    File file = new File("./tmp/testing.csv");
    file.getParentFile().mkdirs();
    CSVWriter writer = new CSVWriter(new BufferedWriter(new FileWriter(file)), ',', '"', '\\');
    writer.writeAll(getTestData());
    writer.close();
    CSVReader reader = new CSVReader(new FileReader(file), ',', '"', '\\');
    List<String[]> list = reader.readAll();
    reader.close();
    assertEquals(4, list.size());
}

private List<String[]> getTestData() {
    List<String[]> list = new ArrayList<String[]>();
    list.add(new String[] {"quote\"", "escape\\", "normal"});
    list.add(new String[] {"double \"quote\"", "middle \\escape", "regular"});
    list.add(new String[] {"typical", "end escape\\", "ordinary"});
    list.add(new String[] {"one", "two", "three"});
    return list;
}

}

Discussion

  • Michael Peterson

    Since the OpenCSV project seems to be defunct and non-responsive for 2+ years, I've created a "reboot" of it, called simplecsv, with some differences. I tried your tests above against it. simplecsv passes the tests, as long as you specify allowUnbalancedQuotes() (one of the options I added) for the third test above. I added those tests to the simplecsv CsvReaderTest. You can take a look here: https://github.com/quux00/simplecsv

     
    • Matthew

      Matthew - 2013-12-03

      Very nice, thank you. I'll take a look at it.

      On Tue, Dec 3, 2013 at 12:20 AM, Michael Peterson quux444@users.sf.net wrote:

      Since the OpenCSV project seems to be defunct and non-responsive for
      2+years, I've created a "reboot" of it, called simplecsv, with some
      differences. I tried your tests above against it. simplecsv passes the
      tests, as long as you specify allowUnbalancedQuotes() (one of the options I
      added) to the third test above. I added those tests to the simplecsv
      CsvReaderTest. You can take a look here:
      https://github.com/quux00/simplecsv



  • Patricia Goldweic

    I have a CSV file which may contain unbalanced quotes within cells. I was hoping that I could use simplecsv to handle the parsing. However, when a cell contains a single double quote, the parser (which I've configured with the 'allowUnbalancedQuotes' option) merges the cell with the rest of the cells in the line.
    Specific example: (uses separator character: '|', output prints out the array of tokens for the given line)

    input: blah|this is a long name for this" record|blah2
    output: blah, this is a long name for this" record|blah2

    correct output: blah, this is a long name for this" record, blah2

    Is this happening because of the same problem underlying this bug report? (symptoms are similar). Is there a fix available? If not, is there a way to correctly parse the file using simplecsv? Thanks in advance for looking into this.

     
  • Scott Conway

    Scott Conway - 2015-09-01

    For the input you are showing, you need to escape the inside quote, either with the escape character (usually \) or with an extra double quote.
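
A minimal sketch of the two escaping styles mentioned above, applied to the field from the example input. The class and method names are hypothetical and no CSV library is involved; it only shows what the escaped text would look like, and it assumes the field is also wrapped in double quotes, since some parsers only honor escapes inside a quoted field.

```java
public class EscapeQuoteSketch {

    // Hypothetical helper: wrap the field in double quotes and double every
    // embedded double quote.
    public static String quoteByDoubling(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Hypothetical helper: wrap the field in double quotes and put a backslash
    // in front of every embedded backslash or double quote.
    public static String quoteWithBackslash(String field) {
        return "\"" + field.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String field = "this is a long name for this\" record";
        // prints: blah|"this is a long name for this"" record"|blah2
        System.out.println("blah|" + quoteByDoubling(field) + "|blah2");
        // prints: blah|"this is a long name for this\" record"|blah2
        System.out.println("blah|" + quoteWithBackslash(field) + "|blah2");
    }
}
```

Whether a given reader accepts either form depends on how its quote and escape characters are configured.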

     
  • Scott Conway

    Scott Conway - 2016-01-18
    • status: open --> closed-invalid
    • assigned_to: Scott Conway
     
  • Scott Conway

    Scott Conway - 2016-01-18

    Sorry it took so long to get to these; I am still catching up. Fortunately Jaakov created #127, which is basically the same issue.

    The problem is that when you were creating the file you created the field "escape\\", which when it is written out becomes escape\, so what you have done is inadvertently escaped the comma.
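
To make the mismatch concrete, here is a rough, self-contained sketch. It is not the opencsv code: `writeLine` and `parse` are hypothetical stand-ins. They assume a writer that wraps each field in double quotes and doubles embedded quotes but leaves backslashes alone, and a reader that treats the given escape character as escaping the next quote or escape character inside a quoted field.

```java
import java.util.ArrayList;
import java.util.List;

public class BackslashEscapeSketch {

    // Hypothetical writer: quote every field, double embedded quotes,
    // leave backslashes untouched.
    public static String writeLine(String[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) sb.append(',');
            sb.append('"').append(row[i].replace("\"", "\"\"")).append('"');
        }
        return sb.toString();
    }

    // Hypothetical single-line reader; pass '\0' to disable the escape character.
    public static List<String> parse(String line, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean hasNext = i + 1 < line.length();
            if (inQuotes && escape != '\0' && c == escape && hasNext
                    && (line.charAt(i + 1) == '"' || line.charAt(i + 1) == escape)) {
                cur.append(line.charAt(++i));   // escaped character taken literally
            } else if (c == '"') {
                if (inQuotes && hasNext && line.charAt(i + 1) == '"') {
                    cur.append('"');            // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes;       // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(cur.toString());     // separator outside quotes ends the field
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = writeLine(new String[] {"typical", "end escape\\", "ordinary"});
        System.out.println(line);                          // "typical","end escape\","ordinary"
        System.out.println(parse(line, '\\').size());      // 2: the closing quote got escaped
        System.out.println(parse(line, '\0').size());      // 3: no escape character, fields intact
    }
}
```

In this sketch the trailing backslash swallows the closing quote of its field, so the separator that follows is read as data and the neighbouring cells merge. Disabling the escape character on the reader side, as the defaultWriterCustomReader test does with '\0', avoids that for this row.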

    Good luck.

    Scott Conway :)

     
