#126 Single " in line results in different reading to excel/open office

Group: v1.0 (example)
Status: closed-fixed
Labels: None
Priority: 5
Updated: 2017-03-04
Created: 2016-02-04
Creator: Code Buddy
Private: No

I've got a couple of files that open fine in Excel and OpenOffice, with each line of the attached CSV files read into a new row and cell.

In example1.csv, opencsv reads only the first 2 of the 5 lines. In example2.csv, it reads all the lines but combines lines 3, 4 and 5 into a single value.

Here's the code I'm using to do this:

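    // Needs java.io.File, java.io.FileReader, java.io.IOException,
    // java.net.URISyntaxException, java.util.List and the opencsv CSVReader.
    public void testExample2() throws IOException, URISyntaxException
    {
        final File file = new File(this.getClass().getResource("/example2.csv").toURI().getPath());
        // try-with-resources closes the reader even if readAll() throws
        try (CSVReader reader = new CSVReader(new FileReader(file)))
        {
            final List<String[]> all = reader.readAll();

            int row = 1;
            for (String[] strings : all)
            {
                // col is declared outside the inner loop so that it advances with each cell
                int col = 1;
                for (String cell : strings)
                {
                    System.out.println("[" + row + "][" + col + "]: " + cell);
                    ++col;
                }
                ++row;
            }
        }
    }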
2 Attachments

Discussion

  • Code Buddy

    Code Buddy - 2016-02-04

    The outputs are:

    example1.csv:

    [1][1]: line1
    [2][1]: line2
    

    example2.csv:

    [1][1]: line1
    [2][1]: line2
    [3][1]: line3
    line4
    line5
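
One plausible mechanism for the merged rows above is a reader that treats every quote character as opening or closing a quoted section, and keeps appending physical lines to the current record while a quote is open. The sketch below illustrates that behaviour only. It is not opencsv's implementation, the class and method names are hypothetical, and the input is made up because the attached files are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OpenQuoteMergeSketch {

    /** Groups physical lines into records, continuing a record while a quote is open. */
    static List<String> readRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        boolean inQuotes = false;
        for (String line : lines) {
            if (pending.length() > 0) {
                pending.append('\n'); // the line break becomes part of the value
            }
            pending.append(line);
            for (int i = 0; i < line.length(); i++) {
                if (line.charAt(i) == '"') {
                    inQuotes = !inQuotes; // every quote flips the state
                }
            }
            if (!inQuotes) {
                // No quote is open at the end of the line, so the record is complete.
                records.add(pending.toString());
                pending.setLength(0);
            }
        }
        if (pending.length() > 0) {
            // Input ended while a quote was still open: emit what was collected.
            records.add(pending.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical input: an opening quote on line 3 that is closed on line 5.
        List<String> lines = Arrays.asList("line1", "line2", "\"line3", "line4", "line5\"");
        List<String> records = readRecords(lines);
        System.out.println(records.size() + " records");
        System.out.println(records.get(2));
    }
}
```

With this made-up input, five physical lines become three records, and the third record spans line3 through line5. The sketch keeps the quote characters in the value; a real parser would normally strip them.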
    
     
  • Code Buddy

    Code Buddy - 2016-02-04

    Screenshots showing how the files load in Excel.

     
  • Code Buddy

    Code Buddy - 2016-02-04

    If you need any other info just shout!

     
  • Tushar Shinde

    Tushar Shinde - 2016-06-21

    We faced a similar issue while parsing a CSV file that has a double quote (") as part of a value,
    for example ABC 26" DE, as shown below:

    6/16/2016,6/16/2016,XYX,ABC 26" DE,0,PQR,1.00,187

    We were using the parseLine method of the CSVParser class (au.com.bytecode.opencsv.CSVParser). We made a small modification to this method and it worked. The code snippet is below. The two changes are:

    1) inQuotes = !inQuotes; is moved down, to after the if (!strictQuotes) block.
    2) inQuotes = true; is added in the branch that handles an embedded quote.

    Let us know if this works for you.

        for (int i = 0; i < nextLine.length(); i++) {

            char c = nextLine.charAt(i);
            if (c == this.escape) {
                if (isNextCharacterEscapable(nextLine, (inQuotes && !ignoreQuotations) || inField, i)) {
                    sb.append(nextLine.charAt(i + 1));
                    i++;
                }
            } else if (c == quotechar) {
                if (isNextCharacterEscapedQuote(nextLine, (inQuotes && !ignoreQuotations) || inField, i)) {
                    sb.append(nextLine.charAt(i + 1));
                    i++;
                } else {
                    // inQuotes = !inQuotes; --> moved down, to after the if (!strictQuotes) block
    
                    // the tricky case of an embedded quote in the middle: a,bc"d"ef,g
                    if (!strictQuotes) {
                        if (i > 2 //not on the beginning of the line
                                && nextLine.charAt(i - 1) != this.separator //not at the beginning of an escape sequence
                                && nextLine.length() > (i + 1) &&
                                nextLine.charAt(i + 1) != this.separator //not at the   end of an escape sequence
                                ) {
    
                            if (ignoreLeadingWhiteSpace && sb.length() > 0 && isAllWhiteSpace(sb)) {
                                sb = new StringBuilder(INITIAL_READ_SIZE);  //discard white space leading up to quote
                            } else {
                                sb.append(c);
                                // Force inQuotes to true so that the toggle after this
                                // block leaves it false: the embedded quote is kept as
                                // data and does not open a quoted section.
                                inQuotes = true;
                            }
    
                        }
                    }
                    inQuotes = !inQuotes;
                }
                inField = !inField;
            } else if (c == separator && !(inQuotes && !ignoreQuotations)) {
                tokensOnThisLine.add(sb.toString());
                sb = new StringBuilder(INITIAL_READ_SIZE); // start work on next token
                inField = false;
            } else {
                if (!strictQuotes || (inQuotes && !ignoreQuotations)) {
                    sb.append(c);
                    inField = true;
                }
            }
        }
    
     

    Last edit: Tushar Shinde 2016-06-21
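
To make the idea behind the modification above concrete, here is a minimal, self-contained sketch. It is not the opencsv parseLine code. EmbeddedQuoteSketch and splitLine are hypothetical names, and the rule applied here is a simplifying assumption for illustration: a quote opens a quoted section only at the start of a field, closes one only right before a separator or the end of the line, and is otherwise kept as data.

```java
import java.util.ArrayList;
import java.util.List;

final class EmbeddedQuoteSketch {

    /** Splits one line into fields, keeping a quote in the middle of a field as data. */
    static List<String> splitLine(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                boolean atFieldStart = sb.length() == 0 && !inQuotes;
                boolean atFieldEnd = i + 1 == line.length() || line.charAt(i + 1) == separator;
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    sb.append(quote); // doubled quote inside a quoted field is one literal quote
                    i++;
                } else if (atFieldStart) {
                    inQuotes = true;  // opening quote of a quoted field
                } else if (inQuotes && atFieldEnd) {
                    inQuotes = false; // closing quote of a quoted field
                } else {
                    sb.append(c);     // embedded quote: keep it, do not change state
                }
            } else if (c == separator && !inQuotes) {
                fields.add(sb.toString());
                sb.setLength(0);      // start work on the next field
            } else {
                sb.append(c);
            }
        }
        fields.add(sb.toString());    // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        // The sample line from the comment above.
        String line = "6/16/2016,6/16/2016,XYX,ABC 26\" DE,0,PQR,1.00,187";
        List<String> fields = splitLine(line, ',', '"');
        System.out.println(fields.size() + " fields, fourth field: " + fields.get(3));
    }
}
```

On the sample line, the quote after 26 stays inside the fourth field and the following commas still act as separators, so the line splits into eight fields.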
  • Scott Conway

    Scott Conway - 2016-06-21
    • assigned_to: Scott Conway
     
  • Scott Conway

    Scott Conway - 2016-06-21

    I am looking at creating an RFC 4180 compliant reader for Bug #127, and I believe it will fix this issue as well. At the very least, I will use the two examples you provided as additional test data.
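
For context on what RFC 4180 compliance means for quotes: the RFC says that a field containing a comma, a double quote or a line break should be enclosed in double quotes, and that a double quote inside such a field is escaped by doubling it. The sketch below shows the writing side of that rule. It is an illustration only, not opencsv code, and Rfc4180QuoteSketch, quoteField and joinRecord are hypothetical names.

```java
final class Rfc4180QuoteSketch {

    /** Returns the field as it should appear in an RFC 4180 style record. */
    static String quoteField(String value) {
        boolean needsQuotes = value.indexOf(',') >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value; // plain fields are written unchanged
        }
        // Double every embedded quote, then enclose the whole field in quotes.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Joins fields into one record, quoting each field where required. */
    static String joinRecord(String... fields) {
        String[] quoted = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            quoted[i] = quoteField(fields[i]);
        }
        return String.join(",", quoted);
    }

    public static void main(String[] args) {
        // A value with a lone quote, like the one reported in this thread.
        System.out.println(joinRecord("XYX", "ABC 26\" DE", "0"));
    }
}
```

Under this rule a value such as ABC 26" DE is written as "ABC 26"" DE". A lone quote inside an unquoted field falls outside the RFC's grammar, which is why tolerant tools and stricter parsers can read the same file differently.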

     
  • Aditya

    Aditya - 2016-12-06

    I am still facing this issue with CSVReader. Are there any plans to fix this?

     
    • Scott Conway

      Scott Conway - 2016-12-13

      Yes - it is still in the works. Just slow going with family and holidays.

  • Scott Conway

    Scott Conway - 2017-03-04
    • status: open --> closed-fixed
     
  • Scott Conway

    Scott Conway - 2017-03-04

    Fixed with the RFC4180Parser in the 3.9 release.

     
