
#72 Allow double quote as escape character AND as quote character

closed
None
5
2015-12-08
2015-09-21
Philippe
No

In RFC4180 (https://tools.ietf.org/html/rfc4180), it is stated :

If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:

   "aaa","b""bb","ccc"

So I created a CSVReader :
CSVReader csvReader = new CSVReader(new FileReader('C:\CSV_MetaData.csv'), ';' as char, '"' as char, '"' as char)

But I got the exception :

java.lang.UnsupportedOperationException: The separator, quote, and escape characters must be different!

Could you support having the double quote as the escape character AND as the quote character?

Thanks.

Related

Feature Requests: #72

Discussion

  • Scott Conway

    Scott Conway - 2015-09-28
    • status: open --> closed
    • assigned_to: Scott Conway
     
  • Scott Conway

    Scott Conway - 2015-09-28

    I am marking this as closed because, regardless of what you set the quote and escape values to, two back-to-back quote characters (whichever quote character you define) are treated as an escaped quote, per the spec. The test below is proof of that.

    @Test
    public void testADoubleQuoteAsDataElement() throws IOException {

        String[] nextLine = csvParser.parseLine("a,\"\"\"\",c");// a,"""",c
    
        assertEquals(3, nextLine.length);
    
        assertEquals("a", nextLine[0]);
        assertEquals(1, nextLine[1].length());
        assertEquals("\"", nextLine[1]);
        assertEquals("c", nextLine[2]);
    
    }
    
    So the standard CSVReader defaults should give you what you are looking for.
    
    If not, file a defect and email me back, and I will look at creating a test that matches your exact situation.
    
    Sincerely.
    
    Scott Conway :)
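
The doubled-quote rule described in this reply can be sketched without opencsv at all. The following is a minimal illustration of the RFC 4180 rule only, not opencsv's implementation; the class name `Rfc4180LineSketch` and the method `parseLine` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: a tiny RFC 4180-style line splitter, not opencsv code. */
final class Rfc4180LineSketch {

    /** Splits one line; inside a quoted field, a doubled quote yields one literal quote. */
    static List<String> parseLine(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                        field.append(quote); // doubled quote -> one literal quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // a single quote closes the field
                    }
                } else {
                    field.append(c);         // everything else, backslash included, is data
                }
            } else if (c == quote) {
                inQuotes = true;             // opening quote
            } else if (c == separator) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        if (inQuotes) {
            throw new IllegalArgumentException("Un-terminated quoted field: " + line);
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("\"aaa\",\"b\"\"bb\",\"ccc\"", ',', '"')); // [aaa, b"bb, ccc]
        System.out.println(parseLine("a,\"\"\"\",c", ',', '"'));               // [a, ", c]
    }
}
```

Note that this sketch has no escape character: the quote character alone covers both roles, which is what the original request was asking for.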
    
     
  • Tomáš Kraut

    Tomáš Kraut - 2015-12-07

    Hi Scott,
    what about the following use case?
    I want to use no escape character at all, and in particular to allow a backslash (or any character other than the quote) as a regular character at the end of a quoted field.

    @Test
    public void testEscapeCharacterAtTheEndOfField() throws IOException {
        // CSVParser otherParser = new CSVParser(',', '"', '"'); // java.lang.UnsupportedOperationException: The separator, quote, and escape characters must be different!
        // CSVParser otherParser = new CSVParser(',', '"'); // java.io.IOException: Un-terminated quoted field at end of CSV line
        CSVParser otherParser = new CSVParser(',', '"', '\0'); // OK, but weird: what if I want a null character there?
    
        String[] nextLine = otherParser.parseLine("a,\"b\\\",c");// a,"b\",c
        assertEquals("a", nextLine[0]);
        assertEquals(2, nextLine[1].length());
        assertEquals("b\\", nextLine[1]);
        assertEquals("c", nextLine[2]);
    }
    

    With Regards
    Tomáš Kraut

     

    Last edit: Tomáš Kraut 2015-12-07
    • Scott Conway

      Scott Conway - 2015-12-07

      I am not sure I fully understand the question, if it is a question, but
      yes - opencsv does allow you to set the escape character to the null
      character (character zero), and as long as there is no null character in
      your CSV file you are okay.

      By the standards we have a separator, a quotation character, and an escape
      character; they are not something you can turn off. But you can set the
      escape to an "impossible" character, hence the null, if your file has
      backslashes and you don't want them treated as escapes.

      Hope that helps.

      Sincerely

      Scott Conway :)
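
To make the effect of the escape setting concrete, here is a small sketch of why `a,"b\",c` fails with a backslash escape but parses once the escape is set to a character that never occurs in the data. It is a simplified model, not opencsv's code: the class `EscapeCharSketch` and the method `quotesBalanced` are hypothetical, and the rule that an escape only applies before a quote or another escape character is an assumption made for this illustration.

```java
/** Illustration only: models how an escape character affects quote matching. */
final class EscapeCharSketch {

    /** Returns true if every quoted field in the line is closed under the given settings. */
    static boolean quotesBalanced(String line, char quote, char escape) {
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean hasNext = i + 1 < line.length();
            char next = hasNext ? line.charAt(i + 1) : '\0';

            if (inQuotes && c == escape && hasNext && (next == quote || next == escape)) {
                i++; // assumed rule: the escape consumes the following quote or escape
            } else if (inQuotes && c == quote && hasNext && next == quote) {
                i++; // doubled quote: a literal quote, the field stays open
            } else if (c == quote) {
                inQuotes = !inQuotes; // opening or closing quote
            }
        }
        return !inQuotes;
    }

    public static void main(String[] args) {
        String line = "a,\"b\\\",c"; // the raw text a,"b\",c
        System.out.println(quotesBalanced(line, '"', '\\')); // false: \" swallows the closing quote
        System.out.println(quotesBalanced(line, '"', '\0')); // true: backslash is ordinary data
    }
}
```

With the backslash as escape, the closing quote of the second field is consumed, so the field never terminates; with `'\0'` as escape, nothing in the line matches it and the quote closes the field normally.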


  • Tomáš Kraut

    Tomáš Kraut - 2015-12-08

    Hi Scott,
    thanks for the explanation.
    I am still a little confused, because from the standard (https://tools.ietf.org/html/rfc4180) I learned about the separator and the quote character, but not about an escape character.
    So I assumed the escape character might be optional. More specifically, from quickly reading the project homepage (http://opencsv.sourceforge.net/) I did not find an escape character mentioned at all, so I did not expect that a file with a backslash before the closing quote mark would fail to parse.
    For my use case (and hopefully for almost all use cases I could come up with), the null character as the escape character is OK.

    Tomáš
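
The point that RFC 4180 needs only a separator and a quote character can also be seen from the writing side. The sketch below is a minimal illustration under that reading of the RFC; `Rfc4180WriterSketch`, `quoteField`, and `toLine` are hypothetical names and are not part of opencsv.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

/** Illustration only: writes fields using just a separator and a quote character. */
final class Rfc4180WriterSketch {

    /** Encloses the value in quotes and doubles any quote inside it. */
    static String quoteField(String value, char quote) {
        String q = String.valueOf(quote);
        return q + value.replace(q, q + q) + q;
    }

    /** Joins quoted fields with the separator; no escape character is involved. */
    static String toLine(char separator, char quote, String... values) {
        return Arrays.stream(values)
                .map(v -> quoteField(v, quote))
                .collect(Collectors.joining(String.valueOf(separator)));
    }

    public static void main(String[] args) {
        System.out.println(toLine(',', '"', "aaa", "b\"bb", "ccc")); // "aaa","b""bb","ccc"
        System.out.println(toLine(',', '"', "a", "b\\", "c"));       // "a","b\","c"
    }
}
```

A trailing backslash is written unchanged, so a file produced this way can legitimately contain `"b\"`, which is the input that motivated the question.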

     
    • Scott Conway

      Scott Conway - 2015-12-08

      I need to go brush up on the standards because you may be right about the
      escape character (though I thought RFC 4180 mentioned escaping a double
      quote with another double quote). The backslash may just be a relic of how
      most computer programs parse non-printable characters. That said, it is an
      integral part of opencsv that you cannot turn off; instead you can set it
      to null or some other character you know will not be in your data.

      Scott :)


