How to process a CSV with soft returns in it

Help
Mark H.
2013-05-24
2013-05-28
  • Mark H.

    Mark H. - 2013-05-24

    I'm trying to read a CSV file that falls well outside RFC 4180: it contains unquoted soft returns, written as bare LF characters, inside fields. Reading it by hand, the two can be told apart, because each row ends with CRLF while the multiline fields are broken only at LF. Is there a way to read this without writing my own implementation of the CSV reader? I have been instructed to avoid pre-processing the file, and I cannot change the file itself, because it is generated by another system and delivered to us via FTP.

     
  • James Bassett

    James Bassett - 2013-05-24

    Hi Mark,

    Can you attach an example of the CSV file please? Super CSV is able to handle both \r\n and \n as line terminators, so it should work. Have you tested it?

    Cheers,
    James

     
  • Mark H.

    Mark H. - 2013-05-28

    James,

    I have tested it through JUnit test cases and it fails - I've attached the CSV. I had a conference with the supplier of the CSV about the RFC standards, and they're trying to accommodate my requests. In the meantime, I still have to deal with the issue. The attachment is a sample taken from the CSV, and you'll need to open it in Notepad++ or a similar app to see the difference between the line endings. The problem occurs when trying to process line 5 (OrgID number 7). I know the problem is that it's a badly formatted file: it uses LF characters in the middle of a column, and CRLF to separate rows of data.

    Thanks!

    -Mark
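
One possible workaround for the format described above, sketched in plain Java: treat only CRLF as the end of a record and keep any bare LF as part of the field data. This is a minimal sketch, not a Super CSV feature. The class `CrlfRecordReader`, its methods, the `Notes` column, and the sample data are all hypothetical, since the attached file is not reproduced here. The column split is deliberately naive and assumes fields contain no quoted commas. It reads from a stream, so the file on disk is never modified.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Super CSV.
class CrlfRecordReader {

    // Splits the input into records on CRLF only.
    // A bare LF (or a bare CR) is kept as data inside the current record.
    static List<String> readRecords(Reader in) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean pendingCr = false; // true when the previous char was CR
        int c;
        while ((c = in.read()) != -1) {
            if (pendingCr) {
                pendingCr = false;
                if (c == '\n') {
                    // CR followed by LF: end of record
                    records.add(current.toString());
                    current.setLength(0);
                    continue;
                }
                current.append('\r'); // lone CR: keep it as data
            }
            if (c == '\r') {
                pendingCr = true;
            } else {
                current.append((char) c); // includes bare LF (soft return)
            }
        }
        if (pendingCr) {
            current.append('\r');
        }
        if (current.length() > 0) {
            records.add(current.toString()); // last record without trailing CRLF
        }
        return records;
    }

    // Naive column split: assumes no quoted fields and no commas inside fields.
    static List<String> splitColumns(String record) {
        return Arrays.asList(record.split(",", -1));
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: rows end in CRLF, the second row has a soft return (LF).
        String sample = "OrgID,Notes\r\n7,first line\nsecond line\r\n8,single line\r\n";
        for (String record : readRecords(new StringReader(sample))) {
            System.out.println(splitColumns(record));
        }
    }
}
```

If the real file also uses quoted fields, the naive `splitColumns` step would need to be replaced by a proper CSV parser applied to each record.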