The default constructors for CSVWriter and CSVReader are inconsistent
The default constructors for CSVWriter and CSVReader are inconsistent in their default escape character: CSVWriter uses '"', whereas CSVReader uses '\'. This means that if you write a CSV file with the default writer and the data contains a backslash, the backslash is written as a single backslash without an escape. On reading, the default reader interprets that backslash as an escape character and returns invalid data.
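To make the failure mode concrete, here is a toy model in plain Java. It does not call opencsv and is not how the library is implemented; the class EscapeMismatchDemo and its two helpers are hypothetical names, and the escape handling is deliberately simplified (the escape character is dropped and the next character is kept literally).

```java
// Toy model only: illustrates a writer/reader escape mismatch, not opencsv internals.
class EscapeMismatchDemo {

    // Toy writer: wraps the field in quotes and doubles embedded quotes,
    // i.e. it behaves as if the escape character were '"'.
    // Backslashes pass through unchanged.
    static String writeField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Toy reader: strips the surrounding quotes, then treats `escape` as
    // "drop this character and keep the next one literally".
    static String readField(String quoted, char escape) {
        String body = quoted.substring(1, quoted.length() - 1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == escape && i + 1 < body.length()) {
                i++;                          // skip the escape character
                out.append(body.charAt(i));   // keep the escaped character
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String original = "C:\\temp\\data";   // the text C:\temp\data
        String written = writeField(original);

        System.out.println("written:           " + written);
        System.out.println("read, escape '\\':  " + readField(written, '\\'));
        System.out.println("read, escape '\"':  " + readField(written, '"'));
    }
}
```

In this model, reading with the backslash as the escape character yields C:tempdata, while reading with the same escape character the writer used returns the original C:\temp\data.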
Hello Jon - I am marking this one as won't fix as it is a bug that is not a bug but is really a bug.... This situation was created in the early days by conflicting requests, requirements, and bug entries. Any change to correct one issue would cause others to raise bugs because their code, written against the previous behavior, broke.
People complain to me about being too stuck on backwards compatibility, but when you reach the point where you cannot upgrade a library because the newer version breaks your current code, and you don't have the time/manpower/inclination to modify your code for the new version, you truly appreciate backwards compatibility. But that is okay, because we have coded ways to get you the desired consistency.
First and foremost, do NOT use the default constructors. Use the CSVWriterBuilder and CSVReaderBuilder to create the CSVWriter and CSVReader, which gives you more control over the settings. The builders were created when we got to the point that our objects had a dozen constructors to set all the possible parameters.
If consistency is important to you, then use the CSVParserBuilder (if you want the original openCSV logic) or the RFC4180ParserBuilder (if you want to follow the current specification) to create a parser, and pass that same parser into both the CSVWriterBuilder (which will then create a CSVParserWriter) and the CSVReaderBuilder to ensure you get consistent settings.
If you are relatively new to openCSV I would recommend you use the RFC4180ParserBuilder to build a parser, just using defaults, and pass that into the CSVWriterBuilder and CSVReaderBuilder.
I hope that helps.
Scott :)
That makes good sense. I have been using opencsv for quite some time and haven't looked at the documentation to realize that I should move from the constructors to the builders. I'll be updating my code to use them or the parser builder that you recommend. I too have compatibility to worry about, so I need to look carefully at the parser builder and make sure I won't have problems with existing data files.
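It won't be that difficult. For simple apps it can be a direct replacement. There are a couple of examples at https://opencsv.sourceforge.net/#configuration or you can clone the repo and look at the unit tests. But basically replace your CSVReader constructor call with a CSVReaderBuilder that is given the parser via withCSVParser, and the CSVWriter constructor call with a CSVWriterBuilder that is given the same parser via withParser, assigning the result to an ICSVWriter.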
Note the change from CSVWriter to the interface ICSVWriter. That is because the CSVWriterBuilder can return either the original CSVWriter (which is giving you the headache) or a CSVParserWriter, which uses a parser to handle the writing. Pass the same or a similarly constructed parser to both builders and your problem will be solved.
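A minimal sketch of what that replacement could look like, assuming a recent opencsv (5.x) on the classpath. The class SharedParserRoundTrip and the roundTrip helper are made-up names for illustration, and the in-memory StringWriter/StringReader stand in for whatever files you actually use.

```java
import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;
import com.opencsv.CSVWriterBuilder;
import com.opencsv.ICSVParser;
import com.opencsv.ICSVWriter;
import com.opencsv.RFC4180ParserBuilder;

import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;

// Hypothetical example class: writes one row and reads it back with the same parser.
class SharedParserRoundTrip {

    static String[] roundTrip(String[] row) throws Exception {
        // One parser, built with defaults, shared by the writer and the reader.
        ICSVParser parser = new RFC4180ParserBuilder().build();

        // Instead of: new CSVWriter(writer)
        StringWriter buffer = new StringWriter();
        try (ICSVWriter writer = new CSVWriterBuilder(buffer)
                .withParser(parser)
                .build()) {
            writer.writeNext(row);
        }

        // Instead of: new CSVReader(reader)
        try (CSVReader reader = new CSVReaderBuilder(new StringReader(buffer.toString()))
                .withCSVParser(parser)
                .build()) {
            return reader.readNext();
        }
    }

    public static void main(String[] args) throws Exception {
        // The middle field contains backslashes, the case from the original report.
        String[] row = {"id", "C:\\temp\\data", "say \"hi\""};
        System.out.println(Arrays.toString(roundTrip(row)));
    }
}
```

If you need the original openCSV behavior rather than RFC 4180, swapping RFC4180ParserBuilder for CSVParserBuilder should be the only change, as long as both builders still receive the same parser.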
Hope that helps :)