CSVReader readNext() is not reading further rows when I have...
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import com.opencsv.CSVParser;
import com.opencsv.CSVParserBuilder;
import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;
import com.opencsv.exceptions.CsvMalformedLineException;
public class test {
    public static final String END_OF_PAGE = "- End of Page -";

    public static void main(String a[]) {
        String path = "D:\\Issue\\file1.txt";
        try {
            FileReader fr = new FileReader(new File(path));
            CSVParser parser = new CSVParserBuilder()
                    .withSeparator('\t')
                    .withIgnoreQuotations(false)
                    .build();
            CSVReader reader = new CSVReaderBuilder(new BufferedReader(fr))
                    .withSkipLines(0)
                    .withCSVParser(parser)
                    .build();
            String[] colHeaders = reader.readNext();
            String[] currentDataRow = reader.readNext();
            int i = 1;
            while (i == 2) {
                try {
                    System.out.println("assigning to next row");
                    currentDataRow = reader.readNext();
                } catch (CsvMalformedLineException e) {
                    try {
                        currentDataRow = reader.readNext();
                    } catch (CsvMalformedLineException e2) {
                        try {
                            currentDataRow = reader.readNext();
                        } catch (CsvMalformedLineException e3) {
                            System.out.println("Error reading file inside a loop and continuing with the rest");
                            currentDataRow = reader.readNext();
                            continue;
                        }
                    }
                }
                i = i + 1;
            }
        } catch (Exception e) {
            System.out.println("exception ");
        }
    }
}
I'm trying to read a file in which one of the records has an unclosed double quote. In case of failure, I would like to skip that row and continue with the rest, but readNext() always gives me the same row and does not advance the cursor position. I'm using the latest opencsv, 4.6.
Please find the file in the attachments.
Please advise.
Anil - this is working as designed and your attached file is a perfect example of why we cannot read ahead after such a failure.
In opencsv, and in most CSV libraries, a single field can span multiple lines as long as it is enclosed in quotes. So when a field starts with a quote, we keep reading until we reach either the closing quote or the end of the line. If we reach the end of the line and there is no closing quote, we assume this is a multi-line field and read the next line. If we read the entire file, or hit our line limit, and there is still no closing quote, we throw the exception.
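The rule described above can be sketched in a few lines of plain Java. This is a simplified illustration, not opencsv's actual implementation: the class and method names (`MultiLineRecordSketch`, `readRecords`, `hasOpenQuote`) are hypothetical, and it assumes that an odd number of double quotes means a quoted field is still open (so doubled quotes work, but backslash escapes are ignored).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of "keep reading lines until the quote is closed".
final class MultiLineRecordSketch {

    /** Groups physical lines into logical records; fails if a quote is never closed. */
    static List<String> readRecords(List<String> physicalLines) {
        List<String> records = new ArrayList<>();
        Iterator<String> lines = physicalLines.iterator();
        while (lines.hasNext()) {
            String firstLine = lines.next();
            StringBuilder record = new StringBuilder(firstLine);
            // An open quote means the field continues on the next physical line.
            while (hasOpenQuote(record)) {
                if (!lines.hasNext()) {
                    // End of input reached with the quote still open: nothing is left to read.
                    throw new IllegalStateException(
                            "Unterminated quoted field starting at: " + firstLine);
                }
                record.append('\n').append(lines.next());
            }
            records.add(record.toString());
        }
        return records;
    }

    /** Assumption: an odd number of double quotes means a quoted field is still open. */
    static boolean hasOpenQuote(CharSequence text) {
        int quotes = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 == 1;
    }

    public static void main(String[] args) {
        // Row 1 holds a legitimate two-line field; row 2 has a quote that is never closed.
        List<String> lines = Arrays.asList(
                "id\tnote", "1\t\"two", "lines\"", "2\t\"never closed", "3\tfine");
        try {
            System.out.println(readRecords(lines));
        } catch (IllegalStateException e) {
            // By now the row after the bad one has already been consumed as part of the field.
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In the sample input, the row after the unclosed quote is swallowed into the open field before the failure is detected, which is the situation described in this thread.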
We cannot keep reading after that, for one of two reasons: 1) If we are using the defaults, we are at the end of the file (which was your situation). 2) If the line limit is set, we don't know whether the record was malformed and we read other legitimate rows as part of our line, or whether it failed because the data had just one more row than our limit, in which case reading that leftover row as a new record would produce corrupt data.
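The ambiguity in the second case can be made concrete with a small sketch. It is an illustration under stated assumptions, not opencsv code: `LineLimitSketch`, `Attempt`, and `readRecord` are hypothetical names, the limit is modelled as "at most N physical lines per record", and quote parity stands in for real parsing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: a reader that gives up after `limit` physical lines per record.
final class LineLimitSketch {

    /** Outcome of one read attempt: the lines consumed and whether the quotes balanced. */
    static final class Attempt {
        final List<String> consumed;
        final boolean complete;

        Attempt(List<String> consumed, boolean complete) {
            this.consumed = consumed;
            this.complete = complete;
        }
    }

    static Attempt readRecord(Iterator<String> lines, int limit) {
        List<String> consumed = new ArrayList<>();
        int quotes = 0;
        while (lines.hasNext() && consumed.size() < limit) {
            String line = lines.next();
            consumed.add(line);
            quotes += countQuotes(line);
            if (quotes % 2 == 0) {
                return new Attempt(consumed, true); // all quotes closed: record is complete
            }
        }
        return new Attempt(consumed, false); // limit or end of input hit with a quote open
    }

    static int countQuotes(String line) {
        int quotes = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes;
    }

    public static void main(String[] args) {
        // Case A: a stray quote on row 1 swallows the legitimate row 2.
        Iterator<String> strayQuote =
                Arrays.asList("1\t\"oops", "2\tgood row", "3\tanother").iterator();
        // Case B: a legitimate three-line field that is one line longer than the limit.
        Iterator<String> longField =
                Arrays.asList("1\t\"first", "second", "third\"", "2\tok").iterator();

        for (Iterator<String> input : Arrays.asList(strayQuote, longField)) {
            Attempt failed = readRecord(input, 2);
            Attempt next = readRecord(input, 2);
            System.out.println("failed after consuming " + failed.consumed
                    + ", next read complete=" + next.complete + " " + next.consumed);
        }
    }
}
```

Both inputs fail in the same way after two lines, so the reader cannot tell them apart. Continuing works only by accident in case A (at the cost of silently losing a row) and produces a corrupt record in case B.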
So in short, if you see that exception, either you have hit the end of the file and the data in the buffer is malformed, or you are in an unrecoverable state where continuing to read would only make the situation worse.
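Since the reader cannot recover once the exception is thrown, one possible workaround is to screen the raw lines before they reach the parser. The sketch below is only reasonable under a strong assumption that is not stated in this thread: the data never contains legitimate multi-line fields, so every record is exactly one physical line. The names `QuoteScreenSketch`, `hasBalancedQuotes`, and `keepBalancedLines` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-filter: drop physical lines whose double quotes do not balance.
// Only valid if every record is known to fit on a single physical line.
final class QuoteScreenSketch {

    static boolean hasBalancedQuotes(String line) {
        int quotes = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 == 0;
    }

    /** Returns the lines that look well formed; suspicious lines are added to `rejected`. */
    static List<String> keepBalancedLines(List<String> lines, List<String> rejected) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (hasBalancedQuotes(line)) {
                kept.add(line);
            } else {
                rejected.add(line); // keep the bad rows for logging or manual repair
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rejected = new ArrayList<>();
        List<String> kept = keepBalancedLines(
                Arrays.asList("id\tnote", "1\t\"ok\"", "2\t\"never closed", "3\tfine"),
                rejected);
        System.out.println("kept: " + kept);
        System.out.println("rejected: " + rejected);
    }
}
```

The kept lines could then be joined with newlines and handed to the parser through any `Reader`, for example a `StringReader`. If the data can contain real multi-line fields, this filter would wrongly reject their first and last lines, so it should not be used in that case.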
Scott :)
Thank you very much, Scott. Great explanation.