Replicating this issue requires a large data file; the one I discovered the issue with contains 13,229 lines. I have not included the data because the file is 2.43 MB and contains copyrighted material, but any sufficiently large file should yield the same result.
I construct a CSVReader and read data with it like this:
import au.com.bytecode.opencsv.CSVReader; // com.opencsv.CSVReader in newer opencsv releases
import java.io.FileInputStream;
import java.io.InputStreamReader;

CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream("file.csv"), "ISO-8859-1"), ';');
String[] line;
int count = 0;
reader.readNext(); // skip the header line
while ((line = reader.readNext()) != null)
{
    System.out.println(line[0]);
    count++;
}
The first line is expected to be dropped, as it contains the field specification and not data. But that does not explain why count is 5,358 after the code has run, when there are 13,228 lines of data in the file.
What is the error you are getting? If it is an out-of-memory error, then what you can do instead of creating a CSVReader is create a CSVParser and then, using a BufferedReader, read the file line by line yourself and pass the individual lines to the CSVParser.
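A minimal sketch of that approach, assuming the older au.com.bytecode.opencsv package and the same file name and encoding as in the report above:

import au.com.bytecode.opencsv.CSVParser; // com.opencsv.CSVParser in newer releases
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;

CSVParser parser = new CSVParser(';');
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream("file.csv"), "ISO-8859-1")))
{
    String raw;
    int count = 0;
    br.readLine(); // skip the header line
    while ((raw = br.readLine()) != null)
    {
        String[] fields = parser.parseLine(raw); // parse one physical line as one record
        System.out.println(fields[0]);
        count++;
    }
    System.out.println(count + " data lines parsed");
}

Note that parseLine treats each physical line as one record; if quoted fields can contain embedded line breaks, parseLineMulti is needed instead. That distinction also matters for the count itself, since CSVReader.readNext() will merge physical lines when it thinks a quoted field spans them, which can make the record count smaller than the line count.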
Does it always fail at 5,358? If so, is there something about line 5,359, such as an incorrect number of columns or a badly formatted field? Try increasing the amount of memory available to your application and see if that helps.
If not, please give me an idea of the format of the file. I know there are over 13,000 lines and the columns are delimited with semicolons, but how many columns are there?
Closed for no response to question.
I faced the same issue and found the cause as well: it was happening because of a null character in the middle of a record. After removing the \x00 (null) character with Notepad++, I was able to process my file successfully.
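For anyone who would rather not edit the file by hand, here is a minimal sketch of the same cleanup done in Java before parsing; the input and output file names are placeholders:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;

try (BufferedReader in = new BufferedReader(
         new InputStreamReader(new FileInputStream("file.csv"), "ISO-8859-1"));
     PrintWriter out = new PrintWriter("file-clean.csv", "ISO-8859-1"))
{
    String raw;
    while ((raw = in.readLine()) != null)
    {
        out.println(raw.replace("\0", "")); // strip embedded \x00 (null) characters
    }
}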