
#72 CSVReader does not read GZIP streams correctly



I have run into some very odd behavior in CSVReader when it is used with GZIP files: it reads more data into the first field of each line than it should.

More specifically, say that we have a GZIP file, named "archive.tar.gz". A reader for that file can be constructed as

Reader isReader = new InputStreamReader(new GZIPInputStream(new FileInputStream(new File("archive.tar.gz"))));

or even as a BufferedReader, like

BufferedReader bufReader = new BufferedReader(isReader), where isReader is the object defined above.

Now, when I wrap either isReader or bufReader in a CSVReader and call readNext() from the command line of my Linux machine (actually a CentOS 5.6 VMware VM), the result is a String[] with the correct number of fields and the correct values everywhere except the first field, where, strangely, the value that should have been read is prefixed with the filename and some other magic information such as user and group names.

For example, if the first line of the file were

"This, is, a, standard, CSV, line" (without the quotes)

in a file called "archive.tar.gz", owned by the user:group pair someuser:somegroup,

the first field read in by readNext() is not "This", as one would expect, but actually
"archive.tar.gz<numbers...>someuser somegroupThis".

This behavior does not take place when I run the code from within Eclipse, where the expected value ("This") is returned.

In either case, the remaining fields are read correctly.

Is this a bug?


  • Scott Conway

    Scott Conway - 2011-07-24

    Received issue 3376587 from the author. The issue was a mistake: the author was trying to read a .tar.gz file instead of a .gz file.

  • Scott Conway

    Scott Conway - 2011-07-24
    • status: open --> closed-invalid
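
For anyone who hits the same symptom, here is a minimal sketch of why a .tar.gz produces it. GZIPInputStream removes only the gzip layer, so what reaches the reader is the tar stream, which begins with a 512-byte header holding the member name, numeric fields, and the owner's user and group names. The class TarGzDemo, its helper methods, and the member name "data.csv" are hypothetical and exist only for illustration. The header is heavily simplified (most fields and the checksum are left empty), so it is not a valid archive for real tar tools.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class TarGzDemo {
    // Copies an ASCII string into a fixed-offset field of a tar header block.
    static void put(byte[] block, int offset, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(bytes, 0, block, offset, bytes.length);
    }

    // Builds gzip(simplified tar header + CSV text) in memory.
    // Offsets follow the ustar layout: name at 0, magic at 257,
    // user name at 265, group name at 297. Everything else stays NUL.
    static byte[] fakeTarGz(String csvText) throws IOException {
        byte[] header = new byte[512];
        put(header, 0, "data.csv");      // hypothetical member name
        put(header, 257, "ustar");
        put(header, 265, "someuser");
        put(header, 297, "somegroup");
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(raw)) {
            gz.write(header);
            gz.write(csvText.getBytes(StandardCharsets.US_ASCII));
        }
        return raw.toByteArray();
    }

    // Reads the first text line exactly as a line-oriented parser would see it.
    static String firstLine(byte[] gzBytes) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzBytes)),
                StandardCharsets.ISO_8859_1))) {
            return reader.readLine();
        }
    }

    // Drops the NUL padding and returns the text before the first comma.
    static String firstField(String line) {
        return line.replace("\0", "").split(",")[0];
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = fakeTarGz("This, is, a, standard, CSV, line\n");
        // The header bytes contain no line terminator, so they are glued
        // onto the front of the first CSV line.
        System.out.println(firstField(firstLine(archive)));
    }
}
```

Under these assumptions the first field comes back as the header text followed by "This", while the remaining fields are unaffected, which matches the report. Reading a plain .gz file avoids the problem. For a real .tar.gz, the tar layer would need to be unpacked first, for example with a tar-aware stream such as TarArchiveInputStream from Apache Commons Compress.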
