From: <no...@so...> - 2002-01-13 15:22:15
Bugs item #499390, was opened at 2002-01-04 05:56
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=104091&aid=499390&group_id=4091

Category: IO
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Vladimir Kornyshev (gnuzzz)
>Assigned to: James Macgill (jmacgill)
Summary: Troubles with data encoding

Initial Comment:
In package uk.ac.leeds.ccg.dbffile: when you read a row from a
dbf file, you do something like this (for example, method
GrabNextDbfRec() in class Dbf):

    private StringBuffer GrabNextDbfRec()throws java.io.IOException{
        StringBuffer record = new StringBuffer(rec_size+numfields);
        for(int i=0;i< rec_size;i++){
            // we could do some checking here.
            record.append((char)dFile.readUnsignedByte());
        }
        return record;
    }

That's wrong, because when you convert a byte into a char and
your data isn't in "ISO-8859-1" encoding (but, for example, in
"Cp1251"), your data will be lost. You need to do something like
this:

    private StringBuffer GrabNextDbfRec()throws java.io.IOException{
        ByteArrayOutputStream baos = new ByteArrayOutputStream(rec_size+numfields);
        for(int i = rec_size - 1; i >= 0; i--){
            // we could do some checking here.
            baos.write(dFile.readUnsignedByte());
        }
        return new StringBuffer(baos.toString("Cp1251"));
    }

Best regards,
Kornyshev Vladimir

----------------------------------------------------------------------

>Comment By: James Macgill (jmacgill)
Date: 2002-01-13 07:22

Message:
Logged In: YES
user_id=9731

I have figured out what needed to be changed and duplicated
the modification for GrabNextDbfRec and other methods.

This may have fixed your problem; could you try your files
again and let me know if there is still a problem?

James

----------------------------------------------------------------------

Comment By: James Macgill (jmacgill)
Date: 2002-01-09 13:40

Message:
Logged In: YES
user_id=9731

There were some modifications made a year ago to support
2-byte character sets so that it could read Japanese dbf files.
These are not ISO-8859-1, but still read fine.

However, the changes do not appear to have been made to every
method in DbfFile; notably, GrabNextDbfRec does not use the
modification.

I am not sure myself exactly what was involved in the
modification, but I will contact the original author and see
if the same patch should be applied to the GrabNextDbfRec
method as well.

Many thanks for bringing this up,

James

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=104091&aid=499390&group_id=4091
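
For illustration, the difference between casting each byte to a char and decoding the bytes with an explicit charset can be reproduced outside the dbffile package. This is only a minimal sketch, not the patch that was committed: the class name DbfDecodeSketch and both method names are hypothetical, the sample bytes are assumed to be a Cp1251-encoded Russian word, and "windows-1251" is used as the canonical Java name for Cp1251.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; not part of uk.ac.leeds.ccg.dbffile.
class DbfDecodeSketch {

    // Mirrors the reported approach: every byte becomes one char,
    // which is equivalent to assuming ISO-8859-1.
    static String decodeByCast(byte[] raw) {
        StringBuilder record = new StringBuilder(raw.length);
        for (byte b : raw) {
            record.append((char) (b & 0xFF)); // same value as readUnsignedByte()
        }
        return record.toString();
    }

    // Mirrors the suggested approach: collect the raw bytes first,
    // then decode them with the charset the file was written in.
    static String decodeWithCharset(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Assumed sample: a six-letter Russian word encoded in Cp1251.
        byte[] raw = {(byte) 0xCF, (byte) 0xF0, (byte) 0xE8,
                      (byte) 0xE2, (byte) 0xE5, (byte) 0xF2};
        String expected = "\u041f\u0440\u0438\u0432\u0435\u0442";
        Charset cp1251 = Charset.forName("windows-1251");

        System.out.println(decodeByCast(raw).equals(expected));              // prints false
        System.out.println(decodeWithCharset(raw, cp1251).equals(expected)); // prints true
    }
}
```

In a real reader the charset would presumably be a parameter rather than hard-coded, since dbf files are written in many encodings.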