DbaseFile international character reading incorrect
IO/DbaseFileReader.cs:168
tempObject = new string(sbuffer).Trim().Replace("\0","");
is wrong when decoding strings that contain
international characters outside the printable ASCII
range.
Should be:
tempObject = System.Text.UTF7Encoding.UTF7.GetString(sbuffer).Trim().Replace("\0","");
Keep up the good work!!
Stijn
Logged In: YES
user_id=948130
I think you miss the point: the real problem is that the
stream is read as UTF-8, not the conversion that happens afterwards.
The (proven!) solution is:
IO/DbaseFileReader.cs:50
_dbfStream = new BinaryReader(stream);
change to:
_dbfStream = new BinaryReader(stream, System.Text.Encoding.Default);
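To illustrate why the encoding passed to BinaryReader matters: without an explicit encoding, BinaryReader decodes characters as UTF-8, so a single byte above 0x7F from a single-byte code page is not read as the intended character. Below is a minimal sketch, not the library's code. The class and method names are hypothetical, and ISO-8859-1 is only an assumed code page for the example; a real DBF file may use a different one. Note also that on .NET Core and .NET 5+ Encoding.Default is UTF-8, so there an explicit encoding would be needed instead.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of DbaseFileReader.
public static class BinaryReaderEncodingDemo
{
    // Reads `count` characters from `data` using a BinaryReader
    // constructed with the given encoding.
    public static string ReadField(byte[] data, int count, Encoding encoding)
    {
        using (var reader = new BinaryReader(new MemoryStream(data), encoding))
        {
            return new string(reader.ReadChars(count));
        }
    }
}
```

For example, the bytes 0x43 0x61 0x66 0xE9 read with Encoding.GetEncoding("iso-8859-1") give "Café", because each byte maps to exactly one character in that code page.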
Logged In: YES
user_id=168425
I think the following code has fixed the problem.
// When reading strings, use the encoding classes; otherwise
// some non-ASCII characters cause a problem.
Encoding asciiEncoding = Encoding.Default;
Encoding unicodeEncoding = Encoding.Unicode;
byte[] unicodeBytes = Encoding.Convert(asciiEncoding, unicodeEncoding, asciiBytes);
char[] unicodeChars = new char[unicodeEncoding.GetCharCount(unicodeBytes, 0, unicodeBytes.Length)];
unicodeEncoding.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0);
string newString = new string(unicodeChars);
tempObject = newString.TrimEnd();
break;
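As a side note, the convert-to-Unicode-then-GetChars sequence above should give the same result as decoding the raw bytes directly with the source encoding. A minimal sketch of that shorter form follows, assuming the field's raw bytes are available as a byte array. The class name, method name, and parameters are hypothetical and not taken from DbaseFileReader.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of DbaseFileReader.
public static class DbfStringDecoder
{
    // Decodes the raw field bytes with the given encoding and trims
    // trailing whitespace, as the posted code does with TrimEnd().
    public static string DecodeField(byte[] fieldBytes, Encoding encoding)
    {
        return encoding.GetString(fieldBytes).TrimEnd();
    }
}
```

In the posted code this would correspond to something like tempObject = DbfStringDecoder.DecodeField(asciiBytes, Encoding.Default); keep in mind that TrimEnd() with no arguments removes whitespace but not "\0" characters.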