#10 DbaseFileReader - reading international characters

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2004-01-27
Created: 2004-01-27
Creator: Anonymous
Private: No

DbaseFile international character reading incorrect

IO/DbaseFileReader.cs:168

tempObject = new string(sbuffer).Trim().Replace("\0","");

is wrong when decoding strings that contain
international characters outside the printable ASCII
range.

Should be:

tempObject = System.Text.UTF7Encoding.UTF7.GetString(sbuffer).Trim().Replace("\0","");

Keep up the good work!!

Stijn

Discussion

  • Logged In: YES
    user_id=948130

    I think you miss the point; the real problem is that the
    bytes are read as UTF-8, not the conversion that follows.
    The (proven!) solution is:

    IO/DbaseFileReader.cs:50

    _dbfStream = new BinaryReader(stream);

    change to:

    _dbfStream = new BinaryReader(stream, System.Text.Encoding.Default);
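
To illustrate the point about the reader's encoding, here is a minimal sketch, not the project's code. The class name, the sample bytes, and the choice of ISO-8859-1 are assumptions made for illustration; a real DBF file may use a different code page. `BinaryReader(Stream)` decodes characters as UTF-8 unless an encoding is passed, and `Encoding.Default` is the system ANSI code page on .NET Framework but UTF-8 on .NET Core and later, so the sketch names the encoding explicitly.

```csharp
using System;
using System.IO;
using System.Text;

class ReaderEncodingSketch
{
    static void Main()
    {
        // Illustrative field bytes: "Café" in ISO-8859-1 (é is the single
        // byte 0xE9), padded with two spaces.
        byte[] field = { 0x43, 0x61, 0x66, 0xE9, 0x20, 0x20 };

        // 0xE9 followed by a space is not a valid UTF-8 sequence, so decoding
        // as UTF-8 substitutes the replacement character U+FFFD.
        string asUtf8 = Encoding.UTF8.GetString(field);
        Console.WriteLine(asUtf8.IndexOf('\uFFFD') >= 0);

        // Assumption: the file was written in ISO-8859-1.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        using (var reader = new BinaryReader(new MemoryStream(field), latin1))
        {
            // With a single-byte encoding, one byte maps to exactly one char.
            string value = new string(reader.ReadChars(field.Length)).TrimEnd();
            Console.WriteLine(value == "Caf\u00E9");
        }
    }
}
```

Passing the encoding to the reader keeps the rest of the parsing code unchanged, which is presumably why a one-line change was enough here.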

     
  • Andrew Coats
    2004-10-09

    Logged In: YES
    user_id=168425

    I think the following code has fixed the problem.

    // when reading strings - use the encoding stuff otherwise
    // some characters cause a problem. e.g. character
    Encoding asciiEncoding = Encoding.Default;
    Encoding unicodeEncoding = Encoding.Unicode;
    byte[] unicodeBytes = Encoding.Convert(asciiEncoding, unicodeEncoding, asciiBytes);
    char[] unicodeChars = new char[unicodeEncoding.GetCharCount(unicodeBytes, 0, unicodeBytes.Length)];
    unicodeEncoding.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0);
    string newString = new string(unicodeChars);
    tempObject = newString.TrimEnd();
    break;
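
The conversion above could probably be shortened. .NET strings are UTF-16 internally, so calling `GetString` on the source encoding should give the same result as converting the bytes to `Encoding.Unicode` and then reading the chars back. The following is a sketch under that assumption; `DecodeField`, the sample bytes, and the ISO-8859-1 code page are hypothetical and not taken from the project.

```csharp
using System;
using System.Text;

class DecodeFieldSketch
{
    // Hypothetical helper: decodes a fixed-width field and strips the
    // trailing space and NUL padding.
    static string DecodeField(byte[] raw, Encoding encoding)
    {
        return encoding.GetString(raw).TrimEnd(' ', '\0');
    }

    static void Main()
    {
        // Illustrative bytes: "München" in ISO-8859-1 (ü is 0xFC), followed
        // by one space and one NUL byte of padding.
        byte[] raw = { 0x4D, 0xFC, 0x6E, 0x63, 0x68, 0x65, 0x6E, 0x20, 0x00 };

        // Assumption: the field was written in ISO-8859-1.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        Console.WriteLine(DecodeField(raw, latin1) == "M\u00FCnchen");
    }
}
```

Trimming both spaces and NUL characters in one call also covers the `Replace("\0","")` step from the original report, at least for padding at the end of the field.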