#153 Additional pass at non-ascii support

Status: open
Priority: 5
Updated: 2004-07-28
Created: 2004-07-27
Creator: Dave
Private: No

Updates to support non-ASCII platforms, but maintain the
unicodeToAscii translations in the log/script classes that
translate any Unicode character in column/table names, and in
CHAR and derivative table columns, to a locale-independent
7-bit ASCII representation, which allows database file
portability. The suggested changes should preserve the main
objective of keeping files compatible across ASCII-derivative
locales.

In general, the mechanisms for writing to the .script and .log
files did not take advantage of the BufferedWriter and
OutputStreamWriter classes, which perform the appropriate
character translations in order to remain platform independent.
Also, specific Unicode-to-ASCII and vice versa conversions were
being done, as well as invalid casting of (char) to (byte),
which was causing failures on EBCDIC platforms.
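
To illustrate the cast problem described above, here is a minimal sketch (not code from the patch). The class and method names are hypothetical. It contrasts a narrowing (char) to (byte) cast, which simply keeps the low 8 bits of the UTF-16 code unit, with charset-aware encoding. The IBM037 lookup is guarded because that EBCDIC code page is only present in JREs that ship the extended charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of HSQLDB.
class CharCastDemo {
    // Narrowing cast: keeps only the low 8 bits of the UTF-16 code unit,
    // independent of any platform encoding.
    static byte castToByte(char c) {
        return (byte) c;
    }

    // Charset-aware encoding of a single character (first encoded byte).
    static byte encode(char c, Charset cs) {
        return String.valueOf(c).getBytes(cs)[0];
    }

    public static void main(String[] args) {
        System.out.printf("cast:     0x%02X%n", castToByte('A') & 0xFF);
        System.out.printf("US-ASCII: 0x%02X%n",
                encode('A', StandardCharsets.US_ASCII) & 0xFF);
        // Assumption: IBM037 stands in for "an EBCDIC platform" here.
        if (Charset.isSupported("IBM037")) {
            System.out.printf("IBM037:   0x%02X%n",
                    encode('A', Charset.forName("IBM037")) & 0xFF);
        }
    }
}
```

The cast and the US-ASCII encoding agree for 7-bit characters, while an EBCDIC code page maps the same character to a different byte, so code that mixes raw casts with platform-encoded bytes ends up with inconsistent data.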

Changes made include naming the ASCII code page to be used when
reading and writing the log/script files. This is accomplished
by specifying the "ASCII" character set name on construction of
the OutputStreamWriter and InputStreamReader. This occurs in
ScriptReaderText and ScriptWriterText. With this addition, there
is no longer a need to use the asciiToUnicode and unicodeToAscii
functions currently in the codebase.
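
As a rough sketch of the approach just described (not the patched HSQLDB classes), the following wraps in-memory streams with an OutputStreamWriter and InputStreamReader constructed with the "ASCII" charset name. The class name AsciiRoundTrip and its methods are hypothetical, and in-memory streams are used instead of files only to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical demo class; not part of HSQLDB.
class AsciiRoundTrip {
    // Encode one line through a writer that names the "ASCII" charset.
    static byte[] write(String line) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new BufferedWriter(new OutputStreamWriter(out, "ASCII"))) {
            w.write(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decode the bytes again through a reader that names the same charset.
    static String read(byte[] bytes) {
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), "ASCII"))) {
            return r.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the charset is named explicitly, the bytes produced do not depend on the platform default encoding. One point worth checking during review: OutputStreamWriter replaces characters that the charset cannot represent with the charset's replacement byte ('?' for ASCII), so text outside 7-bit ASCII would still need some form of escaping before it reaches the writer.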

Specific changes were as follows:

org.hsqldb.lib.HsqlByteArrayOutputStream
- writeBytes(String)
  - Replaced buf[count++] = (byte)s.charAt(i) with
    System.arraycopy to copy the String bytes into the byte
    buffer
- Added write(char) method to replace write(int) when
  writing chars to the buffer.
On the EBCDIC platform, casting (char) to (byte) causes invalid
characters to be written to the files. The suggested solution
here may not be optimal performance-wise.
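
The writeBytes change could look roughly like the sketch below. This is a stand-alone, hypothetical class (ByteBufferSketch), not the actual HsqlByteArrayOutputStream. The description above does not say which encoding the patch uses when obtaining the String bytes, so the explicit US-ASCII charset here is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a growable byte buffer; not the HSQLDB class.
class ByteBufferSketch {
    private byte[] buf = new byte[16];
    private int count;

    // Encode the whole String once, then bulk-copy it into the buffer
    // instead of casting each char to a byte.
    void writeBytes(String s) {
        // Assumption: US-ASCII is named explicitly.
        byte[] encoded = s.getBytes(StandardCharsets.US_ASCII);
        ensureRoom(encoded.length);
        System.arraycopy(encoded, 0, buf, count, encoded.length);
        count += encoded.length;
    }

    // Char-typed overload so callers do not go through write(int).
    void write(char c) {
        writeBytes(String.valueOf(c));
    }

    byte[] toByteArray() {
        return Arrays.copyOf(buf, count);
    }

    // Grow the backing array when the next write would not fit.
    private void ensureRoom(int extra) {
        if (count + extra > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, count + extra));
        }
    }
}
```

The extra allocation in getBytes is the likely source of the performance concern mentioned above; an encoder that writes directly into the buffer would avoid it at the cost of more code.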

org.hsqldb.rowio.RowOutputTextLog
- writeChar(String, int)
  - Removed unicodeToAscii translation
  - Now writing String.getBytes()
- writeString(String)
  - Removed unicodeToAscii translation
  - Now writing String.getBytes()
With the use of Readers/Writers, explicit translation need not
be done.

org.hsqldb.scriptio.ScriptReaderText
- readLoggedStatement()
  - Removed unicodeToAscii translation
With the use of Readers/Writers, explicit translation need not
be done.

org.hsqldb.scriptio.ScriptWriterText
- Added imports for java.io.BufferedWriter and
  java.io.OutputStreamWriter
- Added instance variable BufferedWriter writer;
- Added close() method to close writer
- openFile()
  - Now creating a BufferedWriter on an OutputStreamWriter on a
    FileOutputStream
- Added method writeWithWriter() as a common method for writing
  output
- writeRow(int, Table, Object[])
  - Now calling new writeWithWriter method
- writeLogStatement(String, int)
  - Now calling new writeWithWriter method
- writeDeleteStatement(int, Table, Object[])
  - Now calling new writeWithWriter method
- writeSequenceStatement(int, NumberSequence)
  - Now calling new writeWithWriter method
Taking advantage of BufferedWriter's specific designation of the
"ASCII" character set.
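
The writer stack listed above might be sketched as follows. TextScriptWriterSketch is a hypothetical class that only borrows the method names openFile, writeWithWriter and close from the list. The signatures, the newline handling and the roundTrip helper are assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not the HSQLDB ScriptWriterText class.
class TextScriptWriterSketch {
    private BufferedWriter writer;

    // BufferedWriter on OutputStreamWriter on FileOutputStream, with the
    // "ASCII" charset named on the OutputStreamWriter.
    void openFile(String path) {
        try {
            writer = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(path), "ASCII"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Single place through which every statement is written.
    void writeWithWriter(String statement) {
        try {
            writer.write(statement);
            writer.write('\n'); // assumption: one statement per line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Closing the writer flushes it and closes the underlying stream.
    void close() {
        if (writer == null) {
            return;
        }
        try {
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer = null;
        }
    }

    // Illustration helper: write statements to a temp file, read them back.
    static String roundTrip(String... statements) {
        try {
            Path tmp = Files.createTempFile("sketch", ".script");
            try {
                TextScriptWriterSketch w = new TextScriptWriterSketch();
                w.openFile(tmp.toString());
                for (String s : statements) {
                    w.writeWithWriter(s);
                }
                w.close();
                return new String(Files.readAllBytes(tmp), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Routing writeRow, writeLogStatement and the other write methods through one helper keeps the charset decision in a single place.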

org.hsqldb.scriptio.ScriptWriterBase
- sync()
  - Checking for fileStreamOut == null. It will be null with the
    ScriptWriterText changes.
- close()
  - Checking for fileStreamOut == null. It will be null with the
    ScriptWriterText changes.
Changes to accommodate BufferedWriter.
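
A minimal sketch of the null checks described above, using a hypothetical WriterBaseSketch class. Only the field name fileStreamOut comes from the description. The boolean return values are an assumption added so the behaviour is easy to observe.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; not the HSQLDB ScriptWriterBase class.
class WriterBaseSketch {
    // Stays null when a subclass writes through its own Writer instead.
    protected FileOutputStream fileStreamOut;

    // Returns false when there is no stream to sync.
    boolean sync() {
        if (fileStreamOut == null) {
            return false;
        }
        try {
            fileStreamOut.flush();
            fileStreamOut.getFD().sync(); // force the data to disk
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns false when there is no stream to close.
    boolean close() {
        if (fileStreamOut == null) {
            return false;
        }
        try {
            fileStreamOut.close();
            fileStreamOut = null;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```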

We are very pleased that HSQLDB is provided as a pure Java
database, where we can take advantage of the platform
independence and non-reliance on large-scale database vendors.
It is very important to our product base to be able to embed the
pure Java database and have it work across the multitude of
platforms supported. It would be ideal for us not to have to
maintain separate codebases and synchronize our changes in order
to support non-ASCII platforms. We would like you to consider
reviewing and integrating our changes, of course with our
correspondence as needed, into the core HSQLDB.

Regards,
Group One Software

Discussion

  • Dave

    Dave - 2004-07-27

    Patched files for non-ascii platform support.

     
  • Fred Toussi

    Fred Toussi - 2004-07-28

    Thanks,

    All sounds good and detailed.

    I will review and hopefully incorporate your code prior to any
    major code changes in the relevant classes.

     
  • Fred Toussi

    Fred Toussi - 2004-07-28
    • assigned_to: dedmike --> fredt
     
