Sorry, I guess I was a few days late in responding to Sunil's questions, so the
context was lost.
A few days ago I had posted some Java code that worked around some bugs in the
JDBC driver and/or underlying DB support for storing ISCII text in varchar or
text fields in SQL Server or Access.
The basic problem was that when the driver read characters in the range 128..255
(i.e. ISCII text), for some reason the high-order bit was sign-extended all
the way across, so that all ISCII chars were being represented as Unicode chars
in the range #FF80 - #FFFF.
So my code basically clips out the extra #FF, then converts the ISCII
representation to the standard Unicode codes for Devanagari.
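
For anyone who missed the earlier post, here is a minimal sketch of the clipping
step described above. It is not the code I posted before; the class name
IsciiClip, the method names and the caller-supplied mapping table are all
made up for illustration. It assumes every char at or above #FF80 is a
sign-extended ISCII byte, which would be wrong for data that legitimately
contains Unicode chars in that range. It also assumes a one-to-one table, so
ISCII sequences that need more than one byte of context are not handled.

```java
import java.util.Map;

// Hypothetical helper: undoes the sign extension, then applies a lookup table.
public final class IsciiClip {
    private IsciiClip() {}

    /** Maps chars in #FF80..#FFFF back to #80..#FF; leaves other chars alone. */
    public static String clip(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            // Assumption: anything this high is a sign-extended ISCII byte.
            out.append(c >= 0xFF80 ? (char) (c & 0xFF) : c);
        }
        return out.toString();
    }

    /**
     * Clips, then replaces each char found in the table.
     * The table (ISCII byte value -> Unicode char) is supplied by the caller;
     * chars without an entry pass through unchanged.
     */
    public static String toUnicode(String raw, Map<Character, Character> table) {
        String clipped = clip(raw);
        StringBuilder out = new StringBuilder(clipped.length());
        for (int i = 0; i < clipped.length(); i++) {
            char c = clipped.charAt(i);
            Character mapped = table.get(c);
            out.append(mapped != null ? mapped : c);
        }
        return out.toString();
    }
}
```

You would call clip() or toUnicode() on each string as it comes back from the
ResultSet, before doing anything else with it.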
In reality, if I wanted to be as clean as possible, I should store all ISCII text
in binary fields, since the DBMS / driver doesn't seem to be meant to
support non-ASCII text encodings. But that would take away a lot of necessary
searching, sorting and indexing facilities...
So basically the take-home point is that this is an example of the kinds of
difficulties one can have when storing and retrieving Indic text in standard DBs,
and that is another possible thing we should take a look at...
Does anyone have any experience of whether these kinds of problems happen with
other DBs (MySQL, PostgreSQL, Oracle, etc.)?
--Tapan
----- Original Message -----
From: "Joseph Koshy" <jk...@Fr...>
To: "Tapan S. Parikh" <ta...@ya...>
Sent: Wednesday, February 06, 2002 4:23 PM
Subject: Re: [Indic-computing-devel] Notes
>
>
> Dear Tapan,
>
> > But really my main concern was to bring peoples attention to the
> > kinds of hassles that may come with using non-ascii char
> > representations in standard dbms packages and middleware, which is
> > another hurdle we will have to overcome...
>
> What kinds of hassles? Could you provide some context for your email
> to the list please?
>
> Regards,
> Koshy
> <jk...@fr...>