Thread: [Firebird-net-provider] Character set troubles


Hi Carlos, Oleg,

I've subscribed from my primary email now, to better keep up
with the discussions.

Disclaimer paragraph (if reading disclaimers bores you, please
skip): 1. I must restate that my exposure to .NET is rather limited
(only writing small test programs, only using Mono, and speed-reading
tons of articles on MSDN). 2. I'll have a busy week with my day job,
and if any time is left for FB, I'll have to look into the "index key
too big (174)" issue, as I'm more competent there.

Carlos wrote:
> the supported character sets are "hardcoded" 
> into the .net data provider, if you have a custom 
> character set and it's supported by .net you need 
> to modify the .net provider source for add
> the character set definition and build the sources,
[followed by a long list of supported charsets]

O.K., I don't know how to say it diplomatically, but I strongly
suspect that all the effort put into client-side charset conversion
is wasted in the long run (and maybe dangerous in the few cases
where FB and the CLR disagree about a conversion). It may be a
necessary evil for now, if always connecting using Unicode cannot
be made to work with FB 1.5.

I wrote:
>> (a) why should the user be given the option to connect with different
>> character sets, as it in effect this will only gives multiple code
>> paths in
>> the data provider, without benefit for the user.

Carlos wrote:
> I'm not sure if i understand well this, sorry, at this moment 
> you can give one character set on the connection string for 
> make the connection, field character set are handled internally.

Why letting the user specify the connection charset is a
strange approach:

The field character sets are (hopefully) chosen by the
person designing the database to match the business
requirements:
1. Character repertoire needed
2. Efficiency
3. Availability of collations
4. Compatibility with legacy apps or data.

The character set used in the .NET app is fixed: it's
always UTF-16 (correct me if I got this wrong). Is there
any valid reason to manipulate the data inside the
.NET app in any other representation?

So given the field charset F and the connection charset C,
there will be the transformations
F => C => UTF-16
and 
UTF-16 => C => F

So using different Cs doesn't buy you anything.
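
To illustrate the F => C => UTF-16 chain, here is a minimal
sketch in Python rather than .NET. It is not provider code:
the function name is hypothetical, Python's built-in codecs
stand in for the FB and CLR conversion tables, and a Python
str stands in for the UTF-16 string inside the app.

```python
def via_connection_charset(text, field_charset, conn_charset):
    """Simulate F => C => native string for one stored value.

    Hypothetical helper for illustration only; Python codecs
    stand in for the server-side and client-side converters.
    """
    stored = text.encode(field_charset)        # bytes as stored in the field (F)
    in_field = stored.decode(field_charset)    # server interprets the F bytes
    on_wire = in_field.encode(conn_charset)    # F => C; raises if C lacks a character
    return on_wire.decode(conn_charset)        # C => native string on the client


value = "Łódź"  # representable in ISO 8859-2

# A Unicode connection charset carries the value unchanged.
print(via_connection_charset(value, "iso8859_2", "utf_8"))

# A narrower connection charset cannot carry it at all.
try:
    via_connection_charset(value, "iso8859_2", "latin_1")
except UnicodeEncodeError as exc:
    print("lost in F => C:", exc.reason)
```

Under these assumptions the choice of C never improves the
result: it either leaves the value as it was or rejects it.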

In theory, C should be some flavour of Unicode,
but as your tests seem to indicate, this doesn't
always work.

The only other reasonable alternative would be
C == F, but then you have the client-side transcoding.
And it won't work with different F's.
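
A small Python sketch of the "different F's" problem, under
the assumption that Python's cp1250 and cp1251 codecs behave
like the corresponding FB charsets. The field names, sample
values, and helper function are made up for illustration.

```python
def usable_connection_charsets(fields, candidates):
    """Return the candidate connection charsets that can carry every field.

    fields maps a (hypothetical) column name to (sample value, field charset).
    """
    usable = []
    for conn in candidates:
        try:
            for text, field_charset in fields.values():
                # Value as stored in the field, then converted to the connection charset.
                text.encode(field_charset).decode(field_charset).encode(conn)
        except UnicodeEncodeError:
            continue  # this C cannot represent at least one field
        usable.append(conn)
    return usable


fields = {
    "CITY_PL": ("Łódź", "cp1250"),    # Central European field
    "CITY_RU": ("Москва", "cp1251"),  # Cyrillic field
}

print(usable_connection_charsets(fields, ["cp1250", "cp1251", "utf_8"]))
```

With these two sample fields, neither field charset works as
C for both columns; only the Unicode candidate does.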

So at this time, I cannot offer a truly optimistic perspective.

In the long term it must become possible to connect
using UTF-16, and then the .NET data provider
can be significantly streamlined.

As for what you can do in the meantime, I'm not competent
to comment.

Regards,
Peter Jacobi
