From: Knut S. <knu...@se...> - 2001-06-05 17:43:07
Christoph Neumann wrote:
> On Tue, 5 Jun 2001, Knut Sander wrote:
>
> > > I chose the directoryString (UTF-8 format) type since it should allow the
> > > international characters. However, when I try to insert the string 'im
> > > k=E4ppele 8' I get the error:
> > >
> > > apuhomestreet: value #0 contains invalid data
> >
> > Hi Christoph,
> >
> > did you encode your data as a UTF8 string? The 0xe4 above (ä = ae) looks
> > like you are trying to add a latin1 string, but this is not a legal UTF8
> > byte sequence.
>
> Hm...that seems to be correct. I checked the output from the "debug".
> This is the string I am sending to the server:
> 0040 04 13: STRING = 'apuhomestreet'
> 004F 31 14: SET {
> 0051 04 12: STRING
> 0053 : 69 6D 20 6B E4 70 70 65 6C 65 20 38 __ __ __ __ im k.ppele 8
> 005F : }

OK, a single E4 byte for ä is latin1; UTF8 needs 2 bytes for this character.

> Any recommendation on which encoding I should use in LDAP to support
> international characters? Is UTF8 really the way to go?

I think it is the way =)

> If UTF8 is the way to go, how should I go about converting data that is
> in iso-8859-1 to UTF8? A quick search on CPAN turned up
> "Unicode::MapUTF8" and "use utf8" pragma in perl 5.7. Anyone have
> experience with either of these?
>
> Also, where might I find good documentation on how these character sets
> are defined?

I used Unicode::String. Take a look at the example in
perldoc Unicode::String; it works well and is easy to handle.

I have had good experiences building the en/decoding into the
application-specific LDAP layer (you always need one for larger
applications =)

'use utf8' in perl 5.6/5.7 may now do this job on the fly, but I have not
played with it yet, because I can't use 5.6 on production systems at the
moment =(. Some pointers on this would be welcome =)

- Knut
__________________________________________
SecureNet GmbH - http://www.secure-net.de/
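
A minimal sketch of the latin1-to-UTF8 conversion discussed in this message, written in Python purely for illustration (the thread itself is about Perl and Unicode::String, whose API is not reproduced here). The helper names latin1_to_utf8 and is_valid_utf8 are hypothetical; the input bytes are the twelve bytes from the debug dump in this message.

```python
# Sketch only: re-encode ISO-8859-1 (latin1) bytes as UTF-8 before
# handing them to an LDAP client. Helper names are made up for this example.

def latin1_to_utf8(raw: bytes) -> bytes:
    """Interpret raw as ISO-8859-1 and return the same text encoded as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The twelve bytes from the debug dump: 'im k<E4>ppele 8' in latin1.
street_latin1 = bytes.fromhex("69 6D 20 6B E4 70 70 65 6C 65 20 38")
street_utf8 = latin1_to_utf8(street_latin1)

# A lone E4 followed by an ASCII byte is not legal UTF-8, which is
# consistent with the server rejecting the value.
print(is_valid_utf8(street_latin1))

# After conversion the single E4 byte becomes the two bytes C3 A4.
print(" ".join(f"{b:02X}" for b in street_utf8))
```

Doing this conversion in one place, such as the application's own LDAP layer, keeps the rest of the application free of encoding concerns; the reverse direction is the same call with the two codec names swapped.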