From: Kurt D. Z. <Kurt@OpenLDAP.org> - 2000-08-23 14:26:27
At 07:56 AM 8/18/00 -0500, Mark Wilcox wrote:
>as I understand it (I'm sure Chris or Kurt will correct me :), the
>'native' string format for LDAP 3 is utf8.

More precisely, LDAPv3 strings are UTF-8 encoded ISO/IEC 10646-1. LDAPv2 strings are T.61.

Most APIs I use (C and Perl) are charset neutral. That is, they require the client to provide properly encoded strings per the protocol in use.

Notable exceptions are Java-based APIs. Java supports Unicode, albeit a 16-bit version. This is actually quite problematic. Besides the obvious 16-bit<->31-bit issue and the Unicode<->T.61 issues, the API is generally not schema aware. The attribute value in question may require some other encoding.

I believe it best for low-level protocol APIs to be dumb. That is, they should provide as direct an interface as possible between the application and the protocol. The more this API does on behalf of the application, the less applications can do with it. I am, however, a big fan of layering high-level APIs on top of lower-level ones...

In terms of charset issues, I would rather the low-level API just pass a string of octets.

Kurt
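
To make the octet-string point concrete, here is a minimal sketch in Python (not one of the APIs discussed in this thread). The function names `encode_for_ldapv3`, `utf16_code_units` and `low_level_send` are hypothetical and are not part of any real LDAP library. The sketch assumes the client encodes text to UTF-8 itself for LDAPv3, and that the low-level layer only ever sees bytes. It also shows why a 16-bit Unicode representation is awkward: a character outside the 16-bit range needs two 16-bit units, while in UTF-8 it is simply four octets.

```python
def encode_for_ldapv3(value: str) -> bytes:
    """Client-side step: turn text into the UTF-8 octets LDAPv3 expects."""
    return value.encode("utf-8")


def utf16_code_units(value: str) -> int:
    """Count the 16-bit units a UTF-16 based API would need for this text."""
    return len(value.encode("utf-16-be")) // 2


def low_level_send(octets: bytes) -> bytes:
    """Hypothetical 'dumb' low-level call: takes octets, passes them on untouched.

    It does no charset conversion, so the same call could carry UTF-8,
    T.61, or any other encoding an attribute's syntax requires.
    """
    if not isinstance(octets, (bytes, bytearray)):
        raise TypeError("low-level API takes octets, not text")
    return bytes(octets)


bmp_value = "Zo\u00eb"        # every character fits in 16 bits
astral_value = "\U00010400"   # U+10400, outside the 16-bit range

# The application, not the low-level API, chooses the encoding.
wire_bmp = low_level_send(encode_for_ldapv3(bmp_value))
wire_astral = low_level_send(encode_for_ldapv3(astral_value))
```

Because `low_level_send` never inspects the bytes, an application speaking LDAPv2 could hand it T.61-encoded octets through the same call. A schema-aware, higher-level layer would be the natural place to pick the encoding for each attribute.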