From: Chris R. <chr...@me...> - 2001-06-19 09:15:37
"R. Reucher" <ren...@cp...> wrote: > I subscribed to list meanwhile, no need to CC to me... OK! > Chris Ridd wrote: >> One point you might need to consider is that many of the underlying >> string types used in BER (which is the ISO standard that describes how >> to encode the LDAP protocol for network transmission) are fundamentally >> ASCII-based. If you mess with that, you've broken LDAP and won't be able >> to talk to any LDAP servers :-( > Correct, and that's part of the problem already ;-(... > >> (Graham, Convert::ASN1's 'make test' should test that strings are being >> encoded using the correct character sets. Testing "PrintableString" would >> be a good start.) When happens when you run this code as is, and when you translate the 'cn=Chris Ridd' :-) string into ASCII: ---8<--- use strict; use Convert::ASN1; use Convert::ASN1::Debug; my $asn = Convert::ASN1->new; $asn->prepare(q< name PrintableString >); my $pdu = $asn->encode(name => 'cn=Chris Ridd'); if ($pdu ne "\x13\x0D\x63\x6E\x3D\x43\x68\x72\x69\x73\x20\x52\x69\x64\x64") { print "The value you set for name uses the wrong character set.\n"; } my $o = $asn->decode($pdu); Convert::ASN1::asn_hexdump(*STDERR, $pdu); ---8<--- >> So basically everything that you pass into Net::LDAP objects *must* be >> ASCII or UTF-8 (if you are using LDAPv3.) > I know that. The question is, _where_ excatly do I need to convert (back > and forth) ? I think I said - every value you pass into Net::LDAP from your program and every value you get out again. Since Net::LDAP doesn't mess around with values at all, the bytes you pass in should ultimately end up 'on the wire'. >> I really don't know anything at all about perl on the 390, so this might >> be a stupid question: what character set is used in the variables in your >> program? > It's EBCDIC, which (in more "standardized" form) is "IBM-1047" (as > opposed to "ISO8859-1", which is ASCII). ISO 8859-1 is not ASCII. 
There is an overlap, but ASCII is only a 7-bit character set and Latin-1
defines 8-bit characters. But I guess your ISO 8859-1 converter will convert
ASCII values correctly, since ISO 8859-1 is mostly a superset of ASCII.

Is using single-quoted strings different from double-quoted strings?

>> If they're EBCDIC, you *need* to ASCII-ify (or UTF-8-ify) them first.
>> Similarly, you *need* to EBCDIC-ify all values that come back from the
>> LDAP server, as they are either ASCII-based or UTF-8 (if you are using
>> LDAPv3)
> That's what I want to do (or tried to do) using "iconv"...
>
> Here's what I did to convert:
>
> $ascii_data = `echo '$ebcdic_data' | iconv -f IBM-1047 -t ISO8859-1`
>
> or
>
> $ebcdic_data = `echo '$ascii_data' | iconv -f ISO8859-1 -t IBM-1047`
>
> There's also a C API iconv() routine (UNIX98), but that's trickier to
> use in perl... and it should make no difference, basically.
>
> The problem is, I'm searching for the correct portions of code within
> the Net::LDAP module _where_ I should convert _what_ !?!

From your original message to Graham:

> 1: #!/usr/local/bin/perl
> 2: use Net::LDAP;
> 3: $ldap = Net::LDAP->new('ldapsrv.ourdomain.com') or die "$@";

Add debug => 12 after the hostname to get protocol debugging. This may help
us a lot. Maybe debug => 3 is more useful, as it gives us plain hex.

> 4: $ldap->bind(dn => 'cn=Directory Manager, o=OurCustomer',

That dn will need to be converted into ASCII.

> 5: password => 'xxxxxxxx');

That's tricky. The password syntax is OCTET STRING, which means 'just the
bytes, no character set is implied' so whatever the directory server thinks
the bytes are is needed here.

> 6: $mesg = $ldap->search(base => "o=OurCustomer",

That search base needs changing to ASCII.

> 7: filter => "(uid=*)");

That filter needs changing to ASCII.
> 8: $ldap->unbind;
> 9: $mesg->code && die $mesg->error;
> 10: foreach $entry ($mesg->all_entries) { $entry->dump; }

For testing, the dump() method's OK but for real use you'll need to convert
all the values you get back from ASCII into EBCDIC.

> Anyhow, thanks for the answer ! Hope I'll get more input...
>
> Rene
> --
> R. Reucher               voice: +49/621/4803-174
> COMPAREX GmbH, VL40      fax: +49/621/4803-141
> Mannheimerstr. 105       e-mail: ren...@cp...
> D-68535 Edingen-Neckarhausen
>

I'm guessing that you will want to use umlauts etc in your directory, in
which case I would strongly recommend using LDAPv3 if possible - you need to
explicitly request this in the bind operation - as then all values are
UTF-8. You should do that because many LDAP servers handle character sets
wrongly (more importantly, differently) when using LDAPv2, so your code will
then not depend on the possibly non-standard behaviour of a particular LDAP
server.

(So wherever I said ASCII above, I mean UTF-8.)

Cheers,

Chris
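
A minimal sketch of the "convert every value going in and every value coming
out" advice above. It is illustrative only and makes assumptions that are not
in this thread: a perl that ships the Encode module (with its Encode::EBCDIC
tables, where IBM-1047 is named 'cp1047'), and two hypothetical helpers,
to_wire() and from_wire(), which are not part of Net::LDAP. The EBCDIC bytes
are written out in hex so the sketch behaves the same on an ASCII machine.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helpers (NOT part of Net::LDAP). They assume program data
# is held as IBM-1047 (EBCDIC) bytes and the server wants UTF-8 (LDAPv3).

# Use on every value handed to Net::LDAP (DNs, search bases, filters).
sub to_wire {
    my ($ebcdic) = @_;
    return encode('UTF-8', decode('cp1047', $ebcdic));
}

# Use on every value read back from an entry.
sub from_wire {
    my ($utf8) = @_;
    return encode('cp1047', decode('UTF-8', $utf8));
}

# 'cn=Chris Ridd' spelled out as IBM-1047 bytes.
my $ebcdic_dn = "\x83\x95\x7E\xC3\x88\x99\x89\xA2\x40\xD9\x89\x84\x84";

my $wire_dn = to_wire($ebcdic_dn);

# Hex dump of the converted value, for comparison with protocol debugging.
print join(' ', map { sprintf '%02X', ord } split //, $wire_dn), "\n";
```

The hex printed for $wire_dn should be the same 13 bytes that follow the
\x13\x0D header in the PrintableString test earlier in this message. The
password is deliberately not run through to_wire() here, since it is an
OCTET STRING and the right bytes depend on what the server expects.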