From: Graham B. <gb...@po...> - 2002-01-08 16:28:01
On Sun, Jan 06, 2002 at 01:45:17PM +0100, Peter Marschall wrote:
> Hi,
>
> i have tried to use canonical_dn with UTF 8 and I think
> it treats strings with UTF8 encoded values wrong.

It probably does.

> Characters with codes > 127 have UTF8 encodings that
> consist of 2 or more bytes that have all codes > 127.
> Since these characters are legal in LDAPv3 DNs they should
> not get escaped.

True, assuming the DN given is UTF8, which it should be with LDAPv3
but not v2

But the escaping should be done on the basis of the character being
printable. Net::LDAP makes a bad assumption here.

> So line 310 of Net/LDAP/Util.pm should read
>
>   $val =~ s/([\x00-\x1f])/sprintf("\\%02x",ord($1))/eg;
>
> instead of the current version:
>
>   $val =~ s/([\x00-\x1f\x7f-\xff])/sprintf("\\%02x",ord($1))/eg;
>
>
> When changing canonical_dn() anyway, maybe changing the
> implementation into three functions would be helpful.
> It would give users of Net::LDAP a standardized way of dealing with
> DNs and parts of it (very helpful when moving entries, ..) without having
> to reimplement the wheel themselves.

I have thought about this before, just never done it :)

> Here is my idea:

Looks good.

Graham.
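
To make the escaping point concrete, here is a minimal sketch that runs both substitutions quoted above over the same value. The sample string is an assumption chosen for illustration (the UTF-8 bytes of "Würzburg"); it is not taken from Net::LDAP itself.

```perl
use strict;
use warnings;

# Assumed sample: the UTF-8 bytes of "Würzburg" ("ü" is 0xC3 0xBC).
my $val = "W\xc3\xbcrzburg";

# Current rule: escapes control characters and every byte >= 0x7f.
(my $current = $val) =~ s/([\x00-\x1f\x7f-\xff])/sprintf("\\%02x",ord($1))/eg;

# Proposed rule: escapes control characters only.
(my $proposed = $val) =~ s/([\x00-\x1f])/sprintf("\\%02x",ord($1))/eg;

print "current:  $current\n";    # current:  W\c3\bcrzburg
print "proposed: ", ($proposed eq $val ? "unchanged" : "changed"), "\n";  # proposed: unchanged
```

With the current rule both bytes of the encoded "ü" are turned into hex escapes, while the proposed rule leaves the UTF-8 sequence intact. Neither rule checks whether the input really is valid UTF-8, which is the caveat raised in the reply.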
>
>
> ## split a DN string into its parts; code stolen from canonical_dn() ##
> # Synopsis: @rdns = splitDN($dn, %optionHash)
> # allowed options:
> # * lowercase: convert attribute names to lower case
> # * uppercase: convert attribute names to upper case
> # * sortRDN: sort RDN values
> # * splitRDN: split multi part RDNs into their parts
> sub splitDN($%)
> {
>   my $dn = shift;
>   my %opt = @_;
>   my @dn;
>   my @rdn;
>
>   $dn = $dn->dn if ref($dn);
>
>   while ($dn =~ /\G(?:
>       \s*
>       ([a-zA-Z][-a-zA-Z0-9]*|(?:[Oo][Ii][Dd]\.)?\d+(?:\.\d+)*)
>       \s*
>       =
>       \s*
>       (
>         (?:[^\\",=+<>\#;]*[^\\",=+<>\#;\s]|\\(?:[\\ ",=+<>#;]|[0-9a-fA-F]{2}))*
>         |
>         \#(?:[0-9a-fA-F]{2})+
>         |
>         "(?:[^\\"]+|\\(?:[\\",=+<>#;]|[0-9a-fA-F]{2}))*"
>       )
>       \s*
>       (?:([;,+])\s*(?=\S)|$)
>     )\s*/gcx)
>   {
>     my ($type,$val,$sep) = ($1,$2,$3);
>
>     # strip an optional "OID." prefix, then normalize the attribute name
>     $type =~ s/^oid\.(\d+(\.\d+)*)$/$1/i;
>     $type = lc($type) if ($opt{lowercase});
>     $type = uc($type) if ($opt{uppercase});
>
>     if ($val !~ /^#/)
>     {
>       # unquote and unescape the value, then re-escape it uniformly
>       $val =~ s/^"(.*)"$/$1/;
>       $val =~ s/\\([\\ ",=+<>#;]|[0-9a-fA-F]{2})
>                /length($1)==1 ? $1 : chr(hex($1))
>                /xeg;
>       $val =~ s/([\\",=+<>#;])/\\$1/g;
>       $val =~ s/([\x00-\x1F])/sprintf("\\%02x",ord($1))/eg;
>
>       $val =~ s/(^\s+|\s+$)/"\\20" x length $1/ge;
>     }
>
>     push @rdn, "$type=$val";
>
>     # a '+' separator means the next pair belongs to the same RDN
>     unless (defined $sep and $sep eq '+')
>     {
>       @rdn = sort(@rdn) if ($opt{sortRDN});
>       push @dn, ($opt{splitRDN}) ?
>                 ((scalar(@rdn) > 1) ? [ @rdn ] : ($rdn[0] || '')) :
>                 join('+', @rdn);
>       @rdn = ();
>     }
>   }
>
>   # return the empty list unless the whole string was consumed
>   return((length($dn) != (pos($dn) || 0)) ? () : @dn);
> }
>
>
> ## join RDNs and RDN parts into a DN string ##
> # Synopsis: $dn = joinDN(@dnpartref, %optionhash)
> sub joinDN(\@%)
> {
>   my @dnparts = @{+shift};
>   my %opt = @_;
>   my $dn = '';
>
>   @dnparts = reverse(@dnparts) if ($opt{reversed});
>
>   foreach my $part (@dnparts)
>   {
>     $dn .= (($opt{reversed}) ? "\000" : ',') if ($dn);
>
>     if (ref($part))    # multi part RDN
>     {
>       my @partlist = ($opt{reversed}) ? reverse(@$part) : @$part;
>       my $rdn = '';
>
>       foreach my $rdnpart (@partlist)
>       {
>         return if (!$rdnpart);
>
>         $rdn .= (($opt{reversed}) ? "\001" : '+') if ($rdn);
>         $rdn .= $rdnpart;
>       }
>       $dn .= $rdn;
>     }
>     else               # single part RDN
>     {
>       return if (!$part);
>
>       $dn .= $part;
>     }
>   }
>
>   return($dn);
> }
>
>
> These two basic functions now allow canonical_dn() to be
> implemented in only a few lines:
>
> sub canonical_dn($;$) {
>   my ($dn, $rev) = @_;
>
>   $dn = $dn->dn if ref($dn);
>
>   my @dnparts = splitDN($dn, uppercase => 1, splitRDN => 1, sortRDN => 1);
>
>   joinDN(@dnparts, reversed => ($rev||0));
> }
>
>
>
>
> Yours
> Peter
>
> --
> Peter Marschall    | eMail: pet...@ma...
> Scheffelstraße 15  |        pet...@is...
> 97072 Würzburg     | Tel:   0931/14721
> PGP: D7 FF 20 FE E6 6B 31 74 D1 10 88 E0 3C FE 28 35
>
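
As a rough illustration of the data structure the proposal passes between the split and join steps, the sketch below joins a hand-built list in which a plain string is a single-part RDN and an array reference is a multi-part RDN. The helper name `join_parts` and the sample values are hypothetical; this is a simplified stand-in for the quoted joinDN(), without its prototype and without its empty-part checks.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for the proposed joinDN().
# $parts is an array ref; each element is either a string (single-part RDN)
# or an array ref of strings (multi-part RDN).
sub join_parts {
    my ($parts, %opt) = @_;

    # Reversed output uses "\000" between RDNs and "\001" inside an RDN,
    # as in the quoted proposal; normal output uses ',' and '+'.
    my ($rdn_sep, $part_sep) = $opt{reversed} ? ("\000", "\001") : (',', '+');

    my @rdns = $opt{reversed} ? reverse(@$parts) : @$parts;
    my @out;
    foreach my $rdn (@rdns) {
        if (ref($rdn)) {
            my @sub = $opt{reversed} ? reverse(@$rdn) : @$rdn;
            push @out, join($part_sep, @sub);
        }
        else {
            push @out, $rdn;
        }
    }
    return join($rdn_sep, @out);
}

# Assumed sample input, shaped like a splitDN(..., splitRDN => 1) result.
my @parts = ('CN=PETER', [ 'OU=ENG', 'OU=LAB' ], 'DC=EXAMPLE');

print join_parts(\@parts), "\n";    # CN=PETER,OU=ENG+OU=LAB,DC=EXAMPLE
```

Keeping multi-part RDNs as nested array references lets a caller drop or replace whole RDNs (for example when moving an entry) without re-parsing the string. The reversed form is not printed here because it contains NUL and 0x01 separator bytes.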