From: Alain S. <asp...@gm...> - 2007-05-29 14:29:46
Hi Michael,

When investigating Python and unicode, I read somewhere (in a PEP, I think) that Python functions should accept and handle unicode strings as well as normal strings, at least when those strings can contain user-readable characters. This is not the case for the python-ldap functions.

Sometimes when calling python-ldap we don't really know the origin of the arguments we are using (a function is not supposed to know its caller :-): user input, web interface, MySQL database, LDAP result, web form, literal, text file, text parsing... Some are unicode, others are not. I think python-ldap functions must accept unicode arguments.

As we discussed at length previously, decoding the result is less easy, because the library cannot guess the meaning of these values on its own. I don't support the idea of downloading and using the LDAP schema. I cannot imagine a connectionless application like a web application doing that on every request! Or keeping a cache for the schema...

Anyway, I see 2 solutions:

1. Let result() return non-unicode strings. The user knows all returned strings are normal UTF-8 encoded strings and can do the decoding himself. A helper function doing the job for the result structure would be welcome.

2. Do the conversion according to the info provided in the query, as my source sample does.

I now answer some of your previous comments:

> > In this case maybe is it possible to use [ '*', u'givenName', u'sn' ]
> > to convert only 'givenName' and 'sn'
> But then you will not gain much! Still the application has to know which
> attributes have to be converted. => It's not worth hiding the conversion
> within python-ldap.

I don't really hide the conversion, because the user has to request it by using unicode field names. And second, I do more work: I keep a link between the msgid and the request to know which fields I have to convert, and I also destroy the link when it is no longer needed.
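To illustrate solution 1, here is a minimal sketch of what such a helper could look like. The name decode_result and its attrs parameter are hypothetical, not part of python-ldap; the sketch only assumes the usual result shape, a list of (dn, {attribute: [value, ...]}) tuples holding UTF-8 byte strings. It is written for a current Python, where b'...' marks a byte string; on Python 2.4 a plain '...' literal plays that role.

```python
def decode_result(result, attrs, encoding='utf-8'):
    """Return a copy of `result` where the values of the attributes
    named in `attrs` are decoded from byte strings to unicode strings.

    `result` is a list of (dn, {attribute: [value, ...]}) tuples.
    Hypothetical helper for illustration, not part of python-ldap.
    """
    wanted = set(attrs)
    decoded = []
    for dn, entry in result:
        new_entry = {}
        for name, values in entry.items():
            if name in wanted:
                # Only attributes the caller asked for are decoded, so
                # binary attributes (e.g. jpegPhoto) are left untouched.
                values = [value.decode(encoding) for value in values]
            new_entry[name] = values
        decoded.append((dn, new_entry))
    return decoded


# Example input shaped like a search result (values are UTF-8 bytes).
raw = [('cn=Michael,dc=asxnet,dc=loc',
        {'sn': [b'Str\xc3\xb6der'], 'objectClass': [b'top']})]
print(decode_result(raw, ['sn']))
```

The caller still has to name the attributes to decode, which is the same knowledge solution 2 takes from the unicode field names in the query.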
> The only clean solution would be something involving LDAP schema processing!

You know better than me how costly it is, in development time and in CPU and network overhead. Are you really considering adding schema processing for unicode integration in the future? Or are you hoping that someone will send you a patch :-) ?

I know you were not very excited by my ideas, but unicode support for argument encoding is important (in my opinion).

Feel free to suggest some cosmetic changes: function names, class names, the way I wrap your base class...

Keep in mind that none of my code breaks compatibility with existing applications.

Best regards.

On 5/24/07, Alain Spineux <asp...@gm...> wrote:
>
>
>
> On 5/24/07, Michael Ströder <mi...@st...> wrote:
> >
> > Alain Spineux wrote:
> > >
> > > Yes but what about unknown field type ?
> >
> > If you really want to dive into this look in directory
> > pylib/w2lapp/schema/ of web2ldap's source. It works for me but I did not
> > consider this whole framework mature enough to be incorporated into
> > python-ldap.
>
>
> I don't want to look at the schema:
>
> Here are the sources and the results.
> I use your more appropriate name for unicode testing :-)
>
>
> #!/usr/bin/env python2.4
>
> import sys, os, time
> import ldap, ldapurl, ldap.modlist
> import types
> import datetime
>
> host='localhost'
> port=389
> base_dn='dc=asxnet,dc=loc'
>
> if True:
>     who='cn=manager,cn=internal,dc=asxnet,dc=loc'
>     cred='********'
> else:
>     who='cn=nobody,cn=internal,dc=asxnet,dc=loc'
>     cred='iMmTWz5pJ+lwY7i6M/BU61ngo1aBLyqQhRrrKbEc'
>
>
> def unicode2utf8(st):
>     """Convert unicode (and only unicode) string into utf-8 raw string as
>     expected by ldap"""
>
>     if isinstance(st, types.UnicodeType):
>         return st.encode('utf-8')
>     else:
>         return st
>
> def utf82unicode(st):
>     """decode utf-8 raw string st into unicode"""
>     return st.decode('utf-8')
>
>
> def encode_modlist(modlist, no_op):
>     """encode ldap modlist structure
>     set no_op=True for Tuple of kind (str, [str,...])
>     and False for (int,str,[str,...])
>     """
>
>     for i, mod in enumerate(modlist):
>         if no_op:
>             attr_name, attr_values=mod
>         else:
>             op, attr_name, attr_values=mod
>
>         attr_name=unicode2utf8(attr_name)
>         if isinstance(attr_values, (types.ListType, types.TupleType)):
>             attr_values=map(unicode2utf8, attr_values)
>         else:
>             attr_values=unicode2utf8(attr_values)
>         if no_op:
>             modlist[i]=(attr_name, attr_values)
>         else:
>             modlist[i]=(op, attr_name, attr_values)
>
>     return modlist
>
> class UnicodeLDAPObject(ldap.ldapobject.LDAPObject):
>
>     expiration_delay=300
>
>     def __init__(self, uri, **kwargs):
>         ldap.ldapobject.LDAPObject.__init__(self, uri, **kwargs)
>         self.unicode_decoder={} # msgid -> (msgid, expiration, decoder_data)
>         # I use an expiration time to keep the dict from becoming too big
>         # when the server doesn't answer a request
>
>     def search_ext(self,base,scope, filterstr, attrlist, *args, **kwargs):
>         # base,scope, filterstr='(objectClass=*)',attrlist=None,attrsonly=0,
>         # serverctrls=None,clientctrls=None,timeout=-1,sizelimit=0
>
>         # convert filter
>         filterstr=unicode2utf8(filterstr)
>
>         # convert arglist and keep a copy of original values for later
>         # decoding
>
>         u_attrlist=attrlist
>         decoder={}
>         if u_attrlist!=None:
>             attrlist=[]
>             for attr in u_attrlist:
>                 if isinstance(attr, types.UnicodeType):
>                     attr=attr.encode('utf-8')
>                     # print 'ATTR', attr
>                     decoder[attr]=True
>                 attrlist.append(attr)
>
>         msgid=ldap.ldapobject.LDAPObject.search_ext(self,base,scope,
>             filterstr, attrlist, *args, **kwargs)
>
>         if decoder:
>             timeout=kwargs.get('timeout', None)
>             if timeout==None or timeout<=0:
>                 timeout=self.expiration_delay
>             self.unicode_decoder[msgid]=(msgid,
>                 datetime.datetime.now()+datetime.timedelta(seconds=timeout),
>                 decoder)
>         return msgid
>
>     def result3(self, *args, **kwargs):
>         # kwargs=(self, msgid=_ldap.RES_ANY,all=1,timeout=None):
>         rtype, rdata, rmsgid, decoded_serverctrls=\
>             ldap.ldapobject.LDAPObject.result3(self, *args, **kwargs)
>
>         if self.unicode_decoder.has_key(rmsgid):
>             msgid, expire, decoder=self.unicode_decoder[rmsgid]
>             if rtype not in [ ldap.RES_SEARCH_ENTRY,
>                               ldap.RES_SEARCH_REFERENCE ]:
>                 # this was the last result
>                 del self.unicode_decoder[rmsgid]
>             else:
>                 # reset the timeout
>                 timeout=kwargs.get('timeout', None)
>                 if timeout==None or timeout<=0:
>                     timeout=self.expiration_delay
>                 self.unicode_decoder[msgid]=(msgid,
>                     datetime.datetime.now()+datetime.timedelta(seconds=timeout),
>                     decoder)
>
>             # now decode the result
>             if rdata:
>                 if rtype in [ldap.RES_SEARCH_ENTRY,
>                              ldap.RES_SEARCH_REFERENCE, ldap.RES_SEARCH_RESULT]:
>                     # FIXME: I dont know what is a RES_SEARCH_REFERENCE
>                     rdata_u=[]
>                     for i, (dn, attrs) in enumerate(rdata):
>                         # FIXME: should I handle the 'dn' the same way
>                         if decoder.has_key('dn'):
>                             dn=utf82unicode(dn)
>                         for key in attrs.keys():
>                             if decoder.has_key(key):
>                                 attrs[key]=map(utf82unicode, attrs[key])
>                         # print '\tITEM=', dn, attrs
>                         rdata[i]=(dn, attrs)
>
>         else:
>             # no decoder for this => nothing to decode
>             pass
>
>         # remove other expired decoder info
>         # (test and delete the entry being iterated over, not rmsgid)
>         now=datetime.datetime.now()
>         for msgid in self.unicode_decoder.keys():
>             if self.unicode_decoder[msgid][1]<now:
>                 del self.unicode_decoder[msgid]
>
>         return rtype, rdata, rmsgid, decoded_serverctrls
>
>     def add_ext(self, dn, modlist, *args, **kwargs):
>         # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
>         dn=unicode2utf8(dn)
>         # print 'MODLIST', modlist
>         modlist=encode_modlist(modlist, True)
>         # print 'MODLIST unicode', modlist
>         return ldap.ldapobject.LDAPObject.add_ext(self, dn, modlist,
>             *args, **kwargs)
>
>     def modify_ext(self, dn, modlist, *args, **kwargs):
>         # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
>         dn=unicode2utf8(dn)
>         # print 'MODLIST', modlist
>         modlist=encode_modlist(modlist, False)
>         # print 'MODLIST unicode', modlist
>         return ldap.ldapobject.LDAPObject.modify_ext(self, dn, modlist,
>             *args, **kwargs)
>
>     def delete_ext(self, dn, *args, **kwargs):
>         # args=(self,dn,serverctrls=None,clientctrls=None)
>         dn=unicode2utf8(dn)
>         return ldap.ldapobject.LDAPObject.delete_ext(self, dn, *args,
>             **kwargs)
>
>
>
> def print_ldap_result(ldap_result):
>     for dn, item in ldap_result:
>         print 'DN=', repr(dn)
>         for k, v in item.iteritems():
>             print '\t%s: %s' % (k, repr(v))
>         print
>
> ldap_url=ldapurl.LDAPUrl('ldap://%s:%d/%s' % (host, port, base_dn))
> ldap_url.applyDefaults({
>     'who': who,
>     'cred' : cred, })
> #l=ldap.ldapobject.LDAPObject(ldap_url.initializeUrl())
> l=UnicodeLDAPObject(ldap_url.initializeUrl())
> l.simple_bind_s(ldap_url.who, ldap_url.cred)
> print 'Connected as', l.whoami_s()
>
>
> first_name='Michael'
> first_name2=u'Micha\xebl'
> last_name=u'Str\xf6der'
> email=' mi...@st...'
> street=u'Hauptstra\xe1e'
> country='Germany'
>
> cn='%s %s' %(first_name, last_name)
> dn='cn=%s,%s' %(cn, base_dn)
> info={
>     u'cn' : (cn, ),
>     'mail' : (email, ),
>     'objectClass' : ('top', 'inetOrgPerson', 'kolabInetOrgPerson',),
>     u'sn' : (last_name, ),
>     u'givenName' : (first_name, ),
>     u'street': (street, ),
>     'c': (country, ),
>     'telephoneNumber': '+49 1111111111',
>     }
>
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,),
>     info.keys())
> if ldap_result:
>     print '== Found'
>     print_ldap_result(ldap_result)
>     l.delete_s(dn)
>     print '== Deleted'
>
> l.add_s(dn, ldap.modlist.addModlist(info))
> print '== Created'
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,),
>     info.keys())
> print_ldap_result(ldap_result)
>
> l.modify_s(dn, [(ldap.MOD_REPLACE, u'givenName', first_name2),
>                 (ldap.MOD_ADD, 'telephoneNumber', ( '+49 1234567890', )),
>                 ])
>
> print '==Modified'
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,),
>     info.keys())
> print_ldap_result(ldap_result)
>
> print '==Display once more'
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,),
>     ['*', '+', u'dn', u'givenName', u'creatorsName'] )
> print_ldap_result(ldap_result)
>
>
>
> =============
>
>
> Connected as dn:cn=manager,cn=internal,dc=asxnet,dc=loc
> == Found
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>         telephoneNumber: ['+49 1111111111', '+49 1234567890']
>         c: ['Germany']
>         cn: [u'Michael Str\xf6der']
>         objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>         street: [u'Hauptstra\xe1e']
>         sn: [u'Str\xf6der']
>         mail: ['mi...@st...']
>         givenName: [u'Micha\xebl']
>
> == Deleted
> == Created
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>         telephoneNumber: ['+49 1111111111']
>         c: ['Germany']
>         cn: [u'Michael Str\xf6der']
>         objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>         street: [u'Hauptstra\xe1e']
>         sn: [u'Str\xf6der']
>         mail: ['mi...@st...']
>         givenName: [u'Michael']
>
> ==Modified
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>         telephoneNumber: ['+49 1111111111', '+49 1234567890']
>         c: ['Germany']
>         cn: [u'Michael Str\xf6der']
>         objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>         street: [u'Hauptstra\xe1e']
>         sn: [u'Str\xf6der']
>         mail: [' mi...@st...']
>         givenName: [u'Micha\xebl']
>
> ==Display more
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>         telephoneNumber: ['+49 1111111111', '+49 1234567890']
>         c: ['Germany']
>         cn: [u'Michael Str\xf6der']
>         objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>         street: [u'Hauptstra\xe1e']
>         sn: [u'Str\xf6der']
>         mail: ['mi...@st...']
>         givenName: [u'Micha\xebl']
>
> ==Display once more
> DN= u'cn=Michael Str\xf6der,dc=asxnet,dc=loc'
>         telephoneNumber: ['+49 1111111111', '+49 1234567890']
>         c: ['Germany']
>         entryCSN: ['20070524191126Z#000002#00#000000']
>         cn: ['Michael Str\xc3\xb6der']
>         entryDN: ['cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc']
>         createTimestamp: ['20070524191126Z']
>         objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>         creatorsName: [u'cn=manager,cn=internal,dc=asxnet,dc=loc']
>         entryUUID: ['5099e82e-9e76-102b-830b-0da78c7bd35e']
>         hasSubordinates: ['FALSE']
>         modifiersName: ['cn=manager,cn=internal,dc=asxnet,dc=loc']
>         street: ['Hauptstra\xc3\xa1e']
>         sn: ['Str\xc3\xb6der']
>         structuralObjectClass: ['inetOrgPerson']
>         subschemaSubentry: ['cn=Subschema']
>         mail: [' mi...@st...']
>         givenName: [u'Micha\xebl']
>         modifyTimestamp: ['20070524191126Z']
>
>
>
>
>
> --
> --
> Alain Spineux
> aspineux gmail com
> May the sources be with you
>

--
--
Alain Spineux
aspineux gmail com
May the sources be with you