From: Alain S. <asp...@gm...> - 2007-05-23 17:16:05
|
Why does python-ldap not automatically convert any unicode string to UTF-8 before calling the raw function?

Is it by design, or just a lack of time?

The idea is to pass every argument through this function:

    def convert2raw(st):
        if isinstance(st, types.UnicodeType):
            return st.encode('utf-8')
        else:
            return st

This is easy for the filter argument of the "search" function, and more work for the structures used by "modify" and others, but still easy?

This way we keep compatibility with older code.

Any comments?

Oops: don't forget the decoding of the LDAP results :-)

Best regards

--
Alain Spineux
aspineux gmail com
May the sources be with you
|
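The proposal in the message above can be sketched for current Python, where text is `str` and raw data is `bytes` (the post itself targets Python 2 `unicode`/`str`). The names `to_raw` and `encode_add_modlist` are illustrative assumptions and are not part of python-ldap:

```python
def to_raw(value):
    """Encode text to UTF-8 bytes; pass bytes (already raw) through untouched."""
    if isinstance(value, str):
        return value.encode("utf-8")
    return value


def encode_add_modlist(modlist):
    """Encode the values of an add-style modlist: [(attr_name, [value, ...]), ...].

    Hypothetical helper for illustration only; attribute names are left as-is.
    """
    return [(name, [to_raw(v) for v in values]) for name, values in modlist]


# Text values are encoded, the raw JPEG header is left alone.
modlist = [("sn", ["Ströder"]), ("jpegPhoto", [b"\xff\xd8\xff\xe0"])]
encoded = encode_add_modlist(modlist)
```

Encoding on the way in is the easy half, because the caller decides per value whether it is text or raw data. The hard half, decoding results, is what the rest of the thread discusses.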
From: <mi...@st...> - 2007-05-23 21:56:46
|
Alain Spineux wrote:
> Why does python-ldap not convert automatically any unicode string into
> UTF-8 before
> to call the raw function ?
>
> Is-it by design or just a lack of time ?

Kind of a FAQ. Feel free to propose a patch with a good(!) solution.

https://sourceforge.net/tracker/?func=detail&atid=352072&aid=616567&group_id=2072

Being the author of a schema-aware LDAP client based on python-ldap I know in how many detail problems you can run into...

Ciao, Michael.
|
From: Alain S. <asp...@gm...> - 2007-05-23 22:37:46
|
On 5/23/07, Michael Ströder <mi...@st...> wrote:
> Alain Spineux wrote:
> > Why does python-ldap not convert automatically any unicode string into
> > UTF-8 before
> > to call the raw function ?
> >
> > Is-it by design or just a lack of time ?
>
> Kind of a FAQ. Feel free to propose a patch with a good(!) solution.
>
> https://sourceforge.net/tracker/?func=detail&atid=352072&aid=616567&group_id=2072
>
> Being the author of a schema-aware LDAP client based on python-ldap I
> know in how many detail problems you can run into...

There are two problems:

- First, encoding data when sending a request to the LDAP server.
  Do you know of any drawback to encoding every unicode string into UTF-8?

- Second, decoding data coming back from the server.
  The only function that retrieves data is search(), right?

What about this kind of request?

    ldap_con.search_s(base_dn, ldap.SCOPE_BASE, filter, ['cn', 'mail', u'givenName', u'sn'])

That way, python-ldap knows that the attributes givenName and sn should be decoded into unicode strings.

Do you see any other problems?

>
> Ciao, Michael.
>

--
Alain Spineux
aspineux gmail com
May the sources be with you
|
From: <mi...@st...> - 2007-05-24 07:20:20
|
Alain Spineux wrote:
> On 5/23/07, Michael Ströder <mi...@st...> wrote:
>>
>> https://sourceforge.net/tracker/?func=detail&atid=352072&aid=616567&group_id=2072
>
> - First, encode data when sending request to ldap server.
> Do you know an drawback to encode any unicode sting into utf-8 ?

Did you read my comment in the tracker item above? I repeat it here:

"One cannot always assume UTF-8 encoded Unicode strings in attribute values. Think of attributes jpegPhoto, userCertificate, etc. Therefore there's no generic way to handle that at the API level without knowledge about the syntax defined in the LDAP schema. (Note that this has been debated to death on python-ldap-dev mailing list.)"

> -Second, decode data coming back from the server.
> The only function to retrieve data is search(), right ?

Not necessarily. Think of LDAP extended operations, like whoami_s().

> what about the use of this kind of request ?
> ldap_con.search_s(base_dn, ldap.SCOPE_BASE, filter, ['cn', 'mail',
> u'givenName', u'sn' ])
>
> that way, python-lib know that attribute givenName and sn should be
> decoded into unicode string.

This will be cumbersome. You will have to pass these attributes into method result() which in general has no knowledge about a particular request it's receiving results for. And I'll keep it like this because I still have the idea for implementing a connection pool with less thread locking. Also it's not only a question of Unicode or not.

It's very simple to implement a small wrapper class sufficient for simple applications. But I'm against doing this in the general python-ldap API.

Ciao, Michael.
|
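The point about jpegPhoto and userCertificate in the message above can be illustrated with a short sketch: blindly decoding every returned value as UTF-8 fails on binary data. The helper name `try_decode` and the sample values are assumptions made for this illustration, written for current Python where raw values are `bytes`:

```python
def try_decode(value):
    """Return the decoded text, or None when the bytes are not valid UTF-8."""
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        return None


surname = "Ströder".encode("utf-8")   # a text attribute value, e.g. sn
photo_header = b"\xff\xd8\xff\xe0"    # first bytes of a JPEG file, e.g. jpegPhoto

decoded_surname = try_decode(surname)     # decodes cleanly
decoded_photo = try_decode(photo_header)  # 0xff is never valid in UTF-8
```

Note that a successful decode proves nothing either: binary data can be valid UTF-8 by accident, so "try it and see" is not a reliable way to tell text from raw values.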
From: Alain S. <asp...@gm...> - 2007-05-24 12:34:43
|
On 5/24/07, Michael Ströder <mi...@st...> wrote:
> Alain Spineux wrote:
> > On 5/23/07, Michael Ströder <mi...@st...> wrote:
> >>
> >> https://sourceforge.net/tracker/?func=detail&atid=352072&aid=616567&group_id=2072
> >
> > - First, encode data when sending request to ldap server.
> > Do you know an drawback to encode any unicode sting into utf-8 ?
>
> Did you read my comment in the tracker item above? I repeat it here:

Yes, I did!

> "One cannot always assume UTF-8 encoded Unicode strings in
> attribute values. Think of attributes jpegPhoto,

When writing my ldap.search() request I know jpegPhoto is raw data and not a string! This is why I will use 'jpegPhoto' and not u'jpegPhoto' in the attribute list. Likewise, I will never use a unicode string for a jpegPhoto value in ldap.modify() or ldap.add().

> userCertificate, etc. Therefore there's no generic way to
> handle that at the API level without knowledge about the
> syntax defined in the LDAP schema.
> (Note that this has been debated to death on python-ldap-dev
> mailing list.)"
>
> > -Second, decode data coming back from the server.
> > The only function to retrieve data is search(), right ?
>
> Not necessarily. Think of LDAP extended operations, like whoami_s().
>
> > what about the use of this kind of request ?
> > ldap_con.search_s(base_dn, ldap.SCOPE_BASE, filter, ['cn', 'mail',
> > u'givenName', u'sn' ])
> >
> > that way, python-lib know that attribute givenName and sn should be
> > decoded into unicode string.
>
> This will be cumbersome. You will have to pass these attributes into
> method result() which in general has no knowledge about a particular
> request it's receiving results for. And I'll keep it like this because I

Oops, I forgot the asynchronous side of LDAP. But the msgid makes the link between the request and the result, and a dictionary stored in the LDAPObject could hold the unicode transcoding info used in the request. Then ldap.result() could use this info to decode the values when the user calls it.

> still have the idea for implementing a connection pool with less thread
> locking. Also it's not only a question of Unicode or not.

[OFF-TOPIC] Speaking of threads, I have used asynchat for two different projects. This could be an option for python-ldap.

> It's very simple to implement a small wrapper class sufficient for
> simple applications. But I'm against doing this in the general
> python-ldap API.

Yes, of course: a wrapper around each function to convert input and output.

Thanks for your advice.

I'm starting to write a wrapper for search() and modify() now. I will send you my results as soon as possible.

Best regards.

Alain

--
Alain Spineux
aspineux gmail com
May the sources be with you
|
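The msgid-to-request link described in the message above could be sketched as a small registry that is independent of python-ldap. The class name `DecoderRegistry`, its methods, and the injectable clock are assumptions made for this illustration; the 300-second default is likewise just a placeholder:

```python
import time


class DecoderRegistry:
    """Remember, per message id, which attributes should be decoded as text."""

    def __init__(self, expiration_delay=300.0, clock=time.monotonic):
        self._entries = {}  # msgid -> (expires_at, frozenset of attribute names)
        self._delay = expiration_delay
        self._clock = clock

    def register(self, msgid, text_attrs):
        """Record the text attributes requested by the search that got msgid."""
        self._entries[msgid] = (self._clock() + self._delay, frozenset(text_attrs))

    def lookup(self, msgid):
        """Return the attribute names for msgid, or an empty set if unknown."""
        entry = self._entries.get(msgid)
        return entry[1] if entry else frozenset()

    def finish(self, msgid):
        """Forget a request once its last result has been received."""
        self._entries.pop(msgid, None)

    def purge_expired(self):
        """Drop entries whose request never completed."""
        now = self._clock()
        expired = [m for m, (expires_at, _) in self._entries.items() if expires_at < now]
        for msgid in expired:
            del self._entries[msgid]
```

A wrapper would call `register()` when sending a search, `lookup()` when a result arrives, and `finish()` on the final result. The expiry exists only so that abandoned requests do not accumulate forever.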
From: <mi...@st...> - 2007-05-24 12:58:58
|
Alain Spineux wrote:
>
> On 5/24/07, Michael Ströder <mi...@st...> wrote:
>>
>> "One cannot always assume UTF-8 encoded Unicode strings in
>> attribute values. Think of attributes jpegPhoto,
>
> When writing my ldap.search() request I know jpegPhoto is raw data and
> not a string !
> This is why I will use 'jpegPhoto' and not u'jpegPhoto' in the
> attribute list.

And how about specials like '*' and '+' in attrlist?

>> > what about the use of this kind of request ?
>> > ldap_con.search_s(base_dn, ldap.SCOPE_BASE, filter, ['cn', 'mail',
>> > u'givenName', u'sn' ])
>> >
>> > that way, python-lib know that attribute givenName and sn should be
>> > decoded into unicode string.
>>
>> This will be cumbersome. You will have to pass these attributes into
>> method result() which in general has no knowledge about a particular
>> request it's receiving results for. And I'll keep it like this because I
>
> Ops, I forgot the asynchronous side of ldap, but the msgid make the link
> between both
> the request and the result and a dictionary store in the ldapobject
> could store the
> unicode transcoding info used in the request. And then ldap.result(),
> could use these
> info to decode the value when user call it.

Yes, you could do this. But IMO it's cumbersome.

> I'start writin a wrapper for search() and modify() now.
>
> I send you my results as soon a possible.

I'm not too keen to incorporate Unicode patches in python-ldap's low-level API...

Ciao, Michael.
|
From: Alain S. <asp...@gm...> - 2007-05-24 13:32:12
|
On 5/24/07, Michael Ströder <mi...@st...> wrote:
> Alain Spineux wrote:
> >
> > On 5/24/07, Michael Ströder <mi...@st...> wrote:
> >>
>
> And how about specials like '*' and '+' in attrlist?

Thanks for pointing out my ignorance. I did not know about * and +!
Easy: only named attributes will be converted!
In this case, maybe it is possible to use [ '*', u'givenName', u'sn' ] to convert only 'givenName' and 'sn'.

> > Ops, I forgot the asynchronous side of ldap, but the msgid make the link
> > between both
> > the request and the result and a dictionary store in the ldapobject
> > could store the
> > unicode transcoding info used in the request. And then ldap.result(),
> > could use these
> > info to decode the value when user call it.
>
> Yes, you could do this. But IMO it's cumbersome.

YES, and I hope you will integrate my patch into your mainstream source code :-)

> > I'start writin a wrapper for search() and modify() now.
> >
> > I send you my results as soon a possible.
>
> I'm not too keen to incorporate Unicode patches in python-ldap's
> low-level API...

It is not low-level.

Here is my work in progress:

import datetime
from ldap.ldapobject import LDAPObject

# unicode2utf8() encodes unicode strings into utf-8 and leaves other values untouched

class UnicodeLDAPObject(LDAPObject):

    expiration_delay=300

    def __init__(self, uri, **kwargs):
        LDAPObject.__init__(self, uri, **kwargs)
        self.unicode_decoder={}
        self.unicode_decoder_expire=[]

    def search_ext(self,base,scope, **kwargs):
        # filterstr='(objectClass=*)',attrlist=None,attrsonly=0,serverctrls=None,clientctrls=None,timeout=-1,sizelimit=0

        # convert filter
        try:
            kwargs['filterstr']=unicode2utf8(kwargs['filterstr'])
        except KeyError:
            pass

        # convert arglist and keep a copy of original values for later decoding
        attrlist=kwargs.get('attrlist')
        if attrlist is not None:
            kwargs['attrlist']=map(unicode2utf8, attrlist)

        mesgid=LDAPObject.search_ext(self,base,scope, **kwargs)

        if attrlist:
            self.unicode_decoder[mesgid]=attrlist
            self.unicode_decoder_expire.append((mesgid, datetime.datetime.now()+datetime.timedelta(seconds=self.expiration_delay)))
        return mesgid

    def result3(self, **kwargs):
        # msgid=_ldap.RES_ANY,all=1,timeout=None):
        rtype, rdata, rmsgid, decoded_serverctrls=LDAPObject.result3(self, **kwargs)
        # decoding of rdata is not written yet
        return rtype, rdata, rmsgid, decoded_serverctrls

Alain

--
Alain Spineux
aspineux gmail com
May the sources be with you
|
From: <mi...@st...> - 2007-05-24 14:03:41
|
Alain Spineux wrote:
>
> On 5/24/07, Michael Ströder <mi...@st...> wrote:
>
>     Alain Spineux wrote:
>     >
>     > On 5/24/07, Michael Ströder <mi...@st...> wrote:
>     >>
>
>     And how about specials like '*' and '+' in attrlist?
>
> Thanks for showing me my ignorance. I was not knowing about * and + !
> Easy, only named attributes will be converted!
> In this case maybe is it possible to use [ '*', u'givenName', u'sn' ]
> to convert only 'givenName' and 'sn'

But then you will not gain much! Still the application has to know which attributes have to be converted. => It's not worth hiding the conversion within python-ldap.

>     > I'start writin a wrapper for search() and modify() now.
>     >
>     > I send you my results as soon a possible.
>
>     I'm not too keen to incorporate Unicode patches in python-ldap's
>     low-level API...
>
> not low-level
>
> Here is my work on progress
>
> class UnicodeLDAPObject(LDAPObject):

Sorry, I consider ldap.ldapobject to be low-level.

The only clean solution would be something involving LDAP schema processing!

Ciao, Michael.
|
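What "something involving LDAP schema processing" means can be hinted at with a very reduced sketch: the decision to decode is driven by each attribute's syntax, not by the caller. The two syntax OIDs are the Directory String and JPEG syntaxes from RFC 4517; the `ATTRIBUTE_SYNTAX` table is a hand-written stand-in for what a real client would read from the server's subschema, and `decode_entry` is a hypothetical name:

```python
# LDAP syntax OIDs (RFC 4517).
DIRECTORY_STRING = "1.3.6.1.4.1.1466.115.121.1.15"
JPEG = "1.3.6.1.4.1.1466.115.121.1.28"

# Assumption: a hand-written table standing in for the parsed subschema.
ATTRIBUTE_SYNTAX = {
    "sn": DIRECTORY_STRING,
    "givenname": DIRECTORY_STRING,
    "jpegphoto": JPEG,
}

TEXT_SYNTAXES = {DIRECTORY_STRING}


def decode_entry(attrs):
    """Decode values whose syntax is known to be text; leave the rest as bytes."""
    decoded = {}
    for name, values in attrs.items():
        syntax = ATTRIBUTE_SYNTAX.get(name.lower())
        if syntax in TEXT_SYNTAXES:
            decoded[name] = [v.decode("utf-8") for v in values]
        else:
            # Binary syntax, or an attribute missing from the table: keep raw bytes.
            decoded[name] = list(values)
    return decoded


entry = {
    "sn": [b"Str\xc3\xb6der"],
    "jpegPhoto": [b"\xff\xd8\xff\xe0"],
    "x-unknown": [b"abc"],
}
result = decode_entry(entry)
```

Keeping unknown attributes as raw bytes is the conservative choice in this sketch. A real schema-aware client has far more cases to handle, which is the maturity concern raised in the thread.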
From: Alain S. <asp...@gm...> - 2007-05-24 14:50:16
|
On 5/24/07, Michael Ströder <mi...@st...> wrote:
> Alain Spineux wrote:
> >
> > On 5/24/07, Michael Ströder <mi...@st...> wrote:
> >
> > Thanks for showing me my ignorance. I was not knowing about * and + !
> > Easy, only named attributes will be converted!
> > In this case maybe is it possible to use [ '*', u'givenName', u'sn' ]
> > to convert only 'givenName' and 'sn'
>
> But then you will not gain much! Still the application has to know which
> attributes have to be converted. => It's not worth hiding the conversion
> within python-ldap.

I am not spending much, so I don't expect much either :-)

Do you often use '+' and '*' (in real applications, not LDAP browsers or editor tools)?

And that way, the user keeps control through explicit conversion.

> > class UnicodeLDAPObject(LDAPObject):
>
> Sorry, I consider ldap.ldapobject to be low-level.

:-(

> The only clean solution would be something involving LDAP schema processing!

Yes, but what about unknown field types?

> Ciao, Michael.

Your remarks help me; don't hesitate to share your thoughts.

--
Alain Spineux
aspineux gmail com
May the sources be with you
|
From: <mi...@st...> - 2007-05-24 16:58:31
|
Alain Spineux wrote:
>
> Are you using often '+' and '*' (in real application, not ldap browser,
> nor editor tools) ?

In small scripts I'm explicitly requesting certain attribute types. Hmm, not sure about some directory sync tools I wrote for customers which have kind of a generic processing and low-level classes create different entry objects.

> And that way, user keep control with explicit conversion.

But I suspect that most programmers use '*' in their home-grown scripts.

> > The only clean solution would be something involving LDAP schema
> > processing!
>
> Yes but what about unknown field type ?

If you really want to dive into this look in directory pylib/w2lapp/schema/ of web2ldap's source. It works for me but I did not consider this whole framework mature enough to be incorporated into python-ldap.

Ciao, Michael.
|
From: Alain S. <asp...@gm...> - 2007-05-24 19:16:25
|
On 5/24/07, Michael Ströder <mi...@st...> wrote:
> Alain Spineux wrote:
> >
> > Yes but what about unknown field type ?
>
> If you really want to dive into this look in directory
> pylib/w2lapp/schema/ of web2ldap's source. It works for me but I did not
> consider this whole framework mature enough to be incorporated into
> python-ldap.

I don't want to look at the schema.

Here are the sources and the results. I use your name, which is more appropriate for unicode testing :-)

#!/usr/bin/env python2.4

import sys, os, time
import ldap, ldapurl, ldap.modlist
import types
import datetime

host='localhost'
port=389
base_dn='dc=asxnet,dc=loc'

if True:
    who='cn=manager,cn=internal,dc=asxnet,dc=loc'
    cred='********'
else:
    who='cn=nobody,cn=internal,dc=asxnet,dc=loc'
    cred='iMmTWz5pJ+lwY7i6M/BU61ngo1aBLyqQhRrrKbEc'


def unicode2utf8(st):
    """Convert unicode (and only unicode) string into utf-8 raw string as expected by ldap"""
    if isinstance(st, types.UnicodeType):
        return st.encode('utf-8')
    else:
        return st

def utf82unicode(st):
    """decode the utf-8 raw string st into unicode"""
    return st.decode('utf-8')


def encode_modlist(modlist, no_op):
    """encode ldap modlist structure
    set no_op=True for tuples of kind (str, [str,...])
    and False for (int, str, [str,...])
    """
    for i, mod in enumerate(modlist):
        if no_op:
            attr_name, attr_values=mod
        else:
            op, attr_name, attr_values=mod

        attr_name=unicode2utf8(attr_name)
        if isinstance(attr_values, (types.ListType, types.TupleType)):
            attr_values=map(unicode2utf8, attr_values)
        else:
            attr_values=unicode2utf8(attr_values)
        if no_op:
            modlist[i]=(attr_name, attr_values)
        else:
            modlist[i]=(op, attr_name, attr_values)

    return modlist

class UnicodeLDAPObject(ldap.ldapobject.LDAPObject):

    expiration_delay=300

    def __init__(self, uri, **kwargs):
        ldap.ldapobject.LDAPObject.__init__(self, uri, **kwargs)
        self.unicode_decoder={} # msgid -> (msgid, expiration, decoder_data)
        # I use an expiration time to keep the dictionary from becoming too big
        # when the server doesn't answer some requests

    def search_ext(self,base,scope, filterstr, attrlist, *args, **kwargs):
        # base,scope, filterstr='(objectClass=*)',attrlist=None,attrsonly=0,serverctrls=None,clientctrls=None,timeout=-1,sizelimit=0

        # convert filter
        filterstr=unicode2utf8(filterstr)

        # convert attrlist and remember which attributes were requested as unicode
        u_attrlist=attrlist
        decoder={}
        if u_attrlist!=None:
            attrlist=[]
            for attr in u_attrlist:
                if isinstance(attr, types.UnicodeType):
                    attr=attr.encode('utf-8')
                    # print 'ATTR', attr
                    decoder[attr]=True
                attrlist.append(attr)

        msgid=ldap.ldapobject.LDAPObject.search_ext(self,base,scope, filterstr, attrlist, *args, **kwargs)

        if decoder:
            timeout=kwargs.get('timeout', None)
            if timeout==None or timeout<=0:
                timeout=self.expiration_delay
            self.unicode_decoder[msgid]=(msgid, datetime.datetime.now()+datetime.timedelta(seconds=timeout), decoder)
        return msgid

    def result3(self, *args, **kwargs):
        # kwargs=(self, msgid=_ldap.RES_ANY,all=1,timeout=None):
        rtype, rdata, rmsgid, decoded_serverctrls=ldap.ldapobject.LDAPObject.result3(self, *args, **kwargs)

        if self.unicode_decoder.has_key(rmsgid):
            msgid, expire, decoder=self.unicode_decoder[rmsgid]
            if rtype not in [ ldap.RES_SEARCH_ENTRY, ldap.RES_SEARCH_REFERENCE ]:
                # this was the last result
                del self.unicode_decoder[rmsgid]
            else:
                # reset the timeout
                timeout=kwargs.get('timeout', None)
                if timeout==None or timeout<=0:
                    timeout=self.expiration_delay
                self.unicode_decoder[msgid]=(msgid, datetime.datetime.now()+datetime.timedelta(seconds=timeout), decoder)

            # now decode the result
            if rdata:
                if rtype in [ldap.RES_SEARCH_ENTRY, ldap.RES_SEARCH_REFERENCE, ldap.RES_SEARCH_RESULT]:
                    # FIXME: I dont know what is a RES_SEARCH_REFERENCE
                    for i, (dn, attrs) in enumerate(rdata):
                        # FIXME: should I handle the 'dn' the same way
                        if decoder.has_key('dn'):
                            dn=utf82unicode(dn)
                        for key in attrs.keys():
                            if decoder.has_key(key):
                                attrs[key]=map(utf82unicode, attrs[key])
                        # print '\tITEM=', dn, attrs
                        rdata[i]=(dn, attrs)

        else:
            # no decoder for this => nothing to decode
            pass

        # remove other expired decoder info
        now=datetime.datetime.now()
        for msgid in self.unicode_decoder.keys():
            if self.unicode_decoder[msgid][1]<now:
                del self.unicode_decoder[msgid]

        return rtype, rdata, rmsgid, decoded_serverctrls

    def add_ext(self, dn, modlist, *args, **kwargs):
        # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
        dn=unicode2utf8(dn)
        # print 'MODLIST', modlist
        modlist=encode_modlist(modlist, True)
        # print 'MODLIST unicode', modlist
        return ldap.ldapobject.LDAPObject.add_ext(self, dn, modlist, *args, **kwargs)

    def modify_ext(self, dn, modlist, *args, **kwargs):
        # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
        dn=unicode2utf8(dn)
        # print 'MODLIST', modlist
        modlist=encode_modlist(modlist, False)
        # print 'MODLIST unicode', modlist
        return ldap.ldapobject.LDAPObject.modify_ext(self, dn, modlist, *args, **kwargs)

    def delete_ext(self, dn, *args, **kwargs):
        # args=(self,dn,serverctrls=None,clientctrls=None)
        dn=unicode2utf8(dn)
        return ldap.ldapobject.LDAPObject.delete_ext(self, dn, *args, **kwargs)


def print_ldap_result(ldap_result):
    for dn, item in ldap_result:
        print 'DN=', repr(dn)
        for k, v in item.iteritems():
            print '\t%s: %s' % (k, repr(v))
        print

ldap_url=ldapurl.LDAPUrl('ldap://%s:%d/%s' % (host, port, base_dn))
ldap_url.applyDefaults({
    'who': who,
    'cred' : cred, })
#l=ldap.ldapobject.LDAPObject(ldap_url.initializeUrl())
l=UnicodeLDAPObject(ldap_url.initializeUrl())
l.simple_bind_s(ldap_url.who, ldap_url.cred)
print 'Connected as', l.whoami_s()


first_name='Michael'
first_name2=u'Micha\xebl'
last_name=u'Str\xf6der'
email='mi...@st...'
street=u'Hauptstra\xe1e'
country='Germany'

cn='%s %s' %(first_name, last_name)
dn='cn=%s,%s' %(cn, base_dn)
info={
    u'cn' : (cn, ),
    'mail' : (email, ),
    'objectClass' : ('top', 'inetOrgPerson', 'kolabInetOrgPerson',),
    u'sn' : (last_name, ),
    u'givenName' : (first_name, ),
    u'street': (street, ),
    'c': (country, ),
    'telephoneNumber': '+49 1111111111',
}

ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,) , info.keys())
if ldap_result:
    print '== Found'
    print_ldap_result(ldap_result)
    l.delete_s(dn)
    print '== Deleted'

l.add_s(dn, ldap.modlist.addModlist(info))
print '== Created'
ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,) , info.keys())
print_ldap_result(ldap_result)

l.modify_s(dn, [(ldap.MOD_REPLACE, u'givenName', first_name2),
                (ldap.MOD_ADD, 'telephoneNumber', ( '+49 1234567890', )),
               ])

print '==Modified'
ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,) , info.keys())
print_ldap_result(ldap_result)

print '==Display once more'
ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,) , ['*', '+', u'dn', u'givenName', u'creatorsName'] )
print_ldap_result(ldap_result)


=============

Connected as dn:cn=manager,cn=internal,dc=asxnet,dc=loc
== Found
DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
    telephoneNumber: ['+49 1111111111', '+49 1234567890']
    c: ['Germany']
    cn: [u'Michael Str\xf6der']
    objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
    street: [u'Hauptstra\xe1e']
    sn: [u'Str\xf6der']
    mail: ['mi...@st...']
    givenName: [u'Micha\xebl']

== Deleted
== Created
DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
    telephoneNumber: ['+49 1111111111']
    c: ['Germany']
    cn: [u'Michael Str\xf6der']
    objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
    street: [u'Hauptstra\xe1e']
    sn: [u'Str\xf6der']
    mail: ['mi...@st...']
    givenName: [u'Michael']

==Modified
DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
    telephoneNumber: ['+49 1111111111', '+49 1234567890']
    c: ['Germany']
    cn: [u'Michael Str\xf6der']
    objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
    street: [u'Hauptstra\xe1e']
    sn: [u'Str\xf6der']
    mail: ['mi...@st...']
    givenName: [u'Micha\xebl']

==Display more
DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
    telephoneNumber: ['+49 1111111111', '+49 1234567890']
    c: ['Germany']
    cn: [u'Michael Str\xf6der']
    objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
    street: [u'Hauptstra\xe1e']
    sn: [u'Str\xf6der']
    mail: ['mi...@st...']
    givenName: [u'Micha\xebl']

==Display once more
DN= u'cn=Michael Str\xf6der,dc=asxnet,dc=loc'
    telephoneNumber: ['+49 1111111111', '+49 1234567890']
    c: ['Germany']
    entryCSN: ['20070524191126Z#000002#00#000000']
    cn: ['Michael Str\xc3\xb6der']
    entryDN: ['cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc']
    createTimestamp: ['20070524191126Z']
    objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
    creatorsName: [u'cn=manager,cn=internal,dc=asxnet,dc=loc']
    entryUUID: ['5099e82e-9e76-102b-830b-0da78c7bd35e']
    hasSubordinates: ['FALSE']
    modifiersName: ['cn=manager,cn=internal,dc=asxnet,dc=loc']
    street: ['Hauptstra\xc3\xa1e']
    sn: ['Str\xc3\xb6der']
    structuralObjectClass: ['inetOrgPerson']
    subschemaSubentry: ['cn=Subschema']
    mail: ['mi...@st...']
    givenName: [u'Micha\xebl']
    modifyTimestamp: ['20070524191126Z']

--
Alain Spineux
aspineux gmail com
May the sources be with you
|
From: Alain S. <asp...@gm...> - 2007-05-29 14:29:46
|
Hi Michael When investigating about python and unicode, I read somewhere (in a PEP I thing) that python functions should accept and manage unicode string as well as normal string.Of course if these strings could contains user readable characters. This is not the case for python-ldap functions. Sometime when calling python-ldap, we don't know well the origin ( a function is not suposed to know its caller :-) of the arguments we are using : user input, web interface, mysql database, ldap result, web form, literal, text file, text parsing... some are unicode, other not. I thing python-ldap function must accept unicode arguments. As we discussed at length previously, the decoding of the result is less easy because, the library cannot guess alone the meaning of these values. I'm not supporting the idea of downloading and use the ldap schema. I cannot imaging a connection less application like a web application doing that at any request! Or keeping a cache for the schema ... Anyway I see 2 solutions 1. Let result() return non unicode strings. _HERE_ The user know all returned strings are normal strings utf-8 encoded and he can do the encoding himself. A helper function doing the job for the result structure should be welcome. 2. Do the conversion regarding the info provided in the query, as my source sample does. I answer now some of your previous comment: > > In this case maybe is it possible to use [ '*', u'givenName', u'sn' ] > > to convert only 'givenName' and 'sn' > But then you will not gain much! Still the application has to know which > attributes have to be converted. =3D> It's not worth hiding the conversio= n > within python-ldap. I don't really hide the conversion, because the user has to request it usin= g unicode field name. And second, I do more work: I keep a link between the msgid and the request to know with fields I have to convert and also destroy the link when unneeded anymore. 
> The only clean solution would be something involving LDAP schema processing! You know better than me how costly it is, in developing time, and its overhead for CPU and network load. Do you really consider to add the schema processing for unicode integration in the future? Or are you hoping that someone will send you a patch :-) ? I know you were not very exited by my ideas, anyway the unicode support for argument encoding is important. (this is my opinion) Feel free to suggest some cosmetic changes: function name, class name, the way I wrap your base class ..... Keep in mind, none of my code break compatibility with existing application= . Best regards. On 5/24/07, Alain Spineux <asp...@gm...> wrote: > > > > On 5/24/07, Michael Str=F6der <mi...@st...> wrote: > > > > Alain Spineux wrote: > > > > > > Yes but what about unknown field type ? > > > > If you really want to dive into this look in directory > > pylib/w2lapp/schema/ of web2ldap's source. It works for me but I did no= t > > consider this whole framework mature enough to be incorporated into > > python-ldap. > > > I dont want to look at the schema: > > Here are the sources and the results. 
> I use your more appropriate name for unicode testing :-) > > > #!/usr/bin/env python2.4 > > import sys, os, time > import ldap, ldapurl, ldap.modlist > import types > import datetime > > host=3D'localhost' > port=3D389 > base_dn=3D'dc=3Dasxnet,dc=3Dloc' > > if True: > who=3D'cn=3Dmanager,cn=3Dinternal,dc=3Dasxnet,dc=3Dloc' > cred=3D''********' > else: > who=3D'cn=3Dnobody,cn=3Dinternal,dc=3Dasxnet,dc=3Dloc' > cred=3D'iMmTWz5pJ+lwY7i6M/BU61ngo1aBLyqQhRrrKbEc' > > > def unicode2utf8(st): > """Convert unicode (and only unicode) string into utf-8 raw string as > expected by ldap""" > > if isinstance(st, types.UnicodeType): > return st.encode('utf-8') > else: > return st > > def utf82unicode(st): > """encode st into utf-8""" > return st.decode('utf-8') > > > def encode_modlist(modlist, no_op): > """encode ldap modlist structure > set no_op=3DTrue for Tuple of kind (int,str,[str,...]) > and False for (str, [str,...]) > """ > > for i, mod in enumerate(modlist): > if no_op: > attr_name, attr_values=3Dmod > else: > op, attr_name, attr_values=3Dmod > > attr_name=3Dunicode2utf8(attr_name) > if isinstance(attr_values, ( types.ListType, types.TupleType)): > attr_values=3Dmap(unicode2utf8, attr_values) > else: > attr_values=3Dunicode2utf8(attr_values) > if no_op: > modlist[i]=3D(attr_name, attr_values) > else: > modlist[i]=3D(op, attr_name, attr_values) > > return modlist > > class UnicodeLDAPObject(ldap.ldapobject.LDAPObject): > > expiration_delay=3D300 > > def __init__(self, uri, **kwargs): > ldap.ldapobject.LDAPObject.__init__(self, uri, **kwargs) > self.unicode_decoder=3D{} # (msgid, expiration, decoder_data) > # I use an expiration time to avoid the list to become to big whe= n > the > # server don't answere any request > > def search_ext(self,base,scope, filterstr, attrlist, *args, **kwargs)= : > # base,scope, > filterstr=3D'(objectClass=3D*)',attrlist=3DNone,attrsonly=3D0,serverctrls= =3DNone,clientctrls=3DNone,timeout=3D-1,sizelimit=3D0 > > > # convert filter > 
>         filterstr=unicode2utf8(filterstr)
>
>         # convert arglist and keep a copy of original values for later decoding
>
>         u_attrlist=attrlist
>         decoder={}
>         if u_attrlist!=None:
>             attrlist=[]
>             for attr in u_attrlist:
>                 if isinstance(attr, types.UnicodeType):
>                     attr=attr.encode('utf-8')
>                     # print 'ATTR', attr
>                     decoder[attr]=True
>                 attrlist.append(attr)
>
>         msgid=ldap.ldapobject.LDAPObject.search_ext(self,base,scope, filterstr, attrlist, *args, **kwargs)
>
>         if decoder:
>             timeout=kwargs.get('timeout', None)
>             if timeout==None or timeout<=0:
>                 timeout=self.expiration_delay
>             self.unicode_decoder[msgid]=(msgid, datetime.datetime.now()+datetime.timedelta(seconds=timeout), decoder)
>         return msgid
>
>     def result3(self, *args, **kwargs):
>         # kwargs=(self, msgid=_ldap.RES_ANY,all=1,timeout=None):
>         rtype, rdata, rmsgid, decoded_serverctrls=ldap.ldapobject.LDAPObject.result3(self, *args, **kwargs)
>
>         if self.unicode_decoder.has_key(rmsgid):
>             msgid, expire, decoder=self.unicode_decoder[rmsgid]
>             if rtype not in [ ldap.RES_SEARCH_ENTRY, ldap.RES_SEARCH_REFERENCE ]:
>                 # this was the last result
>                 del self.unicode_decoder[rmsgid]
>             else:
>                 # reset the timeout
>                 timeout=kwargs.get('timeout', None)
>                 if timeout==None or timeout<=0:
>                     timeout=self.expiration_delay
>                 self.unicode_decoder[msgid]=(msgid, datetime.datetime.now()+datetime.timedelta(seconds=timeout), decoder)
>
>             # now decode the result
>             if rdata:
>                 if rtype in [ldap.RES_SEARCH_ENTRY, ldap.RES_SEARCH_REFERENCE, ldap.RES_SEARCH_RESULT]:
>                     # FIXME: I dont know what is a RES_SEARCH_REFERENCE
>                     rdata_u=[]
>                     for i, (dn, attrs) in enumerate(rdata):
>                         # FIXME: should I handle the 'dn' the same way
>                         if decoder.has_key('dn'):
>                             dn=utf82unicode(dn)
>                         for key in attrs.keys():
>                             if decoder.has_key(key):
>                                 attrs[key]=map(utf82unicode, attrs[key])
>                         # print '\tITEM=', dn, attrs
>                         rdata[i]=(dn, attrs)
>
>         else:
>             # no decoder for this => nothing to decode
>             pass
>
>         # remove other expired decoder info
>         now=datetime.datetime.now()
>         for msgid in self.unicode_decoder.keys():
>             if self.unicode_decoder[rmsgid][1]<now:
>                 del self.unicode_decoder[rmsgid]
>
>         return rtype, rdata, rmsgid, decoded_serverctrls
>
>     def add_ext(self, dn, modlist, *args, **kwargs):
>         # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
>         dn=unicode2utf8(dn)
>         # print 'MODLIST', modlist
>         modlist=encode_modlist(modlist, True)
>         # print 'MODLIST unicode', modlist
>         return ldap.ldapobject.LDAPObject.add_ext(self, dn, modlist, *args, **kwargs)
>
>     def modify_ext(self, dn, modlist, *args, **kwargs):
>         # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
>         dn=unicode2utf8(dn)
>         # print 'MODLIST', modlist
>         modlist=encode_modlist(modlist, False)
>         # print 'MODLIST unicode', modlist
>         return ldap.ldapobject.LDAPObject.modify_ext(self, dn, modlist, *args, **kwargs)
>
>     def delete_ext(self, dn, *args, **kwargs):
>         # args=(self,dn,serverctrls=None,clientctrls=None)
>         dn=unicode2utf8(dn)
>         return ldap.ldapobject.LDAPObject.delete_ext(self, dn, *args, **kwargs)
>
>
> def print_ldap_result(ldap_result):
>     for dn, item in ldap_result:
>         print 'DN=', repr(dn)
>         for k, v in item.iteritems():
>             print '\t%s: %s' % (k, repr(v))
>         print
>
> ldap_url=ldapurl.LDAPUrl('ldap://%s:%d/%s' % (host, port, base_dn))
> ldap_url.applyDefaults({
>     'who': who,
>     'cred' : cred, })
> #l=ldap.ldapobject.LDAPObject(ldap_url.initializeUrl())
> l=UnicodeLDAPObject(ldap_url.initializeUrl())
> l.simple_bind_s(ldap_url.who, ldap_url.cred)
> print 'Connected as', l.whoami_s()
>
>
> first_name='Michael'
> first_name2=u'Micha\xebl'
> last_name=u'Str\xf6der'
> email='mi...@st...'
> street=u'Hauptstra\xe1e'
> country='Germany'
>
> cn='%s %s' %(first_name, last_name)
> dn='cn=%s,%s' %(cn, base_dn)
> info={
>     u'cn' : (cn, ),
>     'mail' : (email, ),
>     'objectClass' : ('top', 'inetOrgPerson', 'kolabInetOrgPerson',),
>     u'sn' : (last_name, ),
>     u'givenName' : (first_name, ),
>     u'street': (street, ),
>     'c': (country, ),
>     'telephoneNumber': '+49 1111111111',
>     }
>
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,), info.keys())
> if ldap_result:
>     print '== Found'
>     print_ldap_result(ldap_result)
>     l.delete_s(dn)
>     print '== Deleted'
>
> l.add_s(dn, ldap.modlist.addModlist(info))
> print '== Created'
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,), info.keys())
> print_ldap_result(ldap_result)
>
> l.modify_s(dn, [(ldap.MOD_REPLACE, u'givenName', first_name2),
>                 (ldap.MOD_ADD, 'telephoneNumber', ( '+49 1234567890', )),
>                 ])
>
> print '==Modified'
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,), info.keys())
> print_ldap_result(ldap_result)
>
> print '==Display once more'
> ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,),
>     ['*', '+', u'dn', u'givenName', u'creatorsName'] )
> print_ldap_result(ldap_result)
>
>
> =============
>
>
> Connected as dn:cn=manager,cn=internal,dc=asxnet,dc=loc
> == Found
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>     telephoneNumber: ['+49 1111111111', '+49 1234567890']
>     c: ['Germany']
>     cn: [u'Michael Str\xf6der']
>     objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>     street: [u'Hauptstra\xe1e']
>     sn: [u'Str\xf6der']
>     mail: ['mi...@st...']
>     givenName: [u'Micha\xebl']
>
> == Deleted
> == Created
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>     telephoneNumber: ['+49 1111111111']
>     c: ['Germany']
>     cn: [u'Michael Str\xf6der']
>     objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>     street: [u'Hauptstra\xe1e']
>     sn: [u'Str\xf6der']
>     mail: ['mi...@st...']
>     givenName: [u'Michael']
>
> ==Modified
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>     telephoneNumber: ['+49 1111111111', '+49 1234567890']
>     c: ['Germany']
>     cn: [u'Michael Str\xf6der']
>     objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>     street: [u'Hauptstra\xe1e']
>     sn: [u'Str\xf6der']
>     mail: ['mi...@st...']
>     givenName: [u'Micha\xebl']
>
> ==Display more
> DN= 'cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc'
>     telephoneNumber: ['+49 1111111111', '+49 1234567890']
>     c: ['Germany']
>     cn: [u'Michael Str\xf6der']
>     objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>     street: [u'Hauptstra\xe1e']
>     sn: [u'Str\xf6der']
>     mail: ['mi...@st...']
>     givenName: [u'Micha\xebl']
>
> ==Display once more
> DN= u'cn=Michael Str\xf6der,dc=asxnet,dc=loc'
>     telephoneNumber: ['+49 1111111111', '+49 1234567890']
>     c: ['Germany']
>     entryCSN: ['20070524191126Z#000002#00#000000']
>     cn: ['Michael Str\xc3\xb6der']
>     entryDN: ['cn=Michael Str\xc3\xb6der,dc=asxnet,dc=loc']
>     createTimestamp: ['20070524191126Z']
>     objectClass: ['top', 'inetOrgPerson', 'kolabInetOrgPerson']
>     creatorsName: [u'cn=manager,cn=internal,dc=asxnet,dc=loc']
>     entryUUID: ['5099e82e-9e76-102b-830b-0da78c7bd35e']
>     hasSubordinates: ['FALSE']
>     modifiersName: ['cn=manager,cn=internal,dc=asxnet,dc=loc']
>     street: ['Hauptstra\xc3\xa1e']
>     sn: ['Str\xc3\xb6der']
>     structuralObjectClass: ['inetOrgPerson']
>     subschemaSubentry: ['cn=Subschema']
>     mail: ['mi...@st...']
>     givenName: [u'Micha\xebl']
>     modifyTimestamp: ['20070524191126Z']
>
>
> --
> --
> Alain Spineux
> aspineux gmail com
> May the sources be with you
>

-- 
--
Alain Spineux
aspineux gmail com
May the sources be with you |
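The bookkeeping in the quoted code (remember per msgid which attributes to decode, and drop entries the server never answered) can be sketched on its own, without python-ldap. This is only a minimal illustration, written for Python 3 rather than the Python 2.4 of the thread; the class name `DecoderRegistry` and its methods are hypothetical and are not part of python-ldap or of the posted code.

```python
import datetime
import threading


class DecoderRegistry(object):
    """Hypothetical helper: msgid -> (expiration time, attributes to decode)."""

    def __init__(self, default_delay=300):
        self._lock = threading.Lock()
        self._entries = {}
        self.default_delay = default_delay

    def remember(self, msgid, attrs, timeout=None, now=None):
        """Register (or refresh) the attributes to decode for msgid."""
        if timeout is None or timeout <= 0:
            timeout = self.default_delay
        now = now or datetime.datetime.now()
        expiration = now + datetime.timedelta(seconds=timeout)
        with self._lock:
            self._entries[msgid] = (expiration, frozenset(attrs))

    def lookup(self, msgid):
        """Return the attribute names for msgid, or None if unknown."""
        with self._lock:
            entry = self._entries.get(msgid)
        return entry[1] if entry else None

    def forget(self, msgid):
        """Drop the entry for msgid; unknown ids are ignored."""
        with self._lock:
            self._entries.pop(msgid, None)

    def expire(self, now=None):
        """Drop every entry whose expiration time has passed."""
        now = now or datetime.datetime.now()
        with self._lock:
            # Iterate over a copy of the keys so deleting while looping is safe.
            for msgid in list(self._entries):
                if self._entries[msgid][0] < now:
                    del self._entries[msgid]
```

Note that `expire` tests each entry under its own key while looping; the quoted loop indexes with `rmsgid` instead of the loop variable, so it only ever checks the message that was just received.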
From: Alain S. <asp...@gm...> - 2007-05-31 12:25:57
|
Hi

I changed my code to make it more modular.

class UnicodeLDAPInterface:
    ldapbaseclass=None
    .... my code goes HERE

class UnicodeLDAPObject(UnicodeLDAPInterface, LDAPObject):
    ldapbaseclass=LDAPObject

class UnicodeReconnectLDAPObject(UnicodeLDAPInterface, ReconnectLDAPObject):
    ldapbaseclass=ReconnectLDAPObject

Here is the full code

#!/usr/bin/env python2.4

import types, datetime
import ldap, ldapurl, ldap.modlist
from ldap.ldapobject import LDAPObject
from ldap.ldapobject import ReconnectLDAPObject

def unicode2utf8(st):
    """Convert unicode (and only unicode) string into utf-8 raw string as expected by ldap"""
    if isinstance(st, types.UnicodeType):
        return st.encode('utf-8')
    else:
        return st

def utf82unicode(st):
    """decode a utf-8 raw string into a unicode string"""
    return st.decode('utf-8')

def encode_modlist(modlist, no_op):
    """encode ldap modlist structure (in place)
    set no_op=True for tuples of kind (str, [str,...]), as used by add,
    and False for tuples of kind (int, str, [str,...]), as used by modify
    """
    for i, mod in enumerate(modlist):
        if no_op:
            attr_name, attr_values=mod
        else:
            op, attr_name, attr_values=mod

        attr_name=unicode2utf8(attr_name)
        if isinstance(attr_values, (types.ListType, types.TupleType)):
            attr_values=map(unicode2utf8, attr_values)
        else:
            attr_values=unicode2utf8(attr_values)
        if no_op:
            modlist[i]=(attr_name, attr_values)
        else:
            modlist[i]=(op, attr_name, attr_values)

    return modlist

def _print_ldap_result(ldap_result):
    for dn, item in ldap_result:
        print 'DN=', repr(dn)
        for k, v in item.iteritems():
            print '\t%s: %s' % (k, repr(v))
        print

class UnicodeLDAPInterface:

    ldapbaseclass=None

    # the expiration delay (in seconds) for an entry in self.unicode_decoder
    decoder_expiration_delay=300

    def __init__(self, uri, **kwargs):
        self.ldapbaseclass.__init__(self, uri, **kwargs)
        self.unicode_decoder={} # { msgid: (msgid, expiration, decoder_data), ... }
        # I use an expiration time to keep the dictionary from becoming too big
        # when the server does not answer some requests

    def _set_unicode_decoder(self, msgid, value):
        """protect unicode_decoder against multi-threading
        update or add the decoder
        """
        self._ldap_object_lock.acquire()
        try:
            self.unicode_decoder[msgid]=value
        finally:
            self._ldap_object_lock.release()

    def _remove_unicode_decoder(self, msgid):
        """protect unicode_decoder against multi-threading
        remove the decoder
        """
        self._ldap_object_lock.acquire()
        try:
            try:
                del self.unicode_decoder[msgid]
            except KeyError:
                # no decoder was registered for this msgid
                pass
        finally:
            self._ldap_object_lock.release()

    def _get_unicode_decoder(self, msgid):
        """protect unicode_decoder against multi-threading
        read the decoder info for msgid
        """
        self._ldap_object_lock.acquire()
        try:
            return self.unicode_decoder[msgid]
        finally:
            self._ldap_object_lock.release()

    def _expire_unicode_decoder(self):
        """cleanup any expired decoder"""
        self._ldap_object_lock.acquire()
        try:
            now=datetime.datetime.now()
            for msgid in self.unicode_decoder.keys():
                if self.unicode_decoder[msgid][1]<now:
                    del self.unicode_decoder[msgid]
        finally:
            self._ldap_object_lock.release()

    def search_ext(self,base,scope, filterstr, attrlist, *args, **kwargs):
        # base,scope, filterstr='(objectClass=*)',attrlist=None,attrsonly=0,serverctrls=None,clientctrls=None,timeout=-1,sizelimit=0

        # convert filter
        filterstr_u=unicode2utf8(filterstr)

        # convert arglist and remember which attributes to decode later
        attrlist_u=None # keep None when the caller gave no attribute list
        decoder={} # will keep only fields to decode
        if attrlist!=None:
            attrlist_u=[]
            for attr in attrlist:
                if isinstance(attr, types.UnicodeType):
                    attr=attr.encode('utf-8')
                    decoder[attr]=True
                attrlist_u.append(attr)

        msgid=self.ldapbaseclass.search_ext(self,base,scope, filterstr_u, attrlist_u, *args, **kwargs)

        if decoder:
            timeout=kwargs.get('timeout', None)
            if timeout==None or timeout<=0:
                timeout=self.decoder_expiration_delay
            self._set_unicode_decoder(msgid, (msgid, datetime.datetime.now()+datetime.timedelta(seconds=timeout), decoder))
        return msgid

    def result3(self, *args, **kwargs):
        # kwargs=(self, msgid=_ldap.RES_ANY,all=1,timeout=None):
        rtype, rdata, rmsgid, decoded_serverctrls=self.ldapbaseclass.result3(self, *args, **kwargs)

        try:
            msgid, expire, decoder=self._get_unicode_decoder(rmsgid)
        except KeyError:
            pass # no decoder for this => nothing to decode
        else:
            if rtype not in [ ldap.RES_SEARCH_ENTRY, ldap.RES_SEARCH_REFERENCE ]:
                # this was the last result
                self._remove_unicode_decoder(rmsgid)
            else:
                # reset the timeout
                timeout=kwargs.get('timeout', None)
                if timeout==None or timeout<=0:
                    timeout=self.decoder_expiration_delay
                self._set_unicode_decoder(msgid, (msgid, datetime.datetime.now()+datetime.timedelta(seconds=timeout), decoder))

            # now decode the result
            if rdata:
                if rtype in [ldap.RES_SEARCH_ENTRY, ldap.RES_SEARCH_REFERENCE, ldap.RES_SEARCH_RESULT]:
                    # FIXME: I dont know what is a RES_SEARCH_REFERENCE
                    for i, (dn, attrs) in enumerate(rdata):
                        # FIXME: should I handle the 'dn' the same way
                        if decoder.has_key('dn'):
                            dn=utf82unicode(dn)
                        for key in attrs.keys():
                            if decoder.has_key(key):
                                attrs[key]=map(utf82unicode, attrs[key])
                        # print '\tITEM=', dn, attrs
                        rdata[i]=(dn, attrs)

        self._expire_unicode_decoder()

        return rtype, rdata, rmsgid, decoded_serverctrls

    def add_ext(self, dn, modlist, *args, **kwargs):
        # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
        dn=unicode2utf8(dn)
        # print 'MODLIST', modlist
        modlist=encode_modlist(modlist, True)
        # print 'MODLIST unicode', modlist
        return self.ldapbaseclass.add_ext(self, dn, modlist, *args, **kwargs)

    def modify_ext(self, dn, modlist, *args, **kwargs):
        # args=(self,dn,modlist,serverctrls=None,clientctrls=None)
        dn=unicode2utf8(dn)
        # print 'MODLIST', modlist
        modlist=encode_modlist(modlist, False)
        # print 'MODLIST unicode', modlist
        return self.ldapbaseclass.modify_ext(self, dn, modlist, *args, **kwargs)

    def delete_ext(self, dn, *args, **kwargs):
        # args=(self,dn,serverctrls=None,clientctrls=None)
        dn=unicode2utf8(dn)
        return self.ldapbaseclass.delete_ext(self, dn, *args, **kwargs)

    def abandon_ext(self, msgid, *args, **kwargs):
        # args=(self,msgid,serverctrls=None,clientctrls=None)
        result=self.ldapbaseclass.abandon_ext(self, msgid, *args, **kwargs)
        self._remove_unicode_decoder(msgid)
        return result

    def cancel_ext(self, cancelid, *args, **kwargs):
        # args=(self,msgid,serverctrls=None,clientctrls=None)
        result=self.ldapbaseclass.cancel_ext(self, cancelid, *args, **kwargs)
        self._remove_unicode_decoder(cancelid)
        return result

class UnicodeLDAPObject(UnicodeLDAPInterface, LDAPObject):
    ldapbaseclass=LDAPObject

class UnicodeReconnectLDAPObject(UnicodeLDAPInterface, ReconnectLDAPObject):
    ldapbaseclass=ReconnectLDAPObject

if __name__=='__main__':
    import sys, os, time

    host='localhost'
    port=389
    base_dn='dc=asxnet,dc=loc'

    if True:
        who='cn=manager,cn=internal,dc=asxnet,dc=loc'
        cred='********'
    else:
        who='cn=nobody,cn=internal,dc=asxnet,dc=loc'
        cred='iMmTWz5pJ+lwY7i6M/BU61ngo1aBLyqQhRrrKbEc'

    ldap_url=ldapurl.LDAPUrl('ldap://%s:%d/%s' % (host, port, base_dn))
    ldap_url.applyDefaults({
        'who': who,
        'cred' : cred, })
    print ldap_url
    #l=LDAPObject(ldap_url.initializeUrl())
    #l=UnicodeLDAPObject(ldap_url.initializeUrl())
    l=UnicodeReconnectLDAPObject(ldap_url.initializeUrl())
    l.simple_bind_s(ldap_url.who, ldap_url.cred)
    print 'Connected as', l.whoami_s()

    first_name='Michael'
    first_name2=u'Micha\xebl'
    last_name=u'Str\xf6der'
    email='mi...@st...'
    street=u'Hauptstra\xe1e'
    country='Germany'

    cn='%s %s' %(first_name, last_name)
    dn='cn=%s,%s' %(cn, base_dn)
    info={
        u'cn' : (cn, ),
        'mail' : (email, ),
        'objectClass' : ('top', 'inetOrgPerson', 'kolabInetOrgPerson',),
        u'sn' : (last_name, ),
        u'givenName' : (first_name, ),
        u'street': (street, ),
        'c': (country, ),
        'telephoneNumber': '+49 1111111111',
        }

    ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,), info.keys())
    if ldap_result:
        print '== Found'
        _print_ldap_result(ldap_result)
        l.delete_s(dn)
        print '== Deleted'

    l.add_s(dn, ldap.modlist.addModlist(info))
    print '== Created'
    ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,), info.keys())
    _print_ldap_result(ldap_result)

    l.modify_s(dn, [(ldap.MOD_REPLACE, u'givenName', first_name2),
                    (ldap.MOD_ADD, 'telephoneNumber', ( '+49 1234567890', )),
                    ])

    print '==Modified'
    ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,), info.keys())
    _print_ldap_result(ldap_result)

    print '==Display once more'
    ldap_result=l.search_s(base_dn, ldap.SCOPE_ONELEVEL, '(cn=%s)' % (cn,),
        ['*', '+', u'dn', u'givenName', u'creatorsName'] )
    _print_ldap_result(ldap_result)


On 5/29/07, Alain Spineux <asp...@gm...> wrote:
>
> Hi Michael
>
> When investigating about python and unicode, I read somewhere (in a PEP
> I thing) that python functions should accept and manage unicode string
> as well as normal string. Of course if these strings could contains user
> readable characters.
>
> This is not the case for python-ldap functions. Sometime when calling
> python-ldap, we don't know well the origin ( a function is not suposed
> to know its caller :-) of the arguments we are using : user input,
> web interface, mysql database, ldap result, web form,
> literal, text file, text parsing... some are unicode, other not.
> I thing python-ldap function must accept unicode arguments.
>
> As we discussed at length previously, the decoding of the result is
> less easy because, the library cannot guess alone the meaning of
> these values.
>
> I'm not supporting the idea of downloading and use the ldap
> schema. I cannot imaging a connection less application
> like a web application doing that at any request! Or keeping a cache
> for the schema ...
>
> Anyway I see 2 solutions
>
> 1. Let result() return non unicode strings. _HERE_ The user know all
> returned strings are normal strings utf-8 encoded and he can do the
> encoding himself. A helper function doing the job for the result
> structure should be welcome.
>
> 2. Do the conversion regarding the info provided in the query, as my
> source sample does.
>
> I answer now some of your previous comment:
>
> > > In this case maybe is it possible to use [ '*', u'givenName', u'sn' ]
> > > to convert only 'givenName' and 'sn'
>
> > But then you will not gain much! Still the application has to know which
> > attributes have to be converted. => It's not worth hiding the conversion
> > within python-ldap.
>
> I don't really hide the conversion, because the user has to request it
> using unicode field name. And second, I do more work: I keep a link between
> the msgid and the request to know with fields I have to convert and
> also destroy the link when unneeded anymore.
>
> > The only clean solution would be something involving LDAP schema
> > processing!
>
> You know better than me how costly it is, in developing time, and its
> overhead for CPU and network load.
> Do you really consider to add the schema processing for unicode
> integration in the future? Or are you
> hoping that someone will send you a patch :-) ?
>
> I know you were not very exited by my ideas, anyway the unicode support
> for argument encoding is important. (this is my opinion)
> Feel free to suggest some cosmetic changes: function name, class name, the
> way I wrap your base class .....
>
> Keep in mind, none of my code break compatibility with existing
> application.
>
> Best regards.
>
> On 5/24/07, Alain Spineux <asp...@gm...> wrote:
> >
> > On 5/24/07, Michael Ströder <mi...@st...> wrote:
> > >
> > > Alain Spineux wrote:
> > > >
> > > > Yes but what about unknown field type ?
> > >
> > > If you really want to dive into this look in directory
> > > pylib/w2lapp/schema/ of web2ldap's source. It works for me but I did
> > > not consider this whole framework mature enough to be incorporated into
> > > python-ldap.
> >
> > I dont want to look at the schema:
> >
> > Here are the sources and the results.
> > I use your more appropriate name for unicode testing :-)

-- 
--
Alain Spineux
aspineux gmail com
May the sources be with you |
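The first option mentioned in the message above (leave the library returning raw UTF-8 strings and offer a helper that decodes a result structure) might look roughly like the following. This is a sketch under stated assumptions, not code from the thread: it is written for Python 3, where `bytes` plays the role of the Python 2 `str` values returned by the library, and the name `decode_result` and its parameters are hypothetical.

```python
def decode_result(entries, text_attrs, charset='utf-8', decode_dn=False):
    """Return a decoded copy of a search result (hypothetical helper).

    entries    -- list of (dn, {attr_name: [raw values]}) pairs, raw = bytes
    text_attrs -- names of the attributes the caller knows to be textual
    """
    # LDAP attribute names are case-insensitive, so compare in lower case.
    wanted = set(name.lower() for name in text_attrs)
    decoded = []
    for dn, attrs in entries:
        if decode_dn:
            dn = dn.decode(charset)
        new_attrs = {}
        for name, values in attrs.items():
            if name.lower() in wanted:
                values = [value.decode(charset) for value in values]
            new_attrs[name] = values
        decoded.append((dn, new_attrs))
    return decoded
```

Because the input is copied rather than modified, binary attributes such as a photo stay untouched, and the caller still holds the raw result if the guess about which attributes are textual turns out to be wrong.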
From: <mi...@st...> - 2007-07-19 13:39:17
|
Alain,

Alain Spineux wrote:
>
> When investigating about python and unicode, I read somewhere (in a PEP
> I thing) that python functions should accept and manage unicode string
> as well as normal string.

Without knowing the PEP (reference?) I guess this affects functions
which take a string as an argument and process it directly, returning a
result. In the context of python-ldap this would be directly applicable to
the functions in modules ldap.dn and ldap.filter.

The basic problem here is that, for the sake of backward compatibility with
LDAPv2, the charset has to be passed around as well. That's what I'm doing
in web2ldap.

> Of course if these strings could contains user
> readable characters.

Let's call that "textual strings".

> Anyway I see 2 solutions
>
> 1. Let result() return non unicode strings. _HERE_ The user know all
> returned strings are normal strings utf-8 encoded and he can do the
> encoding himself. A helper function doing the job for the result
> structure should be welcome.
>
> 2. Do the conversion regarding the info provided in the query, as my
> source sample does.
>
> I answer now some of your previous comment:
>
>> > In this case maybe is it possible to use [ '*', u'givenName', u'sn' ]
>> > to convert only 'givenName' and 'sn'
>
>> But then you will not gain much! Still the application has to know which
>> attributes have to be converted. => It's not worth hiding the conversion
>> within python-ldap.
>
> I don't really hide the conversion, because the user has to request it using
> unicode field name.

I don't like this approach. The type of the attribute names causes a
type conversion as a side effect. I don't consider this to be good design and
I guess most Python developers would not expect something like this.
Think about an application that accidentally passes in Unicode strings but is
not really prepared to get the Unicode/string mix back.

> Do you really consider to add the schema processing for unicode
> integration in the future?

Nope. It's up to the application programmer, especially depending on whether
LDAPv2 support is still needed for a particular application or not. I
consider python-ldap to be rather a low-level API.

> Keep in mind, none of my code break compatibility with existing application.

Generally I don't want to discourage people from working on something. But
sorry, I won't add your code to python-ldap's Lib/. I hope you're not
upset. My proposal would be to add it under Demo/ so your work can be
considered for use by others. Or you can put it on your own web page
(for further development) and I'll put a link to it on
http://python-ldap.sourceforge.net/docs.shtml.

Ciao, Michael. |
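The remark above that the charset has to be passed around can be illustrated for filter construction. The following is only a sketch, assuming Python 3 and two hypothetical helper names (`escape_filter_value`, `build_equality_filter`); it does not reproduce the `ldap.filter` module. The escaping itself follows the rules of RFC 4515 (backslash, `*`, `(`, `)` and NUL become `\5c`, `\2a`, `\28`, `\29` and `\00`).

```python
def escape_filter_value(value, charset='utf-8'):
    """Escape a textual assertion value (RFC 4515), then encode it."""
    # The backslash goes first so the escapes added below are not escaped again.
    replacements = (
        ('\\', r'\5c'),
        ('*', r'\2a'),
        ('(', r'\28'),
        (')', r'\29'),
        ('\x00', r'\00'),
    )
    for char, escaped in replacements:
        value = value.replace(char, escaped)
    return value.encode(charset)


def build_equality_filter(attr, value, charset='utf-8'):
    """Build the raw filter b'(attr=value)' from text, in the given charset."""
    return b'(' + attr.encode('ascii') + b'=' + escape_filter_value(value, charset) + b')'
```

For example, `build_equality_filter('cn', 'Michael Str\u00f6der')` yields the UTF-8 bytes `b'(cn=Michael Str\xc3\xb6der)'`, while passing another `charset` produces the same filter in that encoding; the conversion stays explicit and under the caller's control.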
From: Alain S. <asp...@gm...> - 2007-07-20 12:52:35
Attachments:
unicodeldap.py
|
Hi

First, I'm not upset by anything. You are responsible for keeping the
package in a healthy state. It is also your responsibility to add or
remove features and then to maintain them. Thanks for doing that.

As you suggested, I made a class wrapper that keeps both pieces of code
as independent as possible, and I'm happy with that.

Anyway, I have some comments about your answer ...

On 7/19/07, Michael Ströder <mi...@st...> wrote:
> Alain,
>
> Alain Spineux wrote:
> >
> > When investigating about python and unicode, I read somewhere (in a PEP
> > I thing) that python functions should accept and manage unicode string
> > as well as normal string.
>
> Without knowing the PEP (reference?) I guess this affects functions
> which takes a string as an argument and process it directly returning a
> result. In context of python-ldap this would be directly applicable to
> the functions in modules ldap.dn and ldap.filter.

Unicode strings in Python are made in a way that lets the developer use
them completely transparently. If the libraries respect this principle
too, the developer can exchange data between different sources (user
input, SQL, LDAP ...) without ever making any conversion.

The problem is that strings are also used for binary storage, and LDAP
makes no difference between the two usages (there are no charset and
unicode types like in SQL); only the developer knows, and only the
developer can make the conversion.

>
> The basic problem here is that for the sake of backward-compability to
> LDAPv2 the charset has to be passed around either. That's what I'm doing
> in web2ldap.
>
> > Of course if these strings could contains user
> > readable characters.
>
> Let's call that "textual strings".
>
> > Anyway I see 2 solutions
> >
> > 1. Let result() return non unicode strings. _HERE_ The user know all
> > returned strings are normal strings utf-8 encoded and he can do the
> > encoding himself. A helper function doing the job for the result
> > structure should be welcome.
> >
> > 2. Do the conversion regarding the info provided in the query, as my
> > source sample does.
> >
> > I answer now some of your previous comment:
> >
> >> > In this case maybe is it possible to use [ '*', u'givenName', u'sn' ]
> >> > to convert only 'givenName' and 'sn'
> >
> >> But then you will not gain much! Still the application has to know which
> >> attributes have to be converted. => It's not worth hiding the conversion
> >> within python-ldap.
> >
> > I don't really hide the conversion, because the user has to request it using
> > unicode field name.
>
> I don't like this approach. The type of the attribute names is causing a
> type conversion side-effect. I don't consider this to be good design and
> I guess most Python developers would not expect something like this.
> Think about an application accidently passing in Unicode strings but is
> not really prepared to get the Unicode/string mix.

Today, passing a unicode argument to the ldap functions raises an
exception, so no accident is possible :-)
On the other hand, with unicode support, things could accidentally
work as expected.
But this is only speculation about which inconvenience is worse.

>
> > Do you really consider to add the schema processing for unicode
> > integration in the future?
>
> Nope. It's up to the application programmer, especially based on whether
> LDAPv2 support is still needed for a particular application or not. I
> consider python-ldap to be rather a low-level API.
>
> > Keep in mind, none of my code break compatibility with existing application.
>
> Generally I don't want to discourage people to work on something. But
> sorry, I won't add your code to python-ldap's Lib/. I hope you're not
> upset. My proposal would be to add it under Demo/ so you're work can be
> considered to be used by others. Or you can put it on your own web page
> (for further development) and I'll put a link to it on
> http://python-ldap.sourceforge.net/docs.shtml.

I have no website today :-(
Please use the latest version, in the attachment.

Regards

>
> Ciao, Michael.
>

-- 
--
Alain Spineux
aspineux gmail com
May the sources be with you |
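The encoding side of the discussion (convert every textual item before it reaches the raw ldap call, so that mixed input does not raise an exception) can be sketched without touching the caller's data. This is an illustration only, assuming Python 3, where `str` stands in for the Python 2 `unicode` type and `bytes` for the raw strings the library expects; `MOD_REPLACE` is a local stand-in constant and `encode_modlist` here is a hypothetical variant, not the function from the attachment.

```python
MOD_REPLACE = 2  # local stand-in for the library's modify operation constant


def encode_modlist(modlist, charset='utf-8'):
    """Return a new modlist in which every text item is encoded.

    Accepts add-style entries (attr, values) as well as modify-style
    entries (op, attr, values); the input list is left unchanged.
    """
    def enc(item):
        # Only text is encoded; integers and raw byte strings pass through.
        return item.encode(charset) if isinstance(item, str) else item

    encoded = []
    for mod in modlist:
        head, values = mod[:-1], mod[-1]
        if isinstance(values, (list, tuple)):
            values = [enc(value) for value in values]
        else:
            values = enc(values)
        encoded.append(tuple(enc(part) for part in head) + (values,))
    return encoded
```

Detecting the tuple shape from its length replaces the `no_op` flag of the posted code, and returning a copy avoids the surprise of a caller's modlist being rewritten in place; both are design choices of this sketch rather than anything agreed in the thread.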