From: <jph...@fr...> - 2007-02-19 08:08:47

Hi,

I used to work with phpldapadmin, then discovered Luma. Now I need something more specific to my needs, so I am interested in python-ldap.

Getting started was really easy :-) Unfortunately, I discovered a bug in my application this weekend: I cannot use 8-bit characters in the search or add functions, while I have no issue with these characters in Luma :-( It seems to be related to string encoding. Luma makes intensive use of the "unicode" function, so I tried this piece of code, but without success:

>>> import ldap
>>> CON=ldap.open('localhost.localdomain')
>>> CON.simple_bind_s('','')
>>> CON.search_s('ou=amis,dc=localdomain',ldap.SCOPE_SUBTREE,u'(cn=thé*)')
exceptions.UnicodeEncodeError    Traceback (most recent call last)

/home/jpht/informatique/programmation/python/scripts/under_development/<console>

/usr/lib/python2.3/site-packages/ldap/ldapobject.py in search_s(self, base, scope, filterstr, attrlist, attrsonly)
    466
    467   def search_s(self,base,scope,filterstr='(objectClass=*)',attrlist=None,attrsonly=0):
--> 468     return self.search_ext_s(base,scope,filterstr,attrlist,attrsonly,timeout=self.timeout)
    469
    470   def search_st(self,base,scope,filterstr='(objectClass=*)',attrlist=None,attrsonly=0,timeout=-1):

/usr/lib/python2.3/site-packages/ldap/ldapobject.py in search_ext_s(self, base, scope, filterstr, attrlist, attrsonly, serverctrls, clientctrls, timeout, sizelimit)
    459
    460   def search_ext_s(self,base,scope,filterstr='(objectClass=*)',attrlist=None,attrsonly=0,serverctrls=None,clientctrls=None,timeout=-1,sizelimit=0):
--> 461     msgid = self._ldap_call(self._l.search_ext,base,scope,filterstr,attrlist,attrsonly,serverctrls,clientctrls,timeout,sizelimit)
    462     return self.result(msgid,all=1,timeout=timeout)[1]
    463

/usr/lib/python2.3/site-packages/ldap/ldapobject.py in _ldap_call(self, func, *args, **kwargs)
     92     try:
     93       try:
---> 94         result = func(*args,**kwargs)
     95       finally:
     96         self._ldap_object_lock.release()

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 6: ordinal not in range(128)

I went quickly through this list, but was not sure I understood the answer ;-( Could someone help me?

Jean-Philippe

P.S.: I am running slapd 2.2.23 and python-ldap 2.0.4 (Debian sarge)
From: Bjorn O. G. <bjo...@it...> - 2007-02-19 08:18:15

jph...@fr...:
> Hi,
>
> I used to work with phpldapadmin, then discovered Luma. Now I need something
> more specific to my needs, so I am interested in python-ldap.
>
> Getting started was really easy :-) Unfortunately, I discovered a bug in my
> application this weekend: I cannot use 8-bit characters in the search or add
> functions, while I have no issue with these characters in Luma :-( It seems to
> be related to string encoding. Luma makes intensive use of the "unicode"
> function, so I tried this piece of code, but without success:
>
> >>> import ldap
> >>> CON=ldap.open('localhost.localdomain')
> >>> CON.simple_bind_s('','')
> >>> CON.search_s('ou=amis,dc=localdomain',ldap.SCOPE_SUBTREE,u'(cn=th?)')
<snip>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 6:
> ordinal not in range(128)

LDAP uses UTF-8, so convert any umlauts or other non-ASCII characters to UTF-8 before doing a query. Almost anything can be converted _to_ UTF-8, but not everything will convert back to the encoding you came from.

# Python 2: convert an ISO-8859-1 byte string to a UTF-8 byte string
def utf8_encode(s):
    return unicode(s, "iso-8859-1").encode("utf-8")

# Python 2: convert a UTF-8 byte string back to ISO-8859-1
def utf8_decode(s):
    return unicode(s, "utf-8").encode("iso-8859-1")

Usage:

filter = utf8_encode('cn=Bjørn Ove Grøtan')

Just change iso-8859-1 to whatever encoding you use by default.

--
Regards
Bjørn Ove Grøtan
Luma Debian Maintainer
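
The remark that not everything converts back can be illustrated with a small sketch. It is written for Python 3, where `str` is text and `bytes` is the encoded form (in the Python 2 of this thread these are `unicode` and `str`). The helper names `to_utf8` and `from_utf8` are made up for this illustration and are not part of python-ldap.

```python
# Python 3 sketch; to_utf8 and from_utf8 are hypothetical helper names.

def to_utf8(text):
    """Encode text to UTF-8 bytes, the encoding LDAP uses."""
    return text.encode("utf-8")


def from_utf8(data, target="iso-8859-1"):
    """Decode UTF-8 bytes and re-encode them in a legacy charset.

    Raises UnicodeEncodeError if a character does not exist in target.
    """
    return data.decode("utf-8").encode(target)


name = "Bj\xf8rn Ove Gr\xf8tan"  # \xf8 is the letter o with stroke
encoded = to_utf8(name)

# Every character of the name exists in ISO-8859-1, so the way back works.
latin1 = from_utf8(encoded)

# The euro sign is valid in UTF-8 but has no code point in ISO-8859-1,
# so converting back fails.
try:
    from_utf8(to_utf8("5 \u20ac"))
    lossless = True
except UnicodeEncodeError:
    lossless = False
```

If the target charset cannot be chosen freely, passing an `errors` argument such as `"replace"` to `encode` avoids the exception at the cost of losing the character.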
From: <mi...@st...> - 2007-02-19 11:05:17

jph...@fr... wrote:
>
> I can not use 8 bits chars in search or add functions while I
> have no issue with these characters in Luma :-( It seems to be related to string
> encoding. Luma makes an intensive use of the "unicode" function. So I tried that
> piece of code but without success:
>
> >>> import ldap
> >>> CON=ldap.open('localhost.localdomain')
> >>> CON.simple_bind_s('','')
> >>> CON.search_s('ou=amis,dc=localdomain',ldap.SCOPE_SUBTREE,u'(cn=th�*)')

You have to pass raw byte strings to python-ldap, since the API does not handle Unicode strings automatically.

So something like u'(cn=th�*)'.encode('utf-8').

I am not sure what the � is in your e-mail, though. Therefore I'd recommend not writing 8-bit characters directly into the Python source code. Use the \x.. escapes instead.

Here is a complete example with my last name, run in the Python shell with the shell charset set to UTF-8:

Python 2.5 (r25:51908, Nov 27 2006, 19:14:46)
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> ustr=unicode('Ströder','utf-8')
>>> ustr
u'Str\xf6der'
>>> ustr.encode('utf-8')
'Str\xc3\xb6der'
>>>

Ciao, Michael.
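
As a rough illustration of the advice above, the following sketch shows why the original call failed and what the explicit encoding produces. It is written for Python 3 and does not call python-ldap at all. The filter value is an assumption reconstructed from the error message, which reports u'\xe9' at position 6. In Python 2, passing a `unicode` object where a byte string is expected triggers the same ASCII encoding implicitly.

```python
# Python 3 sketch; the filter string is an assumed reconstruction.
filterstr = "(cn=th\xe9*)"  # \xe9 written as an escape, as recommended

# Encoding with the ASCII codec fails on the first non-ASCII character.
try:
    filterstr.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Encoding explicitly as UTF-8 yields a byte string for the search filter.
utf8_filter = filterstr.encode("utf-8")

print(ascii_error)
print(utf8_filter)
```

The exception's `start` attribute gives the index of the offending character, which matches the "position 6" in the traceback.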
From: Jean-Philippe T. <jph...@fr...> - 2007-02-19 21:06:42

On Mon, 19 Feb 2007 12:04:46 +0100 Michael Ströder <mi...@st...> wrote:

> jph...@fr... wrote:
> >
> > I can not use 8 bits chars in search or add functions while I
> > have no issue with these characters in Luma :-( It seems to be related to string
> > encoding. Luma makes an intensive use of the "unicode" function. So I tried that
> > piece of code but without success:
> >
> > >>> import ldap
> > >>> CON=ldap.open('localhost.localdomain')
> > >>> CON.simple_bind_s('','')
> > >>> CON.search_s('ou=amis,dc=localdomain',ldap.SCOPE_SUBTREE,u'(cn=th�*)')
>
> You have to pass raw byte strings to python-ldap, since the API does not
> handle Unicode strings automatically.
>
> So something like u'(cn=th�*)'.encode('utf-8').
>
> I am not sure what the � is in your e-mail, though. Therefore I'd recommend
> not writing 8-bit characters directly into the Python source code. Use
> the \x.. escapes instead.
>
> Here is a complete example with my last name, run in the Python shell with
> the shell charset set to UTF-8:
>
> Python 2.5 (r25:51908, Nov 27 2006, 19:14:46)
> [GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> ustr=unicode('Ströder','utf-8')
> >>> ustr
> u'Str\xf6der'
> >>> ustr.encode('utf-8')
> 'Str\xc3\xb6der'
> >>>
>
> Ciao, Michael.
>

Thanks to you (and Bjorn ;-)). It works far better now.

Life would have been far easier if English were full of strange chars :-)

Jean-Philippe