From: Wido D. <wid...@gm...> - 2005-01-10 14:13:51
|
Hi all,

I have a problem with Luma when fetching schema information from an Oracle Directory Server. Normal search operations work fine, but since I won't allow any editing of attributes without knowing the schema, I'm stuck for now. Upgrading to 2.0.6 didn't solve the problem either :( Here's the relevant part of the backtrace I get:

Traceback (most recent call last):
  File "/home/wido/src/luma/lib/luma/base/backend/ObjectClassAttributeInfo.py", line 83, in retrieveInfoFromServer
    subschemasubentry_dn,schema = ldap.schema.urlfetch(tmpUrl)
  File "/usr/lib/python2.3/site-packages/ldap/schema/subentry.py", line 415, in urlfetch
    parsed_sub_schema = ldap.schema.SubSchema(subschemasubentry_entry)
  File "/usr/lib/python2.3/site-packages/ldap/schema/subentry.py", line 51, in __init__
    se_instance = se_class(attr_value)
  File "/usr/lib/python2.3/site-packages/ldap/schema/models.py", line 49, in __init__
    d = extract_tokens(l,self.token_defaults)
  File "/usr/lib/python2.3/site-packages/ldap/schema/tokenizer.py", line 56, in extract_tokens
    assert l[0].strip()=="(" and l[-1].strip()==")",ValueError(l)
AssertionError: ['(', '2.16.840.1.113894.5.101.1.1063', 'NAME', 'orclUMCTGroupConfig', 'DESC', 'Configuration name defined in the Media Service', 's', 'Application', 'Profile', ' SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )']

The value of tmpUrl, the URL of the server, is something like this:

    tmpUrl = "ldap://my.oracle.server.com:389"

If you aren't able to fix this bug from the backtrace alone, I can give you the mail address of the server admin, who may give you access to his machine.

Regards, Wido
--
Wido Depping
ICQ: 51303067 AIM: wido3379 Jabber: wi...@ja...
Blog: http://widoww.blogspot.com
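For reference, the failing call can be reproduced with a minimal sketch like the following (Python 2, as used in the thread; the server URL is a placeholder, not the reporter's host, and error handling is omitted):

    import ldap.schema

    # Minimal sketch of the call Luma makes in retrieveInfoFromServer().
    # urlfetch() reads the sub schema sub entry and parses every schema
    # element, which is where the tokenizer assertion above is raised for
    # the Oracle attribute type.
    tmpUrl = "ldap://my.oracle.server.com:389"  # placeholder URL
    subschemasubentry_dn, schema = ldap.schema.urlfetch(tmpUrl)

    # List the OIDs of all attribute types the server published.
    print schema.listall(ldap.schema.AttributeType)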
From: <mi...@st...> - 2005-01-12 08:01:24
|
Wido Depping wrote:
> I've looked into this issue over the last hours and have found a
> solution to this. Unfortunately this means (from my perspective) that
> the old split_tokens(string) function must be replaced.

Thanks for diving into it.

> Since the old split_tokens(string) function wasn't documented and not
> well prepared for handling these special cases, I rewrote the
> function. It can be found here:
> http://home.tu-clausthal.de/~ifwd/luma/split_tokens.py

I'm currently testing your code.

> The README file specifies that the code should be compatible with
> Python 1.5. My code is compatible with this version, but it needs at
> least an import of the string module, which is provided with 1.5
> (checked in the docs on python.org).

Well, ldap.schema is probably not compatible with Python 1.5 anyway. I'd really like to drop support for older Python versions, but that's another issue.

> In the current shape, the function is only 3% slower than the old
> code, but works with all LDAP servers, including Oracle now.

I will test. I'd be happy if you could provide an LDIF dump of a sub schema entry of Oracle's OID.

> But if we could use string.split(aList), this algorithm would be
> 3 times faster than the current one.

I currently do not have the time to analyse this any further. Can you please explain why string.split(aList) can't be used within your implementation?

> I leave it up to you to integrate my function, or find your own
> solution.

I'd add your code to python-ldap provided you give away the copyright for it to the python-ldap project. Please confirm (list Cc:-ed).

Ciao, Michael.
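As an aside, the requested LDIF dump of a sub schema sub entry can be produced with a short sketch along these lines (placeholder URL, anonymous read access assumed, no error handling; this is not part of the thread's attachments):

    import sys
    import ldap
    import ldif

    # Sketch of one way to dump a server's sub schema sub entry to LDIF.
    l = ldap.initialize("ldap://my.oracle.server.com:389")  # placeholder

    # The root DSE points to the sub schema sub entry.
    root_dse = l.search_s("", ldap.SCOPE_BASE, "(objectClass=*)",
                          ["subschemaSubentry"])
    subschema_dn = root_dse[0][1]["subschemaSubentry"][0]

    # Read the schema attributes; many servers require the subschema filter.
    dn, entry = l.search_s(subschema_dn, ldap.SCOPE_BASE,
                           "(objectClass=subschema)",
                           ["attributeTypes", "objectClasses",
                            "ldapSyntaxes", "matchingRules"])[0]

    # Write the entry as LDIF to stdout.
    ldif.LDIFWriter(sys.stdout).unparse(dn, entry)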
From: Wido D. <wid...@gm...> - 2005-01-12 21:43:27
|
On Wed, 12 Jan 2005 07:51:29 +0100, Michael Ströder <mi...@st...> wrote:
> > In the current shape, the function is only 3% slower than the old
> > code, but works with all LDAP servers, including Oracle now.
>
> I will test. I'd be happy if you could provide an LDIF dump of a sub
> schema entry of Oracle's OID.

I'll do that tomorrow.

> > But if we could use string.split(aList), this algorithm would be
> > 3 times faster than the current one.
>
> I currently do not have the time to analyse this any further. Can you
> please explain why string.split(aList) can't be used within your
> implementation?

Hmm, nothing, actually. I may not have expressed myself clearly enough. I wasn't sure whether you were following certain coding conventions, so I tried to keep the algorithm as simple as possible and the usage of the string module low. But since we can use this module, that part can be replaced by string.split().

I have a more general question about some things you do in the source code. At some points you assign a function from a module to a variable, like "string_split = string.split". I've asked myself why you do this. Is it to improve speed? Your assignment would spare a lookup of split in the string module and be somewhat faster. But some quick testing showed that this trick only gains 0.3 seconds when splitting a string 1,000,000 times.

> > I leave it up to you to integrate my function, or find your own
> > solution.
>
> I'd add your code to python-ldap provided you give away the copyright
> for it to the python-ldap project. Please confirm (list Cc:-ed).

I hereby grant the python-ldap project (http://python-ldap.sourceforge.net/) my copyright to all parts of the code which are meant to be included into python-ldap. This includes all future commits, unless otherwise stated.

Regards, Wido
--
Wido Depping
ICQ: 51303067 AIM: wido3379 Jabber: wi...@ja...
Blog: http://widoww.blogspot.com
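The speed-up from binding string.split to a name can be measured with a small, hypothetical timeit session in the spirit of the figure quoted above (not the actual test Wido ran):

    import timeit

    # Compares looking up string.split on every call against binding it
    # to a name once in the setup code.
    per_call_lookup = timeit.Timer(
        "string.split('a b c d e')",
        "import string")
    bound_name = timeit.Timer(
        "string_split('a b c d e')",
        "import string; string_split = string.split")

    print "string.split:", per_call_lookup.timeit(1000000)
    print "string_split:", bound_name.timeit(1000000)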
From: <mi...@st...> - 2005-01-20 09:45:58
Attachments:
test_tokenizer.py
|
Wido,

Michael Ströder wrote:
>> Since the old split_tokens(string) function wasn't documented and not
>> well prepared for handling these special cases, I rewrote the
>> function. It can be found here:
>> http://home.tu-clausthal.de/~ifwd/luma/split_tokens.py
>
> I'm currently testing your code.

There seems to be something wrong with your version of the function ldap.schema.tokenizer.split_tokens(). I've experienced some strange issues when accessing MS AD with web2ldap.

Find attached a script I'm using for regression testing. It fails with your implementation. Maybe it helps you to sort out the problems. Please extend this script to also cover the Oracle-specific test cases and send it back to me. Hmm, the script should also be in Tests/Lib/ldap/schema/test_tokenizer.py of python-ldap's CVS. (Not sure, I'm off-line on the train while writing this.)

Ciao, Michael.
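The attached script is not reproduced in the archive; a sketch of the general shape of such a regression test, assuming the one-argument split_tokens() of that era, might look like this:

    from ldap.schema.tokenizer import split_tokens

    # Sketch only -- not the attached test_tokenizer.py.  Each test case
    # pairs a schema description with the token list split_tokens() is
    # expected to return (quotes stripped, quoted values kept whole).
    TESTCASES = [
        (
            "( 1.2.3.4 NAME 'testAttr' DESC 'a description with spaces' )",
            ["(", "1.2.3.4", "NAME", "testAttr",
             "DESC", "a description with spaces", ")"],
        ),
        (
            "( 1.2.3.4 NAME 'testAttr' SYNTAX 1.3.6.1.4.1.1466.115.121.1.15 )",
            ["(", "1.2.3.4", "NAME", "testAttr",
             "SYNTAX", "1.3.6.1.4.1.1466.115.121.1.15", ")"],
        ),
    ]

    failed = 0
    for source, expected in TESTCASES:
        result = split_tokens(source)
        if result != expected:
            failed = failed + 1
            print "FAILED:", repr(source)
            print "  got:     ", result
            print "  expected:", expected
    print failed, "of", len(TESTCASES), "test cases failed"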
From: Wido D. <wid...@gm...> - 2005-01-24 17:22:40
Attachments:
split_tokens.py
test_tokenizer.py
|
On Thu, 20 Jan 2005 09:23:22 +0100, Michael Ströder <mi...@st...> wrote:
> There seems to be something wrong with your version of the function
> ldap.schema.tokenizer.split_tokens(). I've experienced some strange
> issues when accessing MS AD with web2ldap.
>
> Find attached a script I'm using for regression testing. It fails with
> your implementation. Maybe it helps you to sort out the problems. Please
> extend this script to also cover the Oracle-specific test cases and send
> it back to me. Hmm, the script should also be in
> Tests/Lib/ldap/schema/test_tokenizer.py of python-ldap's CVS.
> (Not sure, I'm off-line on the train while writing this.)
>
> Ciao, Michael.

Hi Michael,

Attached are the updated test case file and my improved split_tokens() function. This time I use string.split() and string.join() to benefit from their speed. The added checking needed to comply with the test cases means that the new algorithm is only slightly faster than the old one. But now it should work with all known servers :)

Regards, Wido
--
Wido Depping
ICQ: 51303067 AIM: wido3379 Jabber: wi...@ja...
Blog: http://widoww.blogspot.com
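The attached split_tokens.py is likewise not in the archive. As an illustration of the general approach (split on the quote character first so quoted values survive, then whitespace-split the rest), a sketch could look like the following; it is not Wido's code and it does not handle escaped quotes:

    import string

    def split_tokens_sketch(s):
        """Illustrative quote-aware tokenizer (not the attached file).

        Splits on the single-quote character first so that quoted values
        such as DESC 'text with spaces' stay in one piece, then
        whitespace-splits the unquoted fragments.
        """
        result = []
        parts = string.split(s, "'")
        for i in range(len(parts)):
            if i % 2:
                # Odd index: text that was inside quotes -- keep verbatim.
                result.append(parts[i])
            else:
                # Even index: text outside quotes -- split on whitespace.
                result.extend(string.split(parts[i]))
        return result

    # Example:
    # split_tokens_sketch("( 1.2.3.4 DESC 'a description with spaces' )")
    # -> ['(', '1.2.3.4', 'DESC', 'a description with spaces', ')']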
From: <mi...@st...> - 2005-01-27 16:46:11
|
Wido Depping wrote:
> Attached are the updated test case file and my improved split_tokens() function.

Hmm, I really dislike your constant keywordDict, although I somewhat like the idea behind it. But this list of keywords is simply not sufficient.

=> I modified the function ldap.schema.tokenizer.extract_tokens() to pass the argument known_tokens down to your version of split_tokens() as the new argument keywordDict.

But now the test_tokenizer.py script does not work anymore, although web2ldap and the script Demo/schema.py seem to work with it. Well, this surely needs more testing.

Please remember to send me an LDIF file of Oracle's sub schema sub entry.

Ciao, Michael.
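A hedged illustration of the idea behind keywordDict / known_tokens (not the CVS code): tokens matching a known keyword start a new field, everything else is treated as value text, which also shows why a hard-coded keyword list breaks down for vendor-specific keywords:

    # Illustrative only -- not ldap.schema.tokenizer.extract_tokens().
    KNOWN_TOKENS = {
        "NAME": None, "DESC": None, "OBSOLETE": None,
        "SYNTAX": None, "SINGLE-VALUE": None,
    }

    def group_by_keyword(tokens, known_tokens):
        """Group a flat token list into {keyword: [values]}.

        Any token that is not a known keyword is appended to the values of
        the keyword seen last; an unknown vendor keyword is therefore
        silently swallowed as a value, which is why a fixed list is fragile.
        """
        grouped = {}
        current = None
        for token in tokens:
            if token in known_tokens:
                current = token
                grouped[current] = []
            elif current is not None:
                grouped[current].append(token)
        return grouped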
From: <mi...@st...> - 2005-02-25 17:56:30
|
Wido Depping wrote:
> Attached are the updated test case file and my improved split_tokens() function.
> This time I use string.split() and string.join() to benefit from their
> speed. The added checking needed to comply with the test cases means
> that the new algorithm is only slightly faster than the old one. But
> now it should work with all known servers :)

Your patches were added to CVS. The test script I've checked in does not work, although the new version of split_tokens() seems to work correctly. But I've committed it anyway to save the new test cases.

Note that a change was also necessary in ldap.schema.models:
http://cvs.sourceforge.net/viewcvs.py/python-ldap/python-ldap/Lib/ldap/schema/models.py?r1=1.26&r2=1.27
(CVS changes might not yet be up-to-date in the web interface.)

Ciao, Michael.
From: <mi...@st...> - 2005-09-20 06:57:57
|
Wido,

Michael Ströder wrote:
> Wido Depping wrote:
>
>> Attached are the updated test case file and my improved split_tokens()
>> function.
>
> Your patches were added to CVS.

This new implementation of split_tokens() deletes the spaces from DESC schema attributes. I looked at the code, and the main problem is right at the beginning of the function:

    tokenList = s.split(" ")

From my understanding there's no way to preserve the spaces in DESC when using this code. I think there could be a better solution based on using keywordDict to find the relevant sections in the string. Still, there would be some issues with detecting whether a keyword appears inside quotes.

Ciao, Michael.
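A minimal demonstration (not from the thread) of the problem: once the string has been split on single spaces, the grouping and the exact spacing inside a quoted DESC value are already lost:

    # After s.split(" "), the quoted DESC value is broken into fragments
    # and its double spaces are reduced to empty strings, so the original
    # value cannot be reliably reassembled.
    s = "( 1.2.3.4 DESC 'two  spaces  inside' )"
    print s.split(" ")
    # -> ['(', '1.2.3.4', 'DESC', "'two", '', 'spaces', '', "inside'", ')']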