From: Ed . <ep...@ho...> - 2003-06-16 19:41:18
|
Hi,

I was tuning an LDAP directory for a client last week and had cause to run some before and after benchmarks.

Basically, for a 3000 entry directory I wrote a Python script which did the following:

- listed each entry using the filter (cn=*), both with python-ldap and by invoking the shell to run the ldapsearch command. Each was done twice: once returning all attributes and once returning just the cn attribute.

- did 3000 random lookups using (cn=exact-match), and then (cn=exact-match*), again using python-ldap and the ldapsearch command.

The searches were run twice on unloaded machines, the first time to populate caches, the second time as a rough best-performance figure.

The findings were somewhat surprising.

In the whole-directory listing, ldapsearch was generally and consistently at least 30% faster than python-ldap. These figures apply both before and after tuning the directory. Remember that the Python searches are pre-bound, while ldapsearch binds each time it is called.

In the random lookup test, the performance figures were comparable, but this compares calling python-ldap to do a search against spawning a shell, running ldapsearch, binding and then doing the search, i.e. the command line search has a LOT more overhead.

I'm happy to run some tests to identify the cause to see if we can fix it. Any suggestions where to start?

General conclusions from my tests:

- python-ldap has a surprising performance penalty
- searching is helped by having ample cache (doh!)
- returning 1 attribute is much faster than returning all of them (doh!)
- searching on indexed attributes helps a lot (doh!)

Ed
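The comparison described above could be reproduced with something like the following. This is only a rough sketch, not the script used for the figures in this message: the helper names, the anonymous bind, and the choice to discard ldapsearch output are all assumptions made for illustration, and ldapsearch is started directly rather than through a shell.

```python
import subprocess
import time


def ldapsearch_argv(uri, base, filterstr, attrs=()):
    """Build an ldapsearch command line (anonymous simple bind is an assumption)."""
    return ["ldapsearch", "-x", "-LLL", "-H", uri, "-b", base, filterstr, *attrs]


def run_ldapsearch(uri, base, filterstr, attrs=()):
    """Spawn ldapsearch once; the new process connects and binds on every call."""
    subprocess.run(ldapsearch_argv(uri, base, filterstr, attrs),
                   stdout=subprocess.DEVNULL, check=True)


def make_python_ldap_search(uri, base, filterstr, attrs=None):
    """Return a callable that reuses one pre-bound python-ldap connection."""
    import ldap  # python-ldap, imported lazily so the timing helper works without it
    conn = ldap.initialize(uri)
    conn.simple_bind_s()  # anonymous bind; pass a DN and password if required
    return lambda: conn.search_s(base, ldap.SCOPE_SUBTREE, filterstr, attrs)


def time_second_run(fn):
    """Call fn twice: once to warm the caches, once timed. Returns seconds."""
    fn()
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start
```

With a directory available, `time_second_run(lambda: run_ldapsearch(uri, base, "(cn=*)", ["cn"]))` can be compared against `time_second_run(make_python_ldap_search(uri, base, "(cn=*)", ["cn"]))`, where `uri` and `base` are placeholders for the local setup.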
From: <mi...@st...> - 2003-06-20 09:37:16
|
Ed . wrote:
>
> I'm happy to run some tests to identify the cause to see if we can fix
> it, any suggestions where to start?
>
> General conclusions from my tests:
>
> python-ldap has a surprising performance penalty

Please tell us which versions of python-ldap and OpenLDAP you're using. And please post your Python code and OpenLDAP's indexing configuration.

Ciao, Michael.
From: <bjo...@it...> - 2003-08-04 16:44:24
|
Ed .:
> Hi,
>
> I was tuning an LDAP directory for a client last week and had cause to run
> some before and after benchmarks.
>
> Basically for a 3000 entry directory I wrote a python script which did the
> following:
>
> listed each entry using the filter (cn=*) using python-ldap and also
> invoking the shell to use the ldapsearch command. These were done twice:
> returning all attributes and just returning the cn attribute
>
> did 3000 random lookups using (cn=exact-match), and then (cn=exact-match*)
> again using python-ldap and the ldapsearch command.
>
> The searches were run twice on unloaded machines, the first time to
> populate caches, the second time as a rough best-performance figure
>
> The findings were somewhat surprising.
>
> In the list whole directory search, ldapsearch was generally and
> consistently at least 30% faster than python-ldap. I.e. these figures apply
> before and after tuning the directory. Remember the python searches are
> pre-bound while ldapsearch binds each time it is called.
>
> In the random lookup test, the performance figures were comparable but this
> compares calling python-ldap to do a search against spawning a shell,
> running ldapsearch, binding then doing the search, i.e. the command line
> search has a LOT more overhead.
>
> I'm happy to run some tests to identify the cause to see if we can fix it,
> any suggestions where to start?

Just ran through some tests of my own. I have a function that walks through one ou (ou=users,dc=mydomain,dc=com), finds every entry with objectclass=posixAccount and gets the uid of that account using the function search_ext_s. The result from the search is assigned to a variable named 'res', and then the function quits. Execution time for this Python script is approx. 1m30s.

Running the same query with ldapsearch from the OpenLDAP package, the query executes, prints output to the console and exits in some 30s or so. The total number of entries found with that objectClass is approx. 27k.

> General conclusions from my tests:
>
> python-ldap has a surprising performance penalty
>
> searching is helped by having ample cache (doh!)

I'm using OpenLDAP 2.1.22 with BerkeleyDB 4.1.25 with the latest patch. When BDB is used as the backend, OpenLDAP cannot handle caching - BDB has to. I found a mail on the openldap-software@ mailing list describing this issue and how to set up caching for BDB. That helped neither my LDAP search from within Python nor the one with ldapsearch.

> returning 1 attribute is much faster than returning all of them (doh!)

Here I got help from a fellow worker... Well... when running ldapsearch (from OpenLDAP) and supplying the argument '-S uid' for sorting, we got a result in about 40s. When the -S argument is applied, it seems like ldapsearch fetches the entries one by one into some kind of data structure and sorts them before displaying them.

Now, it seems like the python-ldap module does the same thing: fetching the result one by one, like ldapsearch does with -S <attr> as argument.

Is there any way to force search_ext_s to fetch all at once and not one by one, other than changing the source code of python-ldap?

--
Regards
Bjørn Ove Grøtan

"Resistance is futile. You will be assimilated."
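One way to look into the "all at once" versus "one by one" question is to time the two retrieval styles separately using python-ldap's asynchronous calls. The following is a minimal sketch, assuming `conn` is an already-bound python-ldap connection object; the function names are made up for illustration, and the code only relies on `search_ext()` returning a message id and `result(msgid, all=...)` returning a `(type, data)` pair.

```python
SCOPE_SUBTREE = 2  # same numeric value as ldap.SCOPE_SUBTREE


def fetch_all_at_once(conn, base, filterstr, attrs, scope=SCOPE_SUBTREE):
    """Start the search, then ask for the complete result set in one call."""
    msgid = conn.search_ext(base, scope, filterstr, attrs)
    _rtype, data = conn.result(msgid, all=1)
    return data


def fetch_one_by_one(conn, base, filterstr, attrs, scope=SCOPE_SUBTREE):
    """Start the search, then pull messages one at a time until none are left."""
    msgid = conn.search_ext(base, scope, filterstr, attrs)
    entries = []
    while True:
        _rtype, data = conn.result(msgid, all=0)
        if not data:  # an empty list marks the end of the result stream
            break
        entries.extend(data)
    return entries
```

Timing both functions against the same ou with the filter (objectClass=posixAccount) and the attribute list ['uid'] should show whether the retrieval style accounts for the difference, or whether the time goes elsewhere.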