|
From: Jens V. <je...@zo...> - 2002-06-27 11:45:39
|
On Wednesday, June 26, 2002, at 04:58 , Michael Ströder wrote:
> It's possible to make it somewhat simpler since we have a first result()
> call before the while loops.
>
> while all:
>   while ldap_result[0] is None:
>     if (timeout>=0) and (time.time()-start_time>timeout):
>       self._ldap_call(self._l.abandon,msgid)
>       raise _ldap.TIMELIMIT_EXCEEDED(
>         "LDAP time limit (%d secs) exceeded." % (timeout)
>       )
>     time.sleep(0.0001)
>     ldap_result = self._ldap_call(self._l.result,msgid,0,0)
>   if ldap_result[1] is None:
>     break
>   all_results.extend(ldap_result[1])
>   ldap_result = None,None
> return all_results
>
this simplified version seems to slow down my setup. all of a sudden i get
only 50-70% of my previous speed. here is a result set with the result
method changed to the format shown above::
*** Read the RootDSE on same connection
50.432497 searches/second
*** Read the RootDSE on newly created connection without extra simple bind
49.699850 searches/second
*** Read the RootDSE on newly created connection with an extra simple bind
31.885130 searches/second
here's a result set with leif's version::
*** Read the RootDSE on same connection
100.785835 searches/second
*** Read the RootDSE on newly created connection without extra simple bind
95.549340 searches/second
*** Read the RootDSE on newly created connection with an extra simple bind
49.825837 searches/second
jens
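The slowdown reported above can be checked directly from the two result sets. The sketch below only redoes the arithmetic on the posted figures; the list and function names are made up for illustration.

```python
# Throughput figures (searches/second) copied from the two result sets above.
simplified = [50.432497, 49.699850, 31.885130]   # sleep before every result() call
leif = [100.785835, 95.549340, 49.825837]        # sleep only after an empty result

labels = [
    "same connection",
    "new connection, no extra bind",
    "new connection, extra bind",
]


def relative_speed(new, old):
    """Return the new throughput as a percentage of the old throughput."""
    return 100.0 * new / old


ratios = [relative_speed(s, l) for s, l in zip(simplified, leif)]
for label, ratio in zip(labels, ratios):
    print("%-32s %5.1f%%" % (label, ratio))
```

On these numbers the simplified loop reaches roughly 50%, 52% and 64% of the throughput measured with leif's version.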
|
|
From: Jens V. <je...@zo...> - 2002-06-27 11:42:24
|
michael,
you're right, that was my mistake. i was transcribing the values from the
screen on a second machine, it was not done via copy/paste. here's a
current copy/paste result::
*** Read the RootDSE on same connection
100.785835 searches/second
*** Read the RootDSE on newly created connection without extra simple bind
95.549340 searches/second
*** Read the RootDSE on newly created connection with an extra simple bind
49.825837 searches/second
jens
On Wednesday, June 26, 2002, at 04:35 , Michael Ströder wrote:
> Jens Vagelpohl wrote:
>> - with leif's "fix":
>> *** Read the RootDSE on same connection
>> 82.275864 searches/second
>> *** Read the RootDSE on newly created connection without extra simple bind
>> 85.599961 searches/second
>> *** Read the RootDSE on newly created connection with an extra simple bind
>> 47.726886 searches/second
>
> Strange that you get more performance in the second case with a new
> connection established each time. Is that always the same when you repeat
> the test?
>
> Ciao, Michael.
>
|
|
From: Jens V. <je...@zo...> - 2002-06-27 11:31:43
|
yikes no.... LOL

jens

On Wednesday, June 26, 2002, at 02:19 , Michael Ströder wrote:
> Derrick, Jens,
>
> did you read the posting sent by Mauro about the reverse NETBIOS name
> lookups? Are you by any chance in a Win32 DNS/WINS environment?
>
> Ciao, Michael.
|
|
From: <mi...@st...> - 2002-06-27 06:47:18
|
Jens Vagelpohl wrote:
>
> - with leif's "fix":
>
> *** Read the RootDSE on same connection
> 82.275864 searches/second
> *** Read the RootDSE on newly created connection without extra simple bind
> 85.599961 searches/second
> *** Read the RootDSE on newly created connection with an extra simple bind
> 47.726886 searches/second

Strange that you get more performance in the second case with a new
connection established each time. Is that always the same when you
repeat the test?

Ciao, Michael.
|
|
From: <mi...@st...> - 2002-06-27 06:47:17
|
Franco Spinelli wrote:
> I have installed openldap 2.1.2 from openldap site
> [..]
> from _ldap import *
>> ImportError: ld.so.1: /usr/local/bin/python: fatal: relocation error:
>> file /usr/local/lib/python2.2/site-packages/_ldap.so:
>> symbol ldap_url_search: referenced symbol not found

Hmm, it seems that ldap_url_search is no longer available in
OpenLDAP 2.1. So far I've never tested python-ldap with OpenLDAP 2.1.

You have two options:

* Use OpenLDAP 2.0.x to build python-ldap against that (see
  parameters library_dirs and include_dirs in setup.cfg to specify a
  separate OpenLDAP 2.0.x directory you used for OpenLDAP 2.0.x's
  ./configure --prefix)

* Remove any reference to ldap_url_search in Modules/LDAPObject.c.
  web2ldap does not use this function.

Ciao, Michael.
|
|
From: <mi...@st...> - 2002-06-27 06:47:17
|
Derrick 'dman' Hudson wrote:
> On Tue, Jun 25, 2002 at 08:28:15AM -0400, Jens Vagelpohl wrote:
> | i have seen similar strange slowdowns **if i connect to localhost**.
> | connecting to OpenLDAP on another host across the network was much faster
> | (meaning: it worked at the expected speed).
>
> Actually, you're right. I never tried from a different host until you
> said this. I presumed using the loopback interface would have the
> least amount of delay due to eliminating network hardware from the
> equation.

Derrick, Jens,

did you read the posting sent by Mauro about the reverse NETBIOS name
lookups? Are you by any chance in a Win32 DNS/WINS environment?

Ciao, Michael.
|
|
From: <mi...@st...> - 2002-06-27 06:47:17
|
Leif Hedstrom wrote:
> Well, I don't know if this is the same problem I had with Python LDAP v1.x,
> we haven't tested v2.x yet. But, the result() function in Python LDAP can go
> into a very tight poll loop, with extreme effects if the Python process is
> running on the same machines as the LDAP server. Python will almost
> completely starve slapd for any CPU time ...
>
> Adding a short sleep() in the polling loop of ldapobject.result() helps, a
> lot.
Yes, you're right. A time.sleep(0.0001) right before the inner
result() call makes python-ldap hand over the CPU to the OS.
> while all:
>   while ldap_result[0] is None:
>     if (timeout>=0) and (time.time()-start_time>timeout):
>       self._ldap_call(self._l.abandon,msgid)
>       raise _ldap.TIMELIMIT_EXCEEDED(
>         "LDAP time limit (%d secs) exceeded." % (timeout)
>       )
>     ldap_result = self._ldap_call(self._l.result,msgid,0,0)
>     if ldap_result[0] is None:
>       time.sleep(.01)
>   if ldap_result[1] is None:
>     break
It's possible to make it somewhat simpler since we have a first
result() call before the while loops.
while all:
  while ldap_result[0] is None:
    if (timeout>=0) and (time.time()-start_time>timeout):
      self._ldap_call(self._l.abandon,msgid)
      raise _ldap.TIMELIMIT_EXCEEDED(
        "LDAP time limit (%d secs) exceeded." % (timeout)
      )
    time.sleep(0.0001)
    ldap_result = self._ldap_call(self._l.result,msgid,0,0)
  if ldap_result[1] is None:
    break
  all_results.extend(ldap_result[1])
  ldap_result = None,None
return all_results
> Alternatively, adding
> an (arbitrarily) long timeout in the call to ldap_result() also accomplishes
> the same thing, like:
>
> ldap_result = self._ldap_call(self._l.result,msgid,0,15 * 60)
Which is a bad idea in a multi-threaded environment.
> I haven't dug deep into this problem yet, to figure out if this is an
> OpenLDAP library problem, or a Python LDAP problem.
The main problem here is that the OpenLDAP libs are not
thread-safe. Therefore a module-wide lock is needed to serialize
all calls into OpenLDAP libs. But just wrapping ldap_result() with
locking around it would lead to blocking threads if one thread is
waiting for large search results. That's why I implemented
LDAPObject.result() in Python like it is today. Unfortunately we
cannot deal with waiting for data at the socket level, which leads
to a higher-level polling loop.
Everybody is encouraged to try the time.sleep(0.0001) hack and
look how it "feels" now. BTW: it makes the simple benchmark I
posted yesterday slower. It always depends what you wanna
optimize... ;-)
I suspect this still might not fully explain the
30s-vs.-immediate-results reports. There may be more issues with
other effects like Mauro described.
I guess the best solution would be if somebody with some spare
cycles digs into OpenLDAP's libldap_r to check if it's ready for
use with python-ldap. This would make it possible to do a finer
grained locking on LDAP connections instead of module-wide locking
allowing different threads with different LDAPObject instances to
run without blocking each other.
Ciao, Michael.
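
The trade-off described above (a module-wide lock around every library call, plus a Python-level polling loop that sleeps between polls) can be sketched roughly as follows. This is a stdlib-only illustration and not python-ldap's actual code: `_library_lock`, `PollTimeout`, `wait_for_result` and `make_fake_poll` are hypothetical names, and `poll` stands in for a non-blocking result() call.

```python
import threading
import time

# Stand-in for a module-wide lock that serializes calls into a
# non-thread-safe C library (hypothetical name).
_library_lock = threading.Lock()


class PollTimeout(Exception):
    """Raised when no result arrives within the time limit."""


def wait_for_result(poll, timeout=-1.0, pause=0.01):
    """Poll until poll() returns something other than None.

    poll is a hypothetical non-blocking callable. The lock is held only
    for the duration of each poll() call, so other threads can use the
    library between polls. A negative timeout means wait indefinitely.
    """
    start_time = time.time()
    while True:
        with _library_lock:
            result = poll()
        if result is not None:
            return result
        if timeout >= 0 and time.time() - start_time > timeout:
            raise PollTimeout("time limit (%s secs) exceeded" % timeout)
        # Sleep only after an empty poll, outside the lock, so the
        # process yields the CPU instead of spinning.
        time.sleep(pause)


def make_fake_poll(empty_polls, value):
    """Return a poll() stub that yields None empty_polls times, then value."""
    state = {"calls": 0}

    def poll():
        state["calls"] += 1
        return None if state["calls"] <= empty_polls else value

    return poll
```

For example, `wait_for_result(make_fake_poll(3, "entry"), timeout=1.0)` returns `"entry"` after three empty polls. Sleeping only after an empty poll follows the variant quoted from Leif; the length of the pause is a tuning choice that trades latency against CPU use.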
|
|
From: Derrick 'd. H. <dm...@dm...> - 2002-06-26 18:21:22
|
On Wed, Jun 26, 2002 at 01:26:41AM +0200, Michael Ströder wrote:
| Derrick 'dman' Hudson wrote:
| >I tried using python-ldap today (1.9.999.pre04-1, python 2.1.3-3), but
| >it is way too inefficient. A simple search that results in 2 entries
| >returned takes 30 seconds.
|
| If that would be normal I would not use python-ldap. Let's see.
:-).
| >Watching with top shows nearly 100% CPU
| >usage for the 30 seconds, on an otherwise idle Athlon XP 1800+.
| >OpenLDAP (2.0) is running on that same machine, however using
| >ldapsearch or exim yields immediate results.
|
| Frankly this is a very imprecise performance measurement.
True, but the ability to measure the time by my watch is rather, umm,
noticeable :-).
| > '(mailGroupLocalPart=%s)' % listname ,
|
| Is attribute mailGroupLocalPart indexed?
No (AFAIK).
| An equality index should be sufficient here.
| > I need to do some integration of LDAP and some web-based programs,
| > and would like to work with python, but this sort of performance
| > hit just won't be usable.
|
| As you might have noticed I'm doing web programming with
| python-ldap. ;-) I'm using web2ldap for maintaining and searching
| my personal address book and it's pretty responsive when using a
| fast browser. I'm also browsing very large data sets (>150000).
That's good. I installed web2ldap and ldapexplorer yesterday to
evaluate them. web2ldap didn't look very useful --
modification/timestamp attributes are shown for any entry.
Actually, trying it again today, but using a different
host as the LDAP server (not localhost), it looks much more useful (it
shows all the attributes). I wonder why that is.
| Just to give you a clue what I'm actually doing with python-ldap
| in a commercial pilot project: I'm scanning 170000 entries in far
| less than an hour (mainly just reading the uid attribute). I'm
| doing diffing whole entries at a rate of 50 entries/second (some
| other work with a SQL DB is involved here).
Interesting. The "People" node on our tree has 500 child nodes, each
of which has no children.
| The process runs on a P-III 450 Mhz box against a 4-CPU, 1GB RAM
| server running iPlanet Directory Server 5.1.
I notice that your ldap server is not on the same machine as
python-ldap. As Jens mentioned, and I subsequently discovered for
myself, running python-ldap on a separate host from the ldap server
doesn't have the performance problem. I only experience the problem
with python-ldap and slapd on the same machine.
| > I'm willing to help with the code, if you point me
| >to the interesting parts (and help me learn the C API of python and
| >openldap as I go).
|
| To find out the interesting parts one has to do proper performance
| measuring.
:-).

| And I would be really glad to see some *real* numbers. Please take
| this advice to produce numbers I can take seriously:
Ok, here we go.
| 1. Eliminate all disk access => turn off all logging.
Logging is (and was) off.
| 2. Eliminate caching issues => do many searches, throw away first result.
Right. My earlier, crude, measurements were repeatable every time.
If it was a caching issue I would expect the first to be slow but not
the latter ones.
| 3. Eliminate DB backend issues => only search RootDSE.
| (This hint by Kurt Zeilenga.)
DSE ... I don't think I've run across this TLA before.

| 4. Maximize performance impact of python-ldap => use faster LDAP server.
By "server" are you referring to hardware or software?
| I took some numbers on my P-III laptop against a locally installed
| Netscape Directory Server 4.16SP1 which is much faster than recent
| OpenLDAP.
Can I get Netscape Directory Server for Debian? Is it Free?
If not then there is very little possibility of using it.
| Test script is attached.
Thanks!
| There are three test cases especially for the guys who are blaming
| python-ldap for bad performance but are reconnecting to the LDAP
| server for each query. ;-)
Is it allowed to reconnect for each query if each query is run from a
separate process at disparate times? ;-)
Here is what I get :
Server :
Hardware :
Athlon XP 1800+ (1.5 GHz clock, I think)
256 MB DDR RAM
IDE disk (I don't know much on the specs, but it is
relatively new and fairly quick)
Software :
OpenLDAP 2.0.23 , ldbm backend
Debian woody/sid
Linux 2.4.18
Other Load :
light
Client 1 :
Hardware :
same
Software
(same)
OpenLDAP client library (v 2.0.23)
python 2.1.3
python-ldap 1.9.999.pre04
Other Load :
same
Client 2 :
Hardware :
Duron 750 (750 MHz clock)
256 MB PC133 SDRAM
Software
(same)
OpenLDAP client library (v 2.0.23)
python 2.1.3
python-ldap 1.9.999.pre04
Other Load :
light-moderate
Test 1 :
Client 1
*** Read the RootDSE on same connection
1719.493984 searches/second
*** Read the RootDSE on newly created connection without extra simple bind
30.879468 searches/second
*** Read the RootDSE on newly created connection with an extra simple bind
26.940053 searches/second
Watching with top shows the system working as hard as it could,
with the client eating most of the CPU and slapd using very very
little.
Test 2 :
Client 2
*** Read the RootDSE on same connection
895.992294 searches/second
*** Read the RootDSE on newly created connection without extra simple bind
161.131824 searches/second
*** Read the RootDSE on newly created connection with an extra simple bind
146.020832 searches/second
Watching both systems with top showed (CPU wise) the server not
even sweating while the client got a decent workout.
Test 3 :
Client 1 ,
without the async implementation of LDAPObject.result(),
directly wrapping the built-in C implementation:
*** Read the RootDSE on same connection
2039.669716 searches/second
*** Read the RootDSE on newly created connection without extra simple bind
279.827530 searches/second
*** Read the RootDSE on newly created connection with an extra simple bind
262.486823 searches/second
Watching with top shows _both_ python-ldap and slapd getting a
fair amount of CPU time, and the CPU was only running at ~50%
of its capacity (good things, IMO :-)).
Test 4 :
same as test 3 but with Client 2
*** Read the RootDSE on same connection
1069.199140 searches/second
*** Read the RootDSE on newly created connection without extra simple bind
163.548164 searches/second
*** Read the RootDSE on newly created connection with an extra simple bind
151.169328 searches/second
Similar observations in top, except that the server wasn't working
nearly as hard as in test3.
This seems to show that the difference in the result() method is more
significant/noticeable on a fast system than on a slow one.
| => I have yet to see some serious numbers proving the "30 seconds
| vs. immediate results".
If I add
index mailGroupLocalPart eq
to slapd.conf and restart the daemon, my script runs really fast on
"client1", but (incorrectly) doesn't return any results.
Using that script on "client2" yields these results (just for
comparison) :
real 0m0.147s
user 0m0.120s
sys 0m0.020s
This sort of time is quite acceptable :-).
Does this help?
Oh, BTW, some of the docs are missing (404) on the web site. For
example, go to
http://python-ldap.sourceforge.net/pydoc/ldap.html
and click on the "functions" link.
-D
-- 
The way of a fool seems right to him,
but a wise man listens to advice.
Proverbs 12:15
http://dman.ddts.net/~dman/
|
|
From: <mi...@st...> - 2002-06-26 07:01:41
|
Derrick 'dman' Hudson wrote:
> I tried using python-ldap today (1.9.999.pre04-1, python 2.1.3-3), but
> it is way too inefficient. A simple search that results in 2 entries
> returned takes 30 seconds.
If that would be normal I would not use python-ldap. Let's see.
> Watching with top shows nearly 100% CPU
> usage for the 30 seconds, on an otherwise idle Athlon XP 1800+.
> OpenLDAP (2.0) is running on that same machine, however using
> ldapsearch or exim yields immediate results.
Frankly this is a very imprecise performance measurement. As I
wrote in one of my former postings there might be performance
issues due to the nature of the async reimplementation of
LDAPObject.result(). And it might look bad that your CPU meter
shows 100% load. But I also suspect a few other effects if you
just say "30 seconds vs. immediate results".
> '(mailGroupLocalPart=%s)' % listname ,
Is attribute mailGroupLocalPart indexed? An equality index should
be sufficient here. Also note that OpenLDAP 2's ldbm backend
implements server-side DB caching.
> $ time ./maillist.py AITP
> dann , derrick
>
> real 0m36.555s
> user 0m32.990s
> sys 0m3.490s
>
> $ time ldapsearch -x -h localhost -b "ou=People,o=International Teams"
> -s one "(mailGroupLocalPart=aitp)" "uid" | grep uid:
> uid: dann
> uid: derrick
>
> real 0m0.033s
> user 0m0.000s
> sys 0m0.010s
>
> I think this rather clearly points to python-ldap as the culprit.
Hmm, I have some doubts. Let's see.
> I
> need to do some integration of LDAP and some web-based programs, and
> would like to work with python, but this sort of performance hit just
> won't be usable.
As you might have noticed I'm doing web programming with
python-ldap. ;-) I'm using web2ldap for maintaining and searching
my personal address book and it's pretty responsive when using a
fast browser. I'm also browsing very large data sets (>150000).
Just to give you a clue what I'm actually doing with python-ldap
in a commercial pilot project: I'm scanning 170000 entries in far
less than an hour (mainly just reading the uid attribute). I'm
doing diffing whole entries at a rate of 50 entries/second (some
other work with a SQL DB is involved here). The process runs on a
P-III 450 Mhz box against a 4-CPU, 1GB RAM server running iPlanet
Directory Server 5.1.
> I'm willing to help with the code, if you point me
> to the interesting parts (and help me learn the C API of python and
> openldap as I go).
To find out the interesting parts one has to do proper performance
measuring.
And I would be really glad to see some *real* numbers. Please take
this advice to produce numbers I can take seriously:
1. Eliminate all disk access => turn off all logging.
2. Eliminate caching issues => do many searches, throw away first
result.
Example: During my former experiments with large group entries
(>60000 member attribute values on OpenLDAP and 200000 on NS DS
4.16) I experienced a 30+ seconds interval when accessing the
group entry for the first time. Later the CompareRequest was done
much faster.
3. Eliminate DB backend issues => only search RootDSE.
(This hint by Kurt Zeilenga.)
4. Maximize performance impact of python-ldap => use faster LDAP
server.
I took some numbers on my P-III laptop against a locally installed
Netscape Directory Server 4.16SP1 which is much faster than recent
OpenLDAP. Test script is attached. There are three test cases
especially for the guys who are blaming python-ldap for bad
performance but are reconnecting to the LDAP server for each
query. ;-)
*** Read the RootDSE on same connection
279.630273 searches/second
*** Read the RootDSE on newly created connection without extra
simple bind
171.760053 searches/second
*** Read the RootDSE on newly created connection with an extra
simple bind
144.559940 searches/second
Now without the async implementation of LDAPObject.result()
directly wrapping the built-in C implementation:
*** Read the RootDSE on same connection
464.693726 searches/second
*** Read the RootDSE on newly created connection without extra
simple bind
248.286182 searches/second
*** Read the RootDSE on newly created connection with an extra
simple bind
202.098080 searches/second
The rates are obviously much higher (+40%..+65%), but note that we
did everything here to make python-ldap look really bad. Under
real-world conditions, with real DB backend activity, server
logging to disk and so on, the percentage gain will surely look
different.
=> I have yet to see some serious numbers proving the "30 seconds
vs. immediate results".
Ciao, Michael.
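
The measuring advice above (run many searches, throw away the first result, report searches/second) can be sketched as a small harness. This is a minimal illustration under stated assumptions, not the test script attached to the original posting: `searches_per_second` and `fake_search` are hypothetical names, and `fake_search` merely sleeps where a real RootDSE search would go.

```python
import time


def searches_per_second(operation, repetitions=100, warmup=1):
    """Return how many times per second operation() completes.

    The first warmup calls are executed but not timed, so that one-off
    costs such as cache fills do not distort the rate.
    """
    for _ in range(warmup):
        operation()
    start = time.perf_counter()
    for _ in range(repetitions):
        operation()
    elapsed = time.perf_counter() - start
    return repetitions / elapsed


def fake_search():
    """Hypothetical stand-in for one search; replace with a real query."""
    time.sleep(0.001)


if __name__ == "__main__":
    print("%f searches/second" % searches_per_second(fake_search))
```

Passing the operation in as a callable keeps the timing code identical across the three test cases (same connection, new connection, new connection with an extra bind); only the callable would change.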
|
|
From: <mi...@st...> - 2002-06-26 07:01:41
|
Jens Vagelpohl wrote:
> i have seen similar strange slowdowns **if i connect to localhost**.
> connecting to OpenLDAP on another host across the network was much
> faster (meaning: it worked at the expected speed). the slowdown never
> occurred on python-ldap 1.10.3. both were compiled against
> OpenLDAP2-libraries.
>
> so far i tended to assume that the problem might be a local
> configuration issue on my machine, though i haven't been able to find it
> yet.

Please, be more precise:

Which OS?
Which LDAP server and version?
Which kind of queries?
Logging?

Ciao, Michael.
|
|
From: Jens V. <je...@zo...> - 2002-06-25 20:02:50
|
leif,

i will be darned :)

your little hack does indeed work on my setup, too. a test script that
took 30 seconds to run now finishes in 2 seconds...

jens

On Tuesday, June 25, 2002, at 03:00 , Leif Hedstrom wrote:
> Well, I don't know if this is the same problem I had with Python LDAP v1.x,
> we haven't tested v2.x yet. But, the result() function in Python LDAP can go
> into a very tight poll loop, with extreme effects if the Python process is
> running on the same machines as the LDAP server. Python will almost
> completely starve slapd for any CPU time ...
>
> Adding a short sleep() in the polling loop of ldapobject.result() helps, a
> lot. Like
>
> while all:
>   while ldap_result[0] is None:
>     if (timeout>=0) and (time.time()-start_time>timeout):
>       self._ldap_call(self._l.abandon,msgid)
>       raise _ldap.TIMELIMIT_EXCEEDED(
>         "LDAP time limit (%d secs) exceeded." % (timeout)
>       )
>     ldap_result = self._ldap_call(self._l.result,msgid,0,0)
>     if ldap_result[0] is None:
>       time.sleep(.01)
>   if ldap_result[1] is None:
>     break
>
> (note the time.sleep() call if there was no result). Alternatively, adding
> an (arbitrarily) long timeout in the call to ldap_result() also accomplishes
> the same thing, like:
>
> ldap_result = self._ldap_call(self._l.result,msgid,0,15 * 60)
>
> I haven't dug deep into this problem yet, to figure out if this is an
> OpenLDAP library problem, or a Python LDAP problem. I just know that
> preventing ldapobject.result() from going into the tight polling loop solved
> our problems. :-)
>
> Cheers,
>
> -- Leif
>
|
|
From: Leif H. <lei...@pr...> - 2002-06-25 19:00:51
|
Well, I don't know if this is the same problem I had with Python LDAP v1.x,
we haven't tested v2.x yet. But, the result() function in Python LDAP can go
into a very tight poll loop, with extreme effects if the Python process is
running on the same machines as the LDAP server. Python will almost
completely starve slapd for any CPU time ...
Adding a short sleep() in the polling loop of ldapobject.result() helps, a
lot. Like
while all:
  while ldap_result[0] is None:
    if (timeout>=0) and (time.time()-start_time>timeout):
      self._ldap_call(self._l.abandon,msgid)
      raise _ldap.TIMELIMIT_EXCEEDED(
        "LDAP time limit (%d secs) exceeded." % (timeout)
      )
    ldap_result = self._ldap_call(self._l.result,msgid,0,0)
    if ldap_result[0] is None:
      time.sleep(.01)
  if ldap_result[1] is None:
    break
(note the time.sleep() call if there was no result). Alternatively, adding
an (arbitrarily) long timeout in the call to ldap_result() also accomplishes
the same thing, like:
ldap_result = self._ldap_call(self._l.result,msgid,0,15 * 60)
I haven't dug deep into this problem yet, to figure out if this is an
OpenLDAP library problem, or a Python LDAP problem. I just know that
preventing ldapobject.result() from going into the tight polling loop solved
our problems. :-)
Cheers,
-- Leif
|
|
From: Derrick 'd. H. <dm...@dm...> - 2002-06-25 17:20:43
|
On Tue, Jun 25, 2002 at 08:28:15AM -0400, Jens Vagelpohl wrote:
| i have seen similar strange slowdowns **if i connect to localhost**.
| connecting to OpenLDAP on another host across the network was much faster
| (meaning: it worked at the expected speed).

Actually, you're right. I never tried from a different host until you
said this. I presumed using the loopback interface would have the
least amount of delay due to eliminating network hardware from the
equation.

That certainly gives some (initial) improvement possibilities.

-D

-- 
A)bort, R)etry, B)ang it with a large hammer

http://dman.ddts.net/~dman/
|
|
From: Jens V. <je...@zo...> - 2002-06-25 12:28:23
|
i have seen similar strange slowdowns **if i connect to localhost**.
connecting to OpenLDAP on another host across the network was much faster
(meaning: it worked at the expected speed). the slowdown never occurred on
python-ldap 1.10.3. both were compiled against OpenLDAP2-libraries.
so far i tended to assume that the problem might be a local configuration
issue on my machine, though i haven't been able to find it yet.
jens
On Monday, June 24, 2002, at 10:04 , Derrick 'dman' Hudson wrote:
>
> I tried using python-ldap today (1.9.999.pre04-1, python 2.1.3-3), but
> it is way too inefficient. A simple search that results in 2 entries
> returned takes 30 seconds. Watching with top shows nearly 100% CPU
> usage for the 30 seconds, on an otherwise idle Athlon XP 1800+.
> OpenLDAP (2.0) is running on that same machine, however using
> ldapsearch or exim yields immediate results.
>
> $ cat maillist.py
> #!/usr/bin/python2.1
>
> import sys
> import ldap
>
> INIT = "ldap://localhost/"
> BASE = "ou=People,o=International Teams"
> TIMEOUT = 40
>
> listname = sys.argv[1]
> ldapconn = ldap.initialize( INIT )
> results = ldapconn.search_st( BASE , ldap.SCOPE_ONELEVEL ,
> '(mailGroupLocalPart=%s)' % listname ,
> attrlist = ("uid",) , timeout = TIMEOUT
> )
> names = [ attrs['uid'][0] for dn , attrs in results]
> print ' , '.join( names )
>
>
> $ time ./maillist.py AITP
> dann , derrick
>
> real 0m36.555s
> user 0m32.990s
> sys 0m3.490s
>
> $ time ldapsearch -x -h localhost -b "ou=People,o=International Teams"
> -s one "(mailGroupLocalPart=aitp)" "uid" | grep uid:
> uid: dann
> uid: derrick
>
> real 0m0.033s
> user 0m0.000s
> sys 0m0.010s
>
>
>
> I think this rather clearly points to python-ldap as the culprit. I
> need to do some integration of LDAP and some web-based programs, and
> would like to work with python, but this sort of performance hit just
> won't be usable. I'm willing to help with the code, if you point me
> to the interesting parts (and help me learn the C API of python and
> openldap as I go).
>
> I did a quick glance over the archives and read this thread :
> http://www.geocrawler.com/lists/3/SourceForge/1568/0/8932688/
>
> -D
>
|
|
From: Derrick 'd. H. <dm...@dm...> - 2002-06-25 01:54:12
|
I tried using python-ldap today (1.9.999.pre04-1, python 2.1.3-3), but
it is way too inefficient. A simple search that results in 2 entries
returned takes 30 seconds. Watching with top shows nearly 100% CPU
usage for the 30 seconds, on an otherwise idle Athlon XP 1800+.
OpenLDAP (2.0) is running on that same machine, however using
ldapsearch or exim yields immediate results.

$ cat maillist.py
#!/usr/bin/python2.1

import sys
import ldap

INIT = "ldap://localhost/"
BASE = "ou=People,o=International Teams"
TIMEOUT = 40

listname = sys.argv[1]
ldapconn = ldap.initialize( INIT )
results = ldapconn.search_st( BASE , ldap.SCOPE_ONELEVEL ,
    '(mailGroupLocalPart=%s)' % listname ,
    attrlist = ("uid",) , timeout = TIMEOUT
    )
names = [ attrs['uid'][0] for dn , attrs in results]
print ' , '.join( names )

$ time ./maillist.py AITP
dann , derrick

real 0m36.555s
user 0m32.990s
sys 0m3.490s

$ time ldapsearch -x -h localhost -b "ou=People,o=International Teams"
-s one "(mailGroupLocalPart=aitp)" "uid" | grep uid:
uid: dann
uid: derrick

real 0m0.033s
user 0m0.000s
sys 0m0.010s
I think this rather clearly points to python-ldap as the culprit. I
need to do some integration of LDAP and some web-based programs, and
would like to work with python, but this sort of performance hit just
won't be usable. I'm willing to help with the code, if you point me
to the interesting parts (and help me learn the C API of python and
openldap as I go).
I did a quick glance over the archives and read this thread :
http://www.geocrawler.com/lists/3/SourceForge/1568/0/8932688/
-D
-- 
How great is the love the Father has lavished on us,
that we should be called children of God!
1 John 3:1

http://dman.ddts.net/~dman/
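
For readers following the script above: the list comprehension relies on the search returning a list of (dn, attrs) tuples, where attrs maps each attribute name to a list of values. The sketch below replays that step on sample data, so it runs without an LDAP server. The DNs and the helper name `first_values` are made up for illustration, and newer python-ldap releases may return the values as bytes, in which case a decode step would be needed.

```python
# Sample data shaped like a search result: a list of (dn, attrs) tuples,
# where attrs maps each attribute name to a list of values.
# The DNs below are made up for illustration.
results = [
    ("uid=dann,ou=People,o=International Teams", {"uid": ["dann"]}),
    ("uid=derrick,ou=People,o=International Teams", {"uid": ["derrick"]}),
]


def first_values(results, attribute):
    """Return the first value of attribute from every entry that has it."""
    return [
        attrs[attribute][0]
        for _dn, attrs in results
        if attrs.get(attribute)
    ]


names = first_values(results, "uid")
print(" , ".join(names))
```

Unlike the original comprehension, this variant skips entries that lack the attribute instead of raising KeyError.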
|
|
From: <mi...@st...> - 2002-06-24 09:20:46
|
Mauro Cicognini wrote:
> I'm glad to announce that I've found the limiting factor for the 5-odd
> seconds delay during binds,

Note that with python-ldap 2.0.0pre04 ldap.initialize() is used in any
case which wraps ldap_initialize in the OpenLDAP libs.
ldap_initialize() has a slightly different behaviour than ldap_open().
AFAIK it just initializes the LDAP connection context but does not open
the LDAP connection. The LDAP connection is opened when doing the first
LDAPRequest - no matter which one.

Having said this I'd like to see your test script. You might wanna
check if the delay really happens during BindRequest or any first
LDAPRequest. Note that LDAPv3 does not require you to send a
BindRequest prior to other LDAPRequests.

> By sniffing network traffic I saw that it wasn't LDAP's fault per se. In
> fact, for some reason the current libraries (as opposed to the old UMich
> libs that I used in PythonLDAP 1.x) do a reverse-resolution on the
> server's IP address before attempting to bind, on both DNS _and_ NetBIOS
> (remember I'm dealing with Windows machines here).

Hmm, reverse lookups might make sense when using LDAP over SSL or LDAP
with StartTLS to cross-check the server's name with the CN attribute in
the subject DN of the server certificate. Since the OpenLDAP 1.x libs
did not have any support for SSL/TLS this might be an issue with
OpenLDAP 2.x libs. Just thoughts, not sure though...

> Note that I passed the LDAP server's address as a DNS name, and that
> the IP address was correctly resolved by my DNS server.

Can you please try to use the IP address directly and check if the same
behaviour happens?

> Anyhow, the
> client always tries to find the NetBIOS name of the server machine, and
> this was what caused the delay, since my LDAP server is behind a
> firewall which is configured to disallow NetBIOS queries (the client
> tries the query 3 times, then gives up). Once I let NetBIOS-ns through
> (UDP port 137) the delay disappeared.
> [..]
> I can tell that it isn't Windows' fault, at least:

I'm not sure if that conclusion is right.

1. I remember reverse lookup problems with various software on Windows.
(Therefore your observation is very interesting for other things too.)
2. I can't imagine why the OpenLDAP 2 libs should explicitly do reverse
NetBIOS lookups other than using a default parameter somewhere which
causes that.

Now how's the behaviour on Windows with the normal OpenLDAP tools
ldapsearch, etc.?

Ciao, Michael.
|
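One way to check whether the delay belongs to the BindRequest or simply to whichever LDAPRequest comes first is to time each step separately. The following is only a sketch in present-day Python 3, not the test script asked for in the thread: the helper names `time_steps` and `ldap_steps` are made up for illustration, and it assumes python-ldap is installed and that the server allows an anonymous read of the RootDSE.

```python
import time


def time_steps(steps):
    """Run each (label, callable) pair in order; return [(label, seconds)]."""
    timings = []
    for label, step in steps:
        start = time.perf_counter()
        step()
        timings.append((label, time.perf_counter() - start))
    return timings


def ldap_steps(uri, who, cred):
    """Build the three steps for one connection (hypothetical helper)."""
    import ldap  # python-ldap, imported lazily so time_steps works without it

    state = {}

    def init():
        # Only sets up the connection context; no network traffic expected yet.
        state["conn"] = ldap.initialize(uri)

    def first_request():
        # Anonymous RootDSE read: the first LDAPRequest sent on the wire.
        state["conn"].search_s("", ldap.SCOPE_BASE, "(objectClass=*)")

    def bind():
        state["conn"].simple_bind_s(who, cred)

    return [("initialize", init), ("first request", first_request), ("simple bind", bind)]
```

Calling `time_steps(ldap_steps(uri, who, cred))` with real values would show which step absorbs the wait. If the RootDSE read is slow and the later bind is fast, the delay is tied to opening the connection rather than to binding.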
|
From: Mauro C. <mci...@si...> - 2002-06-24 08:01:33
|
I'm glad to announce that I've found the limiting factor for the 5-odd
seconds delay during binds, which I was annoyed by, and that I was able
to remove it.

Therefore, it appears that the native Win32 (i.e., non-Cygwin) version
of PythonLDAP 2.0.0pre04 (linked against OpenLDAP 2.0 libs) is indeed a
workable replacement for PythonLDAP 1.x (under Windows, that is).

If anyone's interested, here's what I found.

By sniffing network traffic I saw that it wasn't LDAP's fault per se.
In fact, for some reason the current libraries (as opposed to the old
UMich libs that I used in PythonLDAP 1.x) do a reverse-resolution on the
server's IP address before attempting to bind, on both DNS _and_ NetBIOS
(remember I'm dealing with Windows machines here).

Note that I passed the LDAP server's address as a DNS name, and that
the IP address was correctly resolved by my DNS server. Anyhow, the
client always tries to find the NetBIOS name of the server machine, and
this was what caused the delay, since my LDAP server is behind a
firewall which is configured to disallow NetBIOS queries (the client
tries the query 3 times, then gives up). Once I let NetBIOS-ns through
(UDP port 137) the delay disappeared.

If anyone has a clue to what is causing this rather bizarre behavior in
the client, please let me know: I do think this is a misfeature, and I'd
like to #ifdef it away at least in my version of the libs. However, I
can't really tell who's doing this: it might be happening within SASL,
or within OpenLDAP, or in any of the other minor components that need
to be there when compiling.

I can tell that it isn't Windows' fault, at least: I'm sure of this
because I compiled, on the same machine (WinXP Pro using MSVC++ 6.0sp4),
both versions of PythonLDAP (old 1.x style linked with UMich libs, and
new 2.x linked with OpenLDAP libs); and the former isn't trying reverse
resolution, whereas the latter is.

If anyone's interested, I can post the compiled Win32 binaries somewhere
so that others may test them more thoroughly than myself.

Thanks everybody
Mauro
|
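The reverse-resolution cost described above can be measured outside of LDAP altogether. This is a rough sketch in present-day Python 3 using only the standard `socket` module; `timed` and `lookup_report` are names invented here. It times the operating system's resolver, which on Windows may also try NetBIOS name resolution, so a slow reverse lookup would be consistent with the observation but would not show which library triggers it.

```python
import socket
import time


def timed(func, *args):
    """Call func(*args); return (seconds, result), or (seconds, error) on OSError."""
    start = time.perf_counter()
    try:
        outcome = func(*args)
    except OSError as exc:  # socket.gaierror and socket.herror are subclasses
        outcome = exc
    return time.perf_counter() - start, outcome


def lookup_report(hostname):
    """Time the forward lookup, then the reverse lookup of the resolved address."""
    report = {"forward": timed(socket.gethostbyname, hostname)}
    address = report["forward"][1]
    if isinstance(address, str):  # skip the reverse step if the forward one failed
        report["reverse"] = timed(socket.gethostbyaddr, address)
    return report
```

Running `lookup_report()` with the LDAP server's DNS name, once with the firewall blocking UDP port 137 and once without, would indicate whether the multi-second wait shows up in plain name resolution.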
|
From: <mci...@si...> - 2002-06-17 10:20:20
|
Michael Ströder <michael@stroeder.com> writes:
> Mauro Cicognini wrote:
> >
> > However, I have seen that the searches don't show significant
> > differences with the old 1.x version.
>
> Can you provide numbers?

Hmmm, not really. It's just my feeling trying to do some simple
searches from the Python command line: I type the instruction, hit
return and '>>>' comes up immediately (i.e. no visible delay).
I could do some timing, but I don't have the time to set up the
environment now... sorry about this.
I'd be glad to supply sources and binaries to anyone interested,
though.

> > The single call that takes a lot is the bind step: when I ask a Python
> > script to bind it just seems to sit there and wait, more or less for 5
> > seconds (strange coincidence).
> >
> > Could all this be related to the SASL libraries?
>
> Hmm, no idea. I guess you are talking about a simple bind.

Yes, it's a simple bind: username and password.
I guess it's also the only one supported by Win32 without certificates
or the whole Kerberos shebang.

I really hope someone with SASL experience comes to our rescue.

Mauro
|
|
From: Anders K. <and...@ce...> - 2002-06-17 08:39:28
|
On Fri, 14 Jun 2002, Michael Ströder wrote:
> 2. You might also have upgraded the LDAP server to OpenLDAP 2
> (which does stricter checking) or changed the indexing
> configuration (which is a performance penalty when writing)?

Yes, it's linked against OpenLDAP 2. But this can't be the problem alone
since the bundled ldap-tools (ldapadd, ldapsearch..) work fast.

Thanks a lot for your answers and ideas! I will try to look deeper into
my environment and see if I can find the bottlenecks.

Regards,
--------------------------------------------------------------------------
Anders Karlsson               Email: and...@ce...
Cendio Systems AB             WWW: www.cendio.se
Teknikringen 3,               Voice: +46 - (0)13 - 21 46 00
SE-583 30 LINKÖPING, SWEDEN   Fax: +46 - (0)13 - 21 47 00
|
|
From: <mi...@st...> - 2002-06-14 18:06:03
|
Michael Ströder wrote:
> Anders Karlsson wrote:
>
> You could try a very simple result() implementation (tweak
> Lib/ldap/ldapobject.py):
>
>     def result(self,msgid=_ldap.RES_ANY,all=1,timeout=-1):
>         return self._ldap_call(self._l.result,msgid,all,timeout)
>

Just a warning: This is not meant as serious replacement. Just for
measuring the performance. Rest of LDAPObject class would have to
be tweaked too.

Ciao, Michael.
|
|
From: <mi...@st...> - 2002-06-14 18:05:54
|
Mauro Cicognini wrote:
>
> However, I have seen that the searches don't show significant
> differences with the old 1.x version.

Can you provide numbers?

> The single call that takes a lot is the bind step: when I ask a Python
> script to bind it just seems to sit there and wait, more or less for 5
> seconds (strange coincidence).
>
> Could all this be related to the SASL libraries?

Hmm, no idea. I guess you are talking about a simple bind.

Ciao, Michael.
|
|
From: Mauro C. <mci...@si...> - 2002-06-14 16:59:30
|
I have experienced significant delays, too, using the new version.
Since I have recently managed to compile the whole stuff under Win32, I
am doing some experimenting before releasing the binaries, and this is
actually blocking me a bit.

However, I have seen that the searches don't show significant
differences with the old 1.x version.
The single call that takes a lot is the bind step: when I ask a Python
script to bind it just seems to sit there and wait, more or less for 5
seconds (strange coincidence).

Could all this be related to the SASL libraries?

Mauro
|
|
From: <mi...@st...> - 2002-06-14 16:51:58
|
Anders Karlsson wrote:
>
> Since it was built for Python 1.x I upgraded to Python-LDAP 2.0.0pre04
> which worked fine, but considerably slower than the old version. A python
> program that changed some posts in the LDAP database and did two searches
> took about five seconds with 2.0.0pre04, compared to less than a second
> with 1.10alpha3.
>
> Why is there such big speed difference?
Hmm, there might be different issues here:
1. I changed the implementation of the synchronous methods *_s()
to avoid getting blocked in OpenLDAP 2 _s() functions. Especially
there's a reimplementation of result() in
ldap.ldapobject.LDAPObject which does the timeout handling itself.
The main reason is that the OpenLDAP libs are not thread-safe and
therefore a module-wide lock is used to serialize all calls into
the OpenLDAP libs.
(If someone manages to use OpenLDAP 2.1's libldap_r instead of
libldap this might help to reduce some of the locking.)
2. You might also have upgraded the LDAP server to OpenLDAP 2
(which does stricter checking) or changed the indexing
configuration (which is a performance penalty when writing)?
3. The OpenLDAP 2.x libs are slower. (python-ldap 1.x was linked
against OpenLDAP 1.x libs.)
Now for issue 1:
You could try a very simple result() implementation (tweak
Lib/ldap/ldapobject.py):
    def result(self,msgid=_ldap.RES_ANY,all=1,timeout=-1):
        return self._ldap_call(self._l.result,msgid,all,timeout)
I did some testing with this method implementation for measuring
the overhead of my non-blocking version. I read the RootDSE of
Netscape DS 4.16SP1 running on the same box. (Solely searching the
RootDSE is an appropriate method to eliminate the influence of
database backends and such.)
On average it seems to be approx. 65% faster to use this simple
method above. This is something to consider.
Hmm, but you are experiencing a performance difference which is
much higher (5 times slower as I understand your posting). I'd
really appreciate it if you could send more information about your
environment and *all* the changes you did. Performance measurement
numbers done under really the same conditions are also appreciated
of course.
Ciao, Michael.
|
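A comparison like the one described in the message above could be scripted along the following lines. This is a minimal sketch in present-day Python 3, not the script used for the measurement. The helper names `average_seconds` and `percent_time_saved` are hypothetical, and since the thread does not say how the 65% figure was computed, the second helper reports time saved relative to the baseline, which is only one plausible reading.

```python
import time


def average_seconds(func, runs=100):
    """Return the mean wall-clock time of func() over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        func()
    return (time.perf_counter() - start) / runs


def percent_time_saved(baseline, candidate):
    """Percentage of the baseline time that the candidate saves."""
    return 100.0 * (baseline - candidate) / baseline
```

With python-ldap, `func` might be something like `lambda: conn.search_s("", ldap.SCOPE_BASE, "(objectClass=*)")` to read the RootDSE, measured once with the stock result() and once with the simplified version, against the same server and under the same conditions.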
|
From: Anders K. <and...@ce...> - 2002-06-10 08:04:36
|
I recently started to use Python-LDAP in a project. Since I first found
the 1.10alpha3-version I used that. Everything worked fine and fast.

Since it was built for Python 1.x I upgraded to Python-LDAP 2.0.0pre04
which worked fine, but considerably slower than the old version. A python
program that changed some posts in the LDAP database and did two searches
took about five seconds with 2.0.0pre04, compared to less than a second
with 1.10alpha3.

Why is there such a big speed difference? Have I done something wrong or
is it a known matter? Which version do you recommend me to use?

Regards,
--------------------------------------------------------------------------
Anders Karlsson               Email: and...@ce...
Cendio Systems AB             WWW: www.cendio.se
Teknikringen 3,               Voice: +46 - (0)13 - 21 46 00
SE-583 30 LINKÖPING, SWEDEN   Fax: +46 - (0)13 - 21 47 00
|
|
From: <mi...@st...> - 2002-06-06 00:24:19
|
Mauro Cicognini wrote:
>
> That's not the end of the story, though. I need some help with
> distutils, and here's why.
> [..]
> However, to replicate this arrangement on other machines I'd have to
> tell distutils that a) the binary distribution needs to include
> LIBSASL.DLL, too; and b) when installing it, it needs to go somewhere on
> the PATH. I have no idea how.

I'd recommend just making this an installation requirement, exactly
like the OpenLDAP and SASL libs have to be in place in a Unix
environment *before* you build python-ldap.

Ciao, Michael.
|