Re: [Fwd: Re: [Cpu-users] Slow user creation]
From: Blake M. <bma...@pu...> - 2003-09-27 01:45:17
Here is the official report, now that I have the bugs worked out. Expect
this feature for the next release. Configuration file was mostly default,
except that USERGROUPS is set to YES, so for each useradd there is a
groupadd.

For those who don't want to read on, here is the summary: It is always a
bad idea to set GRAB_ALL_UIDS/GRAB_ALL_GIDS to false and RANDOM to false,
so I'm going to make the change that GRAB_ALL_UIDS/GRAB_ALL_GIDS is true by
default. It can be faster to set GRAB_ALL_UIDS/GRAB_ALL_GIDS to false and
RANDOM to true, but most people like some type of linear ordering. Note
that in some cases setting GRAB_ALL* to true was nearly 7 times faster than
setting it to false with RANDOM set to false.

------------------------------------------------------
First set of tests. Empty directory, adding 300 users.
------------------------------------------------------

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to true.
[crimson 70] tests > time ./useradd.pl 0 299
...
real    2m34.507s
user    0m1.560s
sys     0m1.100s

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to false and RANDOM set to false.
[crimson 71] tests > time ./useradd.pl 0 299
...
real    3m55.956s
user    0m8.850s
sys     0m3.740s

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to false and RANDOM set to true.
[crimson 71] tests > time ./useradd.pl 0 299
...
real    2m13.783s
user    0m1.870s
sys     0m0.850s

-----------------------------------------------------------------------------
Second set of tests. Directory contains 300 linear users (using uid/gids
from 100-400), adding 300 users.
-----------------------------------------------------------------------------

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to true.
[crimson 72] tests > time ./useradd.pl 299 599
...
real    2m49.624s
user    0m3.580s
sys     0m1.780s

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to false and RANDOM set to false.
[crimson 73] tests > time ./useradd.pl 299 599
...
real    14m45.240s
user    0m26.620s
sys     0m11.330s

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to false and RANDOM set to true.
[crimson 71] tests > time ./useradd.pl 299 599
...
real    2m33.843s
user    0m1.780s
sys     0m0.950s

-----------------------------------------------------------------------------
Third set of tests. Directory contains 300 random users. The uid/gid space
is 600 and half that is used, adding 300 users.
-----------------------------------------------------------------------------

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to true.
[crimson 72] tests > time ./useradd.pl 299 599
...
real    2m50.964s
user    0m3.570s
sys     0m1.750s

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to false and RANDOM set to false.
[crimson 73] tests > time ./useradd.pl 299 599
...
real    10m13.949s
user    0m18.030s
sys     0m8.410s

GRAB_ALL_UIDS and GRAB_ALL_GIDS set to false and RANDOM set to true.
[crimson 71] tests > time ./useradd.pl 299 599
...
real    2m48.698s
user    0m2.290s
sys     0m1.250s

-Blake

Whatchu talkin' 'bout, Willis?
> Blake Matheny wrote:
>
> >Well, there are a couple of issues here to consider. If there is a large
> >userbase, grabbing all of the IDs could be terribly slow.
> >
>
> I guess part of the problem is the sparseness of the ldap system for
> complex optimized queries. no select max(uid) from users :)
>
> I would think that even in this worst case getting a single large query
> should be an order of magnitude faster than several hundred individual
> queries and would scale a lot better. Right now the real speed problem
> I think is not the sorting it is the constant queries to the directory.
> Even a linear search of an unordered list that is grabbed from the
> directory in one query would be a lot faster than the current setup I think.
>
> Instead of a binary tree you could set up a linked list and do a
> quicksort. Not having a complex data structure should be ok since we are
> not "searching" the list, rather we are sorting it and grabbing the max
> value. This does not find holes in the list of UIDs but it does give you
> constantly incrementing UID and GID's.
> Of course if you want to find
> holes a binary tree insertion plus search for the lowest available gap
> would work too, perhaps that is a more general solution.
>
> In the interim I guess I will use the following technique, either
> manually or in a script which accomplishes the same thing externally to cpu.
>
> # cpu -w cat| awk -F : {'print $3'} | sort | uniq|tail -n 1
> 5545
> # cpu -w useradd -u 5546 -g 5546 -ptest test3
>
> btw Blake thank you for cpu, it is exactly the kind of tool that is
> needed to interface well with the use of LDAP in authentication. There
> are very few similar tools that I have found and none that work nearly
> as well.
>
> Thanks,
>
> Terrence
>

-- 
Blake Matheny           "... one of the main causes of the fall of the
bma...@pu...            Roman Empire was that, lacking zero, they had
http://www.mkfifo.net   no way to indicate successful termination of
http://ovmj.org/GNUnet/ their C programs." --Robert Firth
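
The idea discussed in the quoted message (fetch every ID in one query, then choose the next one locally) can be sketched in Python. This is only an illustration, not how cpu implements GRAB_ALL_UIDS. The function names and the `floor` and `ceiling` parameters are hypothetical, and the input is assumed to be passwd-style colon-separated lines with the UID in the third field, which is the same assumption the awk command above makes.

```python
from typing import Iterable, List


def uids_from_passwd_lines(lines: Iterable[str]) -> List[int]:
    """Extract the numeric third field from passwd-style lines.

    Assumption: each line looks like "name:x:uid:gid:...".
    Blank lines and lines without a numeric third field are skipped.
    """
    uids = []
    for line in lines:
        fields = line.strip().split(":")
        if len(fields) >= 3 and fields[2].isdigit():
            uids.append(int(fields[2]))
    return uids


def next_id_after_max(used_ids: Iterable[int], floor: int) -> int:
    """Return one more than the highest ID in use, but never less than floor.

    This gives constantly incrementing IDs and does not reuse holes.
    """
    highest = max(used_ids, default=floor - 1)
    return max(highest + 1, floor)


def lowest_free_id(used_ids: Iterable[int], floor: int, ceiling: int) -> int:
    """Return the lowest unused ID in [floor, ceiling], filling holes first."""
    used = set(used_ids)  # set membership keeps the scan linear overall
    for candidate in range(floor, ceiling + 1):
        if candidate not in used:
            return candidate
    raise ValueError("no free ID between %d and %d" % (floor, ceiling))


if __name__ == "__main__":
    # Made-up sample data standing in for the directory contents.
    sample = [
        "test1:x:100:100::/home/test1:/bin/sh",
        "test2:x:101:101::/home/test2:/bin/sh",
        "test3:x:5545:5545::/home/test3:/bin/sh",
    ]
    uids = uids_from_passwd_lines(sample)
    print(next_id_after_max(uids, floor=100))
    print(lowest_free_id(uids, floor=100, ceiling=60000))
```

Both functions compare the IDs as integers. A plain text sort, by contrast, would place "999" after "5545", so a numeric comparison is the safer choice when the IDs do not all have the same number of digits.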