It seems to me to be a classic tradeoff. On one hand you have a best case
that is most efficient on the hardware (#processes == #processors), and
on the other hand you have some extra "stuff" you need to do so that the
case where the hardware is being abused degrades reasonably.
Unfortunately, most benchmarks tend to show the best case, and those are
the numbers that get touted. The real world sees the hardware in an
abused state at least as often as it sees the best case.
So... is there some way to "measure" degradation? That's a slightly
different slant on scaling. Are there any good benchmarks that highlight
performance degrading gracefully (rather than dropping off sharply)?
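To make that concrete, below is the sort of micro-benchmark I have in
mind (just a sketch, untested, names made up, error handling mostly
elided): fork N CPU-bound workers, let them run for a fixed wall-clock
window, and plot aggregate work completed against N. If the scheduler
degrades gracefully, total throughput should stay roughly flat as N
climbs past the processor count; a sharp drop in the curve is the cliff.

/* degrade.c -- sketch of a scheduler-degradation micro-benchmark.
 * Fork N CPU-bound workers, run them for a fixed window, and report
 * total work units completed.  Build: cc -O2 -o degrade degrade.c
 * Usage: ./degrade <nworkers> <seconds>
 */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

static volatile sig_atomic_t stop;

static void on_alarm(int sig)
{
    (void)sig;
    stop = 1;
}

int main(int argc, char **argv)
{
    int n = (argc > 1) ? atoi(argv[1]) : 1;
    int secs = (argc > 2) ? atoi(argv[2]) : 10;
    int fds[2], i;
    unsigned long iters, total = 0;

    if (pipe(fds) < 0) {
        perror("pipe");
        return 1;
    }

    for (i = 0; i < n; i++) {
        if (fork() == 0) {              /* worker child */
            volatile unsigned long j;

            signal(SIGALRM, on_alarm);
            alarm(secs);
            iters = 0;
            while (!stop) {
                for (j = 0; j < 100000; j++)
                    ;                   /* one unit of pure CPU work */
                iters++;
            }
            /* one small write per child; atomic under PIPE_BUF */
            write(fds[1], &iters, sizeof iters);
            _exit(0);
        }
    }

    for (i = 0; i < n; i++) {           /* collect counts, reap kids */
        if (read(fds[0], &iters, sizeof iters) == sizeof iters)
            total += iters;
        wait(NULL);
    }
    printf("%d workers over %d s: %lu units (%.1f units/s per worker)\n",
           n, secs, total, (double)total / secs / n);
    return 0;
}

Run it at N = #CPUs, then 2x, 4x, 10x that; the shape of the total
units/s curve is the degradation measurement. Printing each child's
count separately would also show whether the pain is spread fairly or
lumped onto a few starved workers.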
On Wed, Jun 20, 2001 at 10:40:51AM -0700, Gerrit Huizenga wrote:
> > Any comments, anybody? Particularly on:
> > > A common request is to modify the Linux scheduler to better handle
> > > large numbers of running processes/threads. This is always rejected by
> > > the kernel developer community because it is, frankly, stupid to have
> > > large numbers of threads.
> The core (and obviously correct) part of this argument is that a machine
> has a mostly fixed number of CPU cycles to offer, and a single thread per
> CPU could be created to use all those cycles. Or, at the very least, if
> you have to use multiple threads of control (e.g. tasks/processes/threads),
> the total run queue length over the life of the system should average <= 1
> per CPU. A machine with a sustained run queue length of more than 1 per
> CPU is a sign that there isn't enough CPU on the machine to complete your
> workload. (E.g. four always-runnable threads on a two-CPU box is a
> sustained run queue of 2 per CPU, and each thread sees at most half a CPU.)
>
> Great. But what this doesn't account for is the bursty traffic that
> occurs in many real-world situations. Long ago I used to work at a
> university with single- and dual-processor Vaxen. Over a 24-hour period,
> the sustained load average was about 0.2 on most days, and about 0.8 to
> 1.0 near the end of the semester. But every hour, nearly on the hour,
> the load average spiked for 5 minutes to about 40. It turned out that
> students would get out of class, go to the computer, and log in
> simultaneously. Well, we *wanted*
> them all to be able to log in, and once they got going, the arrival rate
> of new processes was sufficiently random that the load average returned to
> reasonable values.
>
> The same is true in most web serving and database applications. Look at
> the number of people who post about getting /.'ed and having their
> machine go down or become unreasonably slow. The Linux servers are
> suffering from a bad scheduler at that point. The load average spikes
> because, in some cases, a process or task or thread is created per
> connection context. Or in a database application, when a new shipment of
> bolts is entered into the database, everyone who needs new bolts is
> notified, or they hear about it and request their allocations. A handy
> way of organizing the database
> is one context thread per user. Sure, this could be done as a finite
> state machine in a single thread, but if you want to do asynchronous I/O,
> maintain access rights, remember context for a user, etc., a thread is
> a really cool working model.
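For concreteness, the thread-per-user shape described above looks
roughly like the sketch below (POSIX threads; the port number and the
echo "work" are stand-ins, and error handling is elided). Build with
-lpthread.

/* Hypothetical sketch of the thread-per-user model: one thread per
 * accepted connection, so per-user context (credentials, cursors,
 * pending work) lives naturally on that thread's stack.
 */
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

struct session {
    int fd;             /* the user's connection */
    /* access rights, prepared statements, etc. would hang off here */
};

static void *serve_user(void *arg)
{
    struct session *s = arg;
    char buf[512];
    ssize_t n;

    /* Blocking I/O is fine here: a block stalls only this user's
     * thread, which is the whole appeal over a single-threaded FSM. */
    while ((n = read(s->fd, buf, sizeof buf)) > 0)
        write(s->fd, buf, n);   /* stand-in for real per-user work */

    close(s->fd);
    free(s);
    return NULL;
}

int main(void)
{
    struct sockaddr_in addr;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(4000);            /* arbitrary port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, 128);

    for (;;) {
        struct session *s = malloc(sizeof *s);
        pthread_t tid;

        s->fd = accept(lfd, NULL, NULL);
        pthread_create(&tid, NULL, serve_user, s);
        pthread_detach(tid);    /* thread cleans up after itself */
    }
}

The single-threaded FSM alternative wins on memory and puts no pressure
on the run queue, but every piece of per-user state that sits implicitly
on a thread's stack here would have to be tracked by hand instead.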
> Anyway, I think the thing that the scheduling neophytes keep
> overlooking is the real-world arrival rate of tasks/processes/work.
> The arrival rate on a desktop is pretty constant (how many mouse clicks
> can you make a second?) but on a server, the arrival rate is somewhat
> random with bursts, and the server should be able to handle that
> arrival rate smoothly, rather than rapidly degrading. And the evidence
> of these problems is easy to find out there (getting /.'d being a great
> example) if they'd care to look.
John Wright (john@...)