|
From: Kristian V. <kri...@xe...> - 2005-01-11 16:46:58
Attachments:
sched.diff
|
I've played around with the scheduler today, to see if I could find a solution to the thread-starvation problem. I think I may have a workable solution, which, although it is not a complete rewrite as an O(1) algorithm, does fix the problem.

I've modified the scheduler to use "dynamic" priorities for each thread. The dynamic priority is calculated for each thread every time the thread's execution is completed (i.e. the thread is pre-empted). The algorithm simply works out the amount of CPU time being used as a percentage, based on the time the thread was created (its total life-time) and the total CPU time it has used so far. If the percentage is deemed to be too high, its dynamic priority is lowered, using a sliding scale. Well-behaved threads which use less than 50% CPU time are re-issued with their "static" priority, which is the priority given to the thread at creation.

The two scheduler functions which are involved in thread selection, add_thread_to_ready() and select_thread(), use the dynamic priority instead of the static priority when they are handling threads.

In this way CPU-bound threads which do not block can no longer choke off lower-priority threads, which may be blocking often. The dynamic priority is scaled back until the thread releases CPU time for other threads, and it slips back in the list of runnable threads.

In addition, the quantum of every thread is calculated, based on the thread's static priority. Higher-priority threads get a longer quantum. Essentially this allows heavily CPU bound tasks to still get more CPU time, but it is no longer at the total expense of the other threads.

An experimental kernel can be downloaded from http://www.liqwyd.com/syllable/kernel.zip  Unzip it and copy it to /system/  You might want to back up your original (known good) kernel first!

I have also written a very small test program which demonstrates the problem on the old kernel, and the behaviour of the new kernel. It is available from http://www.liqwyd.com/syllable/sched_test.zip  The source and binary are included. The binary takes one optional argument, "-s". With no arguments the program creates two threads: one to print info and the second which sits in a very tight loop, eating your CPU. The tight-loop thread is created at a priority of 150, i.e. realtime. If you pass -s to the test binary, the thread will sleep for one second per iteration.

If you run the test on an existing kernel, your machine will essentially become non-responsive. This is because the CPU bound thread gets 100% of the CPU at a higher priority than the appserver. If you run the test on the patched kernel, the thread will be dynamically scaled back. It will still claim as much CPU as it can, but it can no longer starve other threads. If you run the test on the patched kernel with the -s option, you will see that the dynamic priority is never altered, as the thread is "well behaved".

The kernel patch is attached. I would appreciate feedback and especially any improvements. I do not intend to apply these patches to the mainline kernel without plenty of testing, and preferably discussion. Note that the experimental kernel is the stock CVS kernel with the scheduler patches, but none of Arno's recent patches. No doubt we'll merge our patches soon.

--
Vanders
http://www.syllable.org
http://www.liqwyd.com
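[For illustration, the recalculation described above might be sketched roughly as follows. This is not the attached patch: the Thread_s field names other than tr_nDynamicPriority (tr_nPriority, tr_nCreationTime, tr_nCPUTime), the get_system_time() helper, and the exact thresholds and penalties are assumptions made for the example.]

static void recalculate_dynamic_priority( Thread_s *psThread )
{
    bigtime_t nLifeTime = get_system_time() - psThread->tr_nCreationTime;
    int nRunTotal;

    /* Start from the static priority given at thread creation */
    psThread->tr_nDynamicPriority = psThread->tr_nPriority;

    if( nLifeTime <= 0 )
        return;

    /* Percentage of its total lifetime the thread has spent on the CPU */
    nRunTotal = (int)( ( psThread->tr_nCPUTime * 100 ) / nLifeTime );

    /* Sliding scale of penalties for threads hogging the CPU */
    if( nRunTotal > 90 )
        psThread->tr_nDynamicPriority -= 50;
    else if( nRunTotal > 70 )
        psThread->tr_nDynamicPriority -= 35;
    else if( nRunTotal > 50 )
        psThread->tr_nDynamicPriority -= 15;

    /* Keep the result inside the priority space */
    if( psThread->tr_nDynamicPriority < -99 )
        psThread->tr_nDynamicPriority = -99;
}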
|
From: Daniel G. <da...@fp...> - 2005-01-11 17:31:34
|
On Tue, 2005-01-11 at 16:46 +0000, Kristian Vandervliet wrote:
> I've played around with the scheduler today, to see if I could
> find a solution to the thread-starvation problem. I think I may
> have a workable solution, which, although it is not a complete
> rewrite as an O(1) algorithm, does fix the problem.

I'll look at it in more detail a bit later, but in the meantime, could I suggest unified diffs (diff -u)? They're *much* easier to read as a diff than context diffs are...

Daniel
|
From: Daniel G. <da...@fp...> - 2005-01-11 17:59:22
Attachments:
sched-dfg.diff
|
Overall, this looks quite good. Nice and simple, and straightforward.
It's quite likely that the actual dynamic priority calculation and
quantum sizes could be tweaked for better fairness/whatever, but that's
easy to do. Maybe a macro/function to calculate quantum, so it's easy
to change globally?
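For the sake of discussion, such a macro might look roughly like this; the base value, floor and scale factor here are illustrative assumptions, not numbers taken from the patch:

/* Base quantum plus a bonus that grows with the static priority, so
 * higher-priority threads run longer before being pre-empted.  The
 * constants are assumed values for illustration only. */
#define DEFAULT_QUANTUM  3000LL    /* usecs, assumed */
#define MIN_QUANTUM      1000LL    /* usecs, assumed */

#define GET_QUANTUM( pri ) \
    ( ( DEFAULT_QUANTUM + (bigtime_t)(pri) * 25LL ) > MIN_QUANTUM ? \
      ( DEFAULT_QUANTUM + (bigtime_t)(pri) * 25LL ) : MIN_QUANTUM )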
One note: in the dynamic priority calculation, you check that the
reduction won't leave the priority space. I suspect what you want is
this:
if( nRunTotal > 50 && nRunTotal <= 70 )
{
    psThread->tr_nDynamicPriority -= 15;
}
else if( nRunTotal > 70 && nRunTotal <= 90 )
{
    psThread->tr_nDynamicPriority -= 35;
}
else if( nRunTotal > 90 )
{
    psThread->tr_nDynamicPriority -= 50;
}

if( psThread->tr_nDynamicPriority <= -100 )
{
    psThread->tr_nDynamicPriority = -99;
}
Otherwise, nothing will ever get below -85, and more well-behaved
processes will get penalized worse than badly behaved processes (they'll
get to -85, but the bad ones will only get to -50).
Attached are a few suggestions for changes. The diff is against current CVS,
so you won't be able to apply it on top of your changes. Basically, it
makes the above change, adds a macro for GET_QUANTUM, and adds
documentation for recalculate_dynamic_priority().
Daniel
On Tue, 2005-01-11 at 16:46 +0000, Kristian Vandervliet wrote:
> I've played around with the scheduler today, to see if I could
> find a solution to the thread-starvation problem. I think I may
> have a workable solution, which, although it is not a complete
> rewrite as an O(1) algorithm, does fix the problem.
>
> I've modified the scheduler to use "dynamic" priorities for each
> thread. The dynamic priority is calculated for each thread
> every time the thread's execution is completed (i.e. the thread is
> pre-empted). The algorithm simply works out the amount of CPU
> time being used as a percentage, based on the time the thread
> was created (its total life-time) and the total CPU time it has
> used so far. If the percentage is deemed to be too high, its
> dynamic priority is lowered, using a sliding scale. Well-behaved
> threads which use less than 50% CPU time are re-issued with their
> "static" priority, which is the priority given to the thread at
> creation.
>
> The two scheduler functions which are involved in thread selection,
> add_thread_to_ready() and select_thread(), use the dynamic priority
> instead of the static priority when they are handling threads.
>
> In this way CPU-bound threads which do not block can no longer
> choke off lower-priority threads, which may be blocking often. The
> dynamic priority is scaled back until the thread releases CPU time
> for other threads, and it slips back in the list of runnable threads.
>
> In addition, the quantum of every thread is calculated, based on
> the thread's static priority. Higher-priority threads get a longer
> quantum. Essentially this allows heavily CPU bound tasks to still
> get more CPU time, but it is no longer at the total expense of the
> other threads.
>
> An experimental kernel can be downloaded from
> http://www.liqwyd.com/syllable/kernel.zip Unzip it and copy it to
> /system/ You might want to back up your original (known good)
> kernel first!
>
> I have also written a very small test program which demonstrates
> the problem on the old kernel, and the behaviour of the new kernel.
> It is available from http://www.liqwyd.com/syllable/sched_test.zip
> The source and binary are included. The binary takes one optional
> argument, "-s". With no arguments the program creates two threads:
> one to print info and the second which sits in a very tight loop,
> eating your CPU. The tight-loop thread is created at a priority of
> 150, i.e. realtime. If you pass -s to the test binary, the thread
> will sleep for one second per iteration.
>
> If you run the test on an existing kernel, your machine will
> essentially become non-responsive. This is because the CPU bound
> thread gets 100% of the CPU at a higher priority than the appserver.
> If you run the test on the patched kernel, the thread will be
> dynamically scaled back. It will still claim as much CPU as it
> can, but it can no longer starve other threads. If you run the test
> on the patched kernel with the -s option, you will see that the
> dynamic priority is never altered, as the thread is "well behaved".
>
> The kernel patch is attached. I would appreciate feedback and
> especially any improvements. I do not intend to apply these patches
> to the mainline kernel without plenty of testing, and preferably
> discussion. Note that the experimental kernel is the stock CVS
> kernel with the scheduler patches, but none of Arno's recent patches.
> No doubt we'll merge our patches soon.
>
> --
> Vanders
> http://www.syllable.org
> http://www.liqwyd.com
>
>
>
|
|
From: Kristian V. D. V. <va...@li...> - 2005-01-11 21:31:26
|
On Tuesday 11 January 2005 17:59, Daniel Gryniewicz wrote:
> Overall, this looks quite good. Nice and simple, and straightforward.
> It's quite likely that the actual dynamic priority calculation and
> quantum sizes could be tweaked for better fairness/whatever, but that's
> easy to do. Maybe a macro/function to calculate quantum, so it's easy
> to change globally?
>
> One note: in the dynamic priority calculation, you check that the
> reduction won't leave the priority space. I suspect what you want is
> this:
>
> <snip>
>
> Otherwise, nothing will ever get below -85, and more well-behaved
> processes will get penalized worse than badly behaved processes (they'll
> get to -85, but the bad ones will only get to -50).

Sounds like a good idea to me.

> Attached are a few suggestions for changes. The diff is against current CVS,
> so you won't be able to apply it on top of your changes. Basically, it
> makes the above change, adds a macro for GET_QUANTUM, and adds
> documentation for recalculate_dynamic_priority().

Thanks.

One change I've considered is, rather than trying to calculate the CPU time over the total life of the thread, to keep a count of the number of times a thread runs to the end of its quantum. If the quantum expires the counter increments by one. If the thread blocks before the quantum expires the counter is decremented by one. The penalty would then be applied according to the counter value. Threads with a count of five would be scaled back by -10, a count of ten by -20, fifteen by -30, etc. (The real numbers would probably require some tuning.) The advantage here is that the calculation is much easier. It also solves the problem where threads which have spent a long time idle suddenly shoot to 100% CPU usage. Under the current calculation, based on the total life of the thread, the CPU usage would still be calculated as a low percentage and no penalty imposed. Under the new scheme a thread at 100% CPU would quickly accumulate a high count, and be penalized.

I'll apply your patches tomorrow and I'll try out my counter scheme and let you all know how that works out, too.

--
Vanders
http://www.syllable.org
http://www.liqwyd.com
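[As a rough sketch of how the counter scheme described above could drive the penalty - the tr_nHogCount field, the hook points and the clamp value are invented for illustration and are not taken from any actual patch:]

/* Called when a thread uses up its whole quantum without blocking */
static void quantum_expired( Thread_s *psThread )
{
    psThread->tr_nHogCount++;
}

/* Called when a thread blocks before its quantum is used up */
static void thread_blocked( Thread_s *psThread )
{
    if( psThread->tr_nHogCount > 0 )
        psThread->tr_nHogCount--;
}

/* Penalty grows with the count: -10 at five, -20 at ten, -30 at fifteen... */
static void recalculate_dynamic_priority( Thread_s *psThread )
{
    int nPenalty = ( psThread->tr_nHogCount / 5 ) * 10;

    psThread->tr_nDynamicPriority = psThread->tr_nPriority - nPenalty;
    if( psThread->tr_nDynamicPriority < -99 )
        psThread->tr_nDynamicPriority = -99;
}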
|
From: Daniel G. <da...@fp...> - 2005-01-12 00:49:20
|
On Tue, 2005-01-11 at 21:32 +0000, Kristian Van Der Vliet wrote:
> One change I've considered is, rather than trying to calculate the CPU time
> over the total life of the thread, to keep a count of the number of times a
> thread runs to the end of its quantum. If the quantum expires the counter
> increments by one. If the thread blocks before the quantum expires the
> counter is decremented by one. The penalty would then be applied according
> to the counter value. Threads with a count of five would be scaled back by
> -10, a count of ten by -20, fifteen by -30, etc. (The real numbers would
> probably require some tuning.) The advantage here is that the calculation is
> much easier. It also solves the problem where threads which have spent a
> long time idle suddenly shoot to 100% CPU usage. Under the current
> calculation, based on the total life of the thread, the CPU usage would still
> be calculated as a low percentage and no penalty imposed. Under the new
> scheme a thread at 100% CPU would quickly accumulate a high count, and be
> penalized.

Yes, that sounds generally better. Another possibility is a decaying average, which ends up being very similar to what you want to do.

In general, the current method you used will help in the normal case of a run-away thread. It's much more unlikely, in my opinion, for a thread to sleep for a long time and then go berserk. :)

Still, it'll likely be an improvement. I'll try and get the kernel built on my box here and get some ideas and testing in.

> I'll apply your patches tomorrow and I'll try out my counter scheme and let
> you all know how that works out, too.

Thanks.

Daniel
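[For comparison, a decaying average of recent CPU usage might be maintained along these lines - a sketch only; the tr_nRecentUsage field, the update hook and the 3/4 weighting are assumptions, not code from either patch:]

/* Called at the end of each scheduling interval with the percentage of
 * that interval the thread spent running.  Old behaviour decays away
 * instead of counting against the thread forever. */
static void update_recent_usage( Thread_s *psThread, int nUsedPct )
{
    psThread->tr_nRecentUsage =
        ( psThread->tr_nRecentUsage * 3 + nUsedPct ) / 4;
}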
|
From: Anthony J. <ant...@bt...> - 2005-01-12 20:47:53
|
> One change I've considered is, rather than trying to calculate the CPU time
> over the total life of the thread, to keep a count of the number of
> times a thread runs to the end of its quantum.

(Knowing you've now attempted this, and had an issue, so maybe you're considering leaving that sort of idea...)

Recently at work, under Windows XP, I've been suffering from a process that 99.9% of the time is sleeping, but about twice a day it kicks in and uses 100% CPU for about 2 mins, and then goes back to sleep. Unfortunately it starves most of the other processes, making my machine practically unusable while it's happening. Note that the process is constantly running - its lifespan is since the last time the machine was rebooted.

With a prioritisation based on time since the process/thread started, a process like this would probably tend to starve things 'cos over the whole lifespan of the thread it would appear well behaved, even though it sometimes isn't.

I suspect a strategy similar to the one above, or some decaying-type algorithm, would handle this scenario much better.

Just some food for thought.

Anthony
|
From: Jarret R <ja...@le...> - 2005-01-13 03:51:57
|
Anthony Jacques wrote:
> Recently at work, under Windows XP, I've been suffering from a process
> that 99.9% of the time is sleeping, but about twice a day it kicks in
> and uses 100% CPU for about 2 mins, and then goes back to sleep.
> Unfortunately it starves most of the other processes, making my machine
> practically unusable while it's happening. Note that the process is
> constantly running - its lifespan is since the last time the machine was
> rebooted.

As a side note, this is probably a virus or spyware. Download Ad-Aware, Spybot and AVG to get rid of it :). Unless you know what the process is, of course.

Jarret
|
From: Anthony J. <ant...@bt...> - 2005-01-13 08:51:26
|
> As a side note, this is probably a virus or spyware. Download Ad-Aware,
> Spybot and AVG to get rid of it :). Unless you know what the process is,
> of course.

Cheers, although I am pretty sure the process is our group policies being reapplied - the process is services.exe, and checking the system event log shows a log entry about a service reapplying the group policies at exactly the times it locks up.

Prior to our admins fixing a configuration error relating to group policies it happened much more frequently, and logged an error - at least it believes it succeeds at the moment (I'm sure there's probably still something wrong, though).

Anthony
|
From: Kristian V. D. V. <va...@li...> - 2005-01-11 18:54:53
|
On Tuesday 11 January 2005 17:31, Daniel Gryniewicz wrote:
> On Tue, 2005-01-11 at 16:46 +0000, Kristian Vandervliet wrote:
> > I've played around with the scheduler today, to see if I could
> > find a solution to the thread-starvation problem. I think I may
> > have a workable solution, which, although it is not a complete
> > rewrite as an O(1) algorithm, does fix the problem.
>
> I'll look at it in more detail a bit later, but in the meantime, could I
> suggest unified diffs (diff -u)? They're *much* easier to read as a
> diff than context diffs are...

Sorry about that. The diff was just a quick job to give everyone else something to check over.

--
Vanders
http://www.syllable.org
http://www.liqwyd.com
|
From: Daniel G. <da...@fp...> - 2005-01-11 18:59:41
|
On Tue, 2005-01-11 at 18:55 +0000, Kristian Van Der Vliet wrote:
> On Tuesday 11 January 2005 17:31, Daniel Gryniewicz wrote:
> > On Tue, 2005-01-11 at 16:46 +0000, Kristian Vandervliet wrote:
> > > I've played around with the scheduler today, to see if I could
> > > find a solution to the thread-starvation problem. I think I may
> > > have a workable solution, which, although it is not a complete
> > > rewrite as an O(1) algorithm, does fix the problem.
> >
> > I'll look at it in more detail a bit later, but in the meantime, could I
> > suggest unified diffs (diff -u)? They're *much* easier to read as a
> > diff than context diffs are...
>
> Sorry about that. The diff was just a quick job to give everyone else
> something to check over.

That's okay. I just applied it to my cvs checkout and re-diffed. :)

Daniel
|
From: Kristian V. D. V. <va...@li...> - 2005-01-12 19:25:45
|
On Wednesday 12 January 2005 00:49, Daniel Gryniewicz wrote:
> On Tue, 2005-01-11 at 21:32 +0000, Kristian Van Der Vliet wrote:
>> One change I've considered is, rather than trying to calculate the CPU
>> time over the total life of the thread, to keep a count of the number
>> of times a thread runs to the end of its quantum. If the quantum
>> expires the counter increments by one. If the thread blocks before the
>> quantum expires the counter is decremented by one. The penalty would
>> then be applied according to the counter value. Threads with a count of
>> five would be scaled back by -10, a count of ten by -20, fifteen by -30,
>> etc. (The real numbers would probably require some tuning.) The advantage
>> here is that the calculation is much easier. It also solves the problem
>> where threads which have spent a long time idle suddenly shoot to 100%
>> CPU usage. Under the current calculation, based on the total life of the
>> thread, the CPU usage would still be calculated as a low percentage and no
>> penalty imposed. Under the new scheme a thread at 100% CPU would quickly
>> accumulate a high count, and be penalized.
>
> Yes, that sounds generally better. Another possibility is a decaying
> average, which ends up being very similar to what you want to do.
>
> In general, the current method you used will help in the normal case of
> a run-away thread. It's much more unlikely, in my opinion, for a thread
> to sleep for a long time and then go berserk. :)
>
> Still, it'll likely be an improvement. I'll try and get the kernel
> built on my box here and get some ideas and testing in.

This scheme turned out to be tricky to implement, with no real benefit over the current scheme. It suffered from a "ping-pong" problem where the dynamic priority would ping-pong between two values as the algorithm fought with itself. Certain threads were also very susceptible to what appeared to be priority inversion; e.g. the desktop thread at login would use a large amount of CPU time, get down to a certain low priority and then hang, likely deadlocked with another thread. The testcase showed less of an improvement also, so the net benefit was smaller.

I didn't get more testing done as I really need to rebuild my machine; it has remnants of Glibc 2.3.3 testing on it which are playing havoc. Once I've got it back to a clean state I'll get back to fiddling with some different ideas.

--
Vanders
http://www.syllable.org
http://www.liqwyd.com
|
From: Daniel G. <da...@fp...> - 2005-01-12 22:54:25
|
On Wed, 2005-01-12 at 19:26 +0000, Kristian Van Der Vliet wrote:
> On Wednesday 12 January 2005 00:49, Daniel Gryniewicz wrote:
>
> <snip>
>
> This scheme turned out to be tricky to implement, with no real benefit over
> the current scheme. It suffered from a "ping-pong" problem where the dynamic
> priority would ping-pong between two values as the algorithm fought with
> itself. Certain threads were also very susceptible to what appeared to be
> priority inversion; e.g. the desktop thread at login would use a large amount
> of CPU time, get down to a certain low priority and then hang, likely
> deadlocked with another thread. The testcase showed less of an improvement
> also, so the net benefit was smaller.
>
> I didn't get more testing done as I really need to rebuild my machine; it has
> remnants of Glibc 2.3.3 testing on it which are playing havoc. Once I've got
> it back to a clean state I'll get back to fiddling with some different ideas.

Okay, I've knocked up an attempt at this too, but not yet tested it. (It takes a bloody long time to check out CVS on Syllable... I guess we need to do some serious speedups on AFS.) I'll play around with it a bit when I get home, and see if I have problems too.

Daniel
|
From: Daniel G. <da...@fp...> - 2005-01-13 03:44:35
|
On Wed, 2005-01-12 at 19:26 +0000, Kristian Van Der Vliet wrote:
> This scheme turned out to be tricky to implement, with no real benefit over
> the current scheme. It suffered from a "ping-pong" problem where the dynamic
> priority would ping-pong between two values as the algorithm fought with
> itself. Certain threads were also very susceptible to what appeared to be
> priority inversion; e.g. the desktop thread at login would use a large amount
> of CPU time, get down to a certain low priority and then hang, likely
> deadlocked with another thread. The testcase showed less of an improvement
> also, so the net benefit was smaller.
>
> I didn't get more testing done as I really need to rebuild my machine; it has
> remnants of Glibc 2.3.3 testing on it which are playing havoc. Once I've got
> it back to a clean state I'll get back to fiddling with some different ideas.

So, I fared a bit better than you did. I could log in and use my box normally, as far as I could tell. However, if I tried to run a sync command, the box locked up, with nothing being printed in the kernel log. This was repeatable: sync -> lockup. Of course, neither the original kernel nor your hacked-up one caused such a lockup. Back to the drawing board, I guess. (If anyone's interested, I can post my patches...)

On a side note, has anyone gotten sshd working? It would make development, if not testing, easier.

Daniel