From: Matthew D. <col...@us...> - 2004-10-12 21:26:52
On Tue, 2004-10-12 at 01:50, Simon Derr wrote:
> > One of the cool thing about using sched_domains as your partitioning
> > element is that in reality, tasks run on *CPUs*, not *domains*.  So if
> > you have threads 'a1' & 'a2' running on CPUs 0 & 1 (small job 'a') and
> > threads 'b1' & 'b2' running on CPUs 2 & 3 (small job 'b'), you can
> > suspend threads a1, a2, b1 & b2 and remove the domains they were running
> > in to allow job A (big job with threads A1, A2, A3, & A4) to run on the
> > larger 4 CPU domain.  When you then suspend A1-A4 again to allow the
> > smaller jobs to proceed, you can pretty trivially create the 2 CPU
> > domains underneath the 4 CPU domain and resume the jobs.  Those jobs (a
> > & b) have been suspended on the CPUs they were originally running on,
> > and thus will resume on the same CPUs without any extra effort.  They
> > will simply run on those CPUs, and at load balance time, the domains
> > attached to those CPUs will be consulted to determine where the tasks
> > can be relocated to if there is a heavy load.  The domains will tell the
> > scheduler that the tasks cannot be relocated outside the 2 CPUs in each
> > respective domain.  Viola! (sorta ;)
> Voilà ;-)

hehe...  My French spelling obviously isn't quite up to par. ;)

> I agree that this looks really smooth from a scheduler point of view.
>
> From a user point of view, remains the issue of suspending the tasks:
> -find which tasks to suspend : how do you know that job 'a' consists
> exactly of 'a1' and 'a2'
> -suspend them (btw, how do you achieve this ? kill -STOP ?)
>
>
> I've been away from my mail and still trying to catch up, nevermind if the
> above does not makes sense to you.
>
> Simon.

Paul didn't go into specifics about how to suspend the job, so neither
did I.  Sending SIGSTOP & SIGCONT should work, as you mention...  Those
are implementation details which really aren't *that* important to the
discussion.  We're still trying to figure out the overall framework and
API to work with, so which method of suspending a thread we'll
eventually use can be tackled down the road. :)

-Matt
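
For reference, the SIGSTOP/SIGCONT approach mentioned above could look
roughly like the sketch below.  It is only an illustration, not anything
proposed in this thread: it assumes a job's tasks all live in one process
group (one possible answer to "which tasks belong to job 'a'", but purely
an assumption here), it is Linux-only because it reads /proc, and the
names suspend_job(), resume_job(), wait_until() and demo() are made up
for the example.

```python
import os
import signal
import subprocess
import sys
import time


def proc_state(pid):
    """Return the one-letter task state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # The state letter is the first field after the ")" closing the command name.
        return f.read().rsplit(")", 1)[1].split()[0]


def wait_until(pid, predicate, timeout=5.0):
    """Poll until predicate(state) holds, since signal delivery is asynchronous."""
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while not predicate(state) and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)
    return state


def suspend_job(pgid):
    """Stop every process in the group; SIGSTOP cannot be caught or ignored."""
    os.killpg(pgid, signal.SIGSTOP)


def resume_job(pgid):
    """Let every process in the group run again."""
    os.killpg(pgid, signal.SIGCONT)


def demo():
    # Hypothetical stand-in for small job 'a': one sleeping child in its own group.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,  # new session, so the child leads its own process group
    )
    try:
        pgid = os.getpgid(child.pid)
        suspend_job(pgid)
        stopped = wait_until(child.pid, lambda s: s == "T")  # "T" means stopped
        resume_job(pgid)
        resumed = wait_until(child.pid, lambda s: s != "T")
        return stopped, resumed
    finally:
        # Always clean up the stand-in job, even if a check above fails.
        child.kill()
        child.wait()
```

Calling demo() returns the state letters seen after the stop and after
the continue.  The sketch only covers the stop/continue step; it does not
touch CPU placement or domains, and how a job's tasks are really
identified is still the open question Simon raises.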