From: Hubertus F. <fr...@us...> - 2001-02-09 20:28:18
John,

Regarding your message and my previous message: I am also looking into making
the PROC_CHANGE_PENALTY a function of the pool. So within a cpu-pool the
PROC_CHANGE_PENALTY is set to "A", and to CPUs outside the pool the
PROC_CHANGE_PENALTY is set to "Y*A + B". CPU sets and different
PROC_CHANGE_PENALTY values are extremely easy to code in our current MQ
scheduler.

Have you managed to get it running on a 32-way SGI machine? If so, you might
want to try the prototype of the cpu-pools we posted on the LSE site.

Hubertus Franke
Enterprise Linux Group (Mgr), Linux Technology Center (Member Scalability),
OS-PIC (Chair)
email: fr...@us...
(w) 914-945-2003   (fax) 914-945-4425   TL: 862-2003


"John Hawkes" <ha...@en...>@lists.sourceforge.net on 02/09/2001 03:10:49 PM
Sent by: lse...@li...
To: "Mike Kravetz" <mkr...@se...>, <lse...@li...>
cc:
Subject: Re: [Lse-tech] cpus_allowed in multi-queue scheduler

> Each CPU specific runqueue data structure has a
> field which contains the maximum 'non-affinity goodness'
> value of all schedulable tasks on that runqueue. Therefore,
> when we 'take a quick look' we are really only looking at
> the task with the maximum 'non-affinity goodness' on a remote
> CPU's runqueue.

Another wrinkle: specific hardware implementations may have gradations of
"non-affinity goodness", rather than a binary presumption that a process that
previously executed on cpuA will prefer to execute again on cpuA, but if not
cpuA then any other CPU is equally less good.

Suppose we have a NUMA machine consisting of two nodes, and each node contains
main memory, *two* CPUs (which we'll name cpuA and cpuB for node1, and cpuC
and cpuD for node2), and perhaps even a shared L3 cache for that node's main
memory. Suppose processX last executed on cpuA. If it re-executes on cpuA,
then we have some potential for having L1 and L2 cache blocks waiting for it,
and we expect optimum performance. And if it re-executes on cpuB, then we have
some potential for having L3 cache blocks waiting, and our performance is
almost as good as on cpuA. Re-executing on another node -- on cpuC or cpuD --
would be definitely inferior to cpuB.

Moreover, a NUMA machine that has a hierarchy of memory access penalties,
depending upon how "far away" you are from the previous-execution node's
memory, will have an even more complex "goodness" calculation.

Thus, what we need is to abstract the "goodness" calculation to allow for
architecture-specific differences.

John Hawkes
ha...@en...

_______________________________________________
Lse-tech mailing list
Lse...@li...
http://lists.sourceforge.net/lists/listinfo/lse-tech
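
For readers following the thread, a minimal sketch of the pool-aware penalty
Hubertus describes above ("A" inside the cpu-pool, "Y*A + B" outside it).
The helper pool_of() and the three constants are hypothetical, not identifiers
from the posted cpu-pools prototype; the stock 2.4 goodness() calculation
simply adds a flat PROC_CHANGE_PENALTY when a task stays on its previous CPU.

    /*
     * Illustrative sketch only: a pool-dependent migration penalty.
     * pool_of() and the constants below are assumed for the example.
     */

    #define POOL_PENALTY_A   15  /* cost of moving within the same cpu-pool */
    #define POOL_PENALTY_Y    2  /* multiplier applied when leaving the pool */
    #define POOL_PENALTY_B   10  /* additive cost for leaving the pool */

    /* hypothetical helper: map a logical CPU number to its cpu-pool id */
    extern int pool_of(int cpu);

    /*
     * Penalty charged when considering moving a task that last ran on
     * prev_cpu onto this_cpu; a larger value makes the scheduler more
     * reluctant to move it.
     */
    static inline int proc_change_penalty(int prev_cpu, int this_cpu)
    {
            if (pool_of(prev_cpu) == pool_of(this_cpu))
                    return POOL_PENALTY_A;                /* "A"       */
            return POOL_PENALTY_Y * POOL_PENALTY_A
                    + POOL_PENALTY_B;                     /* "Y*A + B" */
    }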
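
Similarly, a hedged sketch of the graded, architecture-specific affinity bonus
John is asking for: the previous CPU is best, another CPU on the same node
(sharing L3 and local memory) is nearly as good, and a remote node gets no
bonus. cpu_to_node() and the bonus values are assumptions for illustration,
not an interface taken from the 2.4 scheduler or the MQ patches.

    /*
     * Illustrative sketch: a graded, per-architecture affinity bonus in
     * place of the binary same-CPU check.
     */

    #define AFFINITY_SAME_CPU   20  /* L1/L2 likely still warm */
    #define AFFINITY_SAME_NODE  12  /* shared L3 / local memory: almost as good */
    #define AFFINITY_OTHER_NODE  0  /* remote node: no cache or memory advantage */

    /* hypothetical topology helper: map a logical CPU to its NUMA node */
    extern int cpu_to_node(int cpu);

    /*
     * Bonus added to a task's goodness when it last ran on prev_cpu and is
     * being considered for candidate_cpu.  An architecture with a deeper
     * memory hierarchy could add further levels here.
     */
    static inline int affinity_bonus(int prev_cpu, int candidate_cpu)
    {
            if (candidate_cpu == prev_cpu)
                    return AFFINITY_SAME_CPU;
            if (cpu_to_node(candidate_cpu) == cpu_to_node(prev_cpu))
                    return AFFINITY_SAME_NODE;
            return AFFINITY_OTHER_NODE;
    }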