Paul Dorwin wrote:
> I am currently working on this issue. User-space access
> is being developed. The user space access will be via driverfs. I am
> currently working on a patch which places NUMA nodes and CPUs in
> driverfs. Each node also exports a bitmap of CPUs on the node. I plan
> to put it out as an [RFC] soon. I would be interested to know what
> other information would be useful in userspace. I will also be
> working on a C API for ease of programming.
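For the curious, user-space consumption might end up looking
roughly like the sketch below. The mount point, path and file
name are pure guesswork on my part, since the patch is not
posted yet:

/*
 * Illustration only: assumes each node directory in driverfs
 * exports a "cpumap" file containing a hex CPU bitmap.
 */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/devices/node/node0/cpumap", "r");	/* hypothetical path */
	unsigned long map;
	int cpu;

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fscanf(f, "%lx", &map) != 1) {
		fprintf(stderr, "unexpected cpumap format\n");
		fclose(f);
		return 1;
	}
	fclose(f);

	for (cpu = 0; cpu < (int)(8 * sizeof(map)); cpu++)
		if (map & (1UL << cpu))
			printf("node 0 contains cpu %d\n", cpu);
	return 0;
}
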
I trust that the C API will be for use by kernel services, not
applications. Is that correct? I see no use for applications
to manipulate what amounts to an image of the system topology.
Maybe access it, but not modify it. A simple get, get-next,
get-prev API would do the trick.
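Something along these lines is all I have in mind -- a read-only
walk of the topology image. Every name here is invented, just to
make the shape concrete:

/* Sketch of a read-only get/get-next/get-prev interface; all
 * identifiers are hypothetical. */
struct topo_node;			/* opaque handle into the topology image */

const struct topo_node *topo_get(int nodeid);
const struct topo_node *topo_get_next(const struct topo_node *node);
const struct topo_node *topo_get_prev(const struct topo_node *node);

/* read-only accessors; callers never modify the image */
int topo_node_id(const struct topo_node *node);
unsigned long topo_node_cpumap(const struct topo_node *node);
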
> I also plan to implement kernel access. However, I have not yet
> started on this. I'll coordinate with Mike Kravetz's NUMA scheduler
> enhancements. I'm thinking that what is needed are
> internal structures to represent CPUs/nodes and distances.
Yes, as long as the view includes processors, memory blocks,
I/O buses and whatever else may be node-attached.
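In other words, something shaped roughly like this (names and
fields invented, just to show everything node-attached hanging
off the node):

struct numa_memblk;			/* node-local memory block */
struct numa_iobus;			/* node-attached I/O bus   */

struct numa_node {
	int			id;
	unsigned long		cpumap;		/* CPUs on this node */
	struct numa_memblk	*memblks;
	struct numa_iobus	*iobuses;
};
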
> For distances, I was thinking of a per-cpu array of distances to
> each other CPU. Then a modified numa_sched_init() which populates
> your cpu_sets in an arch independent manner.
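If I read that right, it amounts to something like this, with
invented names -- one distance row per CPU, filled in by arch
code and consumed by a modified numa_sched_init():

#define MAX_CPUS	32	/* placeholder */

/* cpu_distance[i][j]: cost of cpu i reaching cpu j */
static unsigned char cpu_distance[MAX_CPUS][MAX_CPUS];
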
You will have an awful lot of arrays hanging around, what with
per-CPU arrays describing distances to all other CPUs, all memory
blocks, and all I/O buses. Since the topology is node-oriented,
a better idea might be to make the arrays node-based, describing
distance/latency/whatever between all nodes, and then populate
cpu_sets (and, eventually, similar constructs for memory and I/O)
from those much smaller arrays.
An exception would be hyperthreading, but that is easy to handle
as long as we remember that some future processor may have more
than two logical CPUs per physical package.
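To make the node-based alternative concrete, here is a rough
sketch with invented names and sizes -- one small node-to-node
matrix, from which per-CPU sets (and later memory/I/O
equivalents) can be derived:

#define MAX_NODES	8	/* placeholder */
#define MAX_CPUS	32	/* placeholder */

static unsigned char node_distance[MAX_NODES][MAX_NODES];
static int cpu_to_node[MAX_CPUS];

/*
 * CPUs whose node is within max_dist of the given CPU's node.
 * Something like this could populate the scheduler's cpu_sets in
 * an arch-independent way.
 */
static unsigned long cpus_within(int cpu, unsigned char max_dist)
{
	unsigned long mask = 0;
	int self = cpu_to_node[cpu];
	int other;

	for (other = 0; other < MAX_CPUS; other++)
		if (node_distance[self][cpu_to_node[other]] <= max_dist)
			mask |= 1UL << other;
	return mask;
}
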
In performance tuning you should know when to stop.
In systems design you must know when not to...
- Plato (maybe)