>>>>> On Wed, 26 Dec 2001 16:26:10 -0800 (PST), Kanoj Sarcar <kanojsarcar@...> said:
Kanoj> Just a quick comment ... if what you mean by fast per cpu
Kanoj> data is having a percpu data structure mapped to the same
Kanoj> virtual address but different physical addresses in different
Kanoj> cpus, then I would like to point out that implementing this
Kanoj> would mean eating up an entire tlb entry just for this
Kanoj> mapping, thus reducing tlb reach and possibly increasing tlb
Kanoj> pollution. Thus, this should be done so that each
Kanoj> architecture can choose to go with the current method
Kanoj> (indexing via smp_processor_id()), if they think the benefit
Kanoj> of this approach is minimal.
What I'm looking for is an efficient implementation of the ia64
equivalent of local_cpu_data. On ia64, we implement this as:

	#define local_cpu_data ((struct cpuinfo_ia64 *) PERCPU_ADDR)

where PERCPU_ADDR happens to be an address that is pinned in the TLB.
However, other implementations are possible (e.g., special-casing in
a software TLB miss handler, special-casing the first level of the
page table tree, etc.) and the optimal implementation is definitely
architecture-specific. I'm not sure what the best solution for x86
is, but Linus doesn't want this:

	#define local_cpu_data (&cpu_data[smp_processor_id()])

because he thinks that the overhead of calling smp_processor_id() once
per use of local_cpu_data is too much (personally, I doubt the extra
overhead is measurable given how register-constrained x86 is, but
that's admittedly just an educated guess).
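
To make the two options concrete, here is a minimal user-space sketch
of both styles. It is only an illustration, not kernel code: NR_CPUS,
struct cpuinfo, percpu_window, run_on_cpu() and the _indexed/_fixed
macro suffixes are all made up for this example, smp_processor_id() is
a stub, and the "pinned" mapping is emulated with a plain pointer that
gets switched whenever the simulated CPU changes.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

struct cpuinfo {	/* hypothetical stand-in for struct cpuinfo_ia64 */
	unsigned long counter;
};

static struct cpuinfo cpu_data[NR_CPUS];

/* User-space stand-ins for "which CPU am I on" and for the fixed
 * per-CPU window (PERCPU_ADDR on ia64 is a pinned virtual address;
 * here it is just a pointer we repoint by hand). */
static int current_cpu;
static struct cpuinfo *percpu_window = &cpu_data[0];

/* Stub: the real smp_processor_id() is provided by the kernel. */
static int smp_processor_id(void)
{
	return current_cpu;
}

/* Style 1: index the global array on every use. */
#define local_cpu_data_indexed	(&cpu_data[smp_processor_id()])

/* Style 2: always go through one fixed location whose backing
 * storage differs per CPU. */
#define local_cpu_data_fixed	(percpu_window)

/* Emulate running on a given CPU: both views must then agree. */
static void run_on_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	current_cpu = cpu;
	percpu_window = &cpu_data[cpu];
}
```

After run_on_cpu(n), both macros resolve to &cpu_data[n]. The only
difference is how the address is computed at each use: an index
lookup plus array arithmetic in the first style, a single fixed
address in the second.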
--david