RE: [PATCH 09/10] percpu: implement new dynamic percpu allocator

From: Luck, Tony
Date: Tue Feb 24 2009 - 12:41:57 EST


> > IA64 uses a pinned TLB entry to map this cpu's 64k at __phys_per_cpu_start.
> > See __ia64_per_cpu_var() in arch/ia64/include/asm/percpu.h. This means they
> > can also optimize cpu_local_* and read_cpuvar (or whatever it's called now).
> > IIUC IA64 needs this region internally, using it for percpu vars is a bonus.

Something like that ...

ia64 started out with a pinned TLB entry to map the percpu space to the
top 64K of address space (so that the compiler can generate ld/st instructions
with a small negative offset from register r0 to access local-to-this-cpu
objects).
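
A rough sketch of that address arithmetic, in case it helps. This is
user-space C for illustration, not kernel code: the names
PERCPU_PAGE_SIZE, percpu_local_addr() and percpu_disp_from_zero() are
made up for this example, and it only assumes what is said above (a 64K
window at the very top of a 64-bit address space, reached from a base
register that always reads as zero).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constant: the 64K per-cpu window described above. */
#define PERCPU_PAGE_SIZE 0x10000ULL

/*
 * Hypothetical helper: virtual address of the local cpu's copy of an
 * object that lives `offset` bytes into the per-cpu area.  The window
 * occupies the top 64K of the address space, so its base is
 * 0 - PERCPU_PAGE_SIZE in unsigned 64-bit arithmetic.
 */
static uint64_t percpu_local_addr(uint64_t offset)
{
	assert(offset < PERCPU_PAGE_SIZE);
	return (uint64_t)0 - PERCPU_PAGE_SIZE + offset;
}

/*
 * Hypothetical helper: the same location expressed as a signed
 * displacement from a base of zero.  It is always in [-65536, -1],
 * which is why a small negative offset is enough.
 */
static int64_t percpu_disp_from_zero(uint64_t offset)
{
	assert(offset < PERCPU_PAGE_SIZE);
	return (int64_t)offset - (int64_t)PERCPU_PAGE_SIZE;
}
```

For example, an object 0x18 bytes into the area would sit at
0xffffffffffff0018, i.e. displacement -65512 from zero, on every cpu;
only the physical page behind the mapping differs per cpu.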

Then we started using one of the ar.k* registers to hold the base
physical address of each cpu's per-cpu area so that the early parts of
the machine check code (which run with the MMU off) can access per-cpu
variables.

Finally we found that certain transaction processing benchmarks ran faster
if we let the cpu have free access to one extra TLB entry ... so we
stopped pinning the per-cpu area, and wrote a s/w fault handler to
insert the mapping on demand (using the ar.k3 register to get the
physical address for the mapping).
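
As a toy model of the on-demand scheme (again user-space C, not the real
handler): struct sim_cpu, translate(), percpu_miss_handler() and evict()
are all invented for this sketch, and the "TLB" is reduced to a single
optional entry for the per-cpu page. The only part taken from the
description above is that the miss path reads the physical base from
ar.k3 and inserts the mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PERCPU_PAGE_SIZE 0x10000ULL
#define PERCPU_ADDR      ((uint64_t)0 - PERCPU_PAGE_SIZE)

/* Simulated per-cpu state; every field here is illustrative. */
struct sim_cpu {
	uint64_t ar_k3;     /* physical base of this cpu's per-cpu area */
	bool     mapped;    /* is the per-cpu page currently in the TLB? */
	uint64_t tlb_phys;  /* physical base held by that TLB entry */
	unsigned faults;    /* how many times the miss path has run */
};

/* Miss path: fetch the physical base from ar.k3, insert the mapping. */
static void percpu_miss_handler(struct sim_cpu *cpu)
{
	cpu->tlb_phys = cpu->ar_k3;
	cpu->mapped = true;
	cpu->faults++;
}

/* Translate an address inside the per-cpu window, faulting if needed. */
static uint64_t translate(struct sim_cpu *cpu, uint64_t vaddr)
{
	assert(vaddr >= PERCPU_ADDR);
	if (!cpu->mapped)
		percpu_miss_handler(cpu);
	return cpu->tlb_phys + (vaddr - PERCPU_ADDR);
}

/* The entry is not pinned, so the hardware may replace it at any time. */
static void evict(struct sim_cpu *cpu)
{
	cpu->mapped = false;
}
```

The point of the model: the slow ar.k3 read happens only on the miss
path, so a hit costs nothing extra, and the entry that used to be pinned
is free for whatever the workload needs.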

N.B. ar.k3 is a medium-slow register ... I wouldn't want to use it
in the code sequence for *every* per-cpu variable access.

-Tony