Re: [RFC 00/15] x86_64: Optimize percpu accesses

From: Jeremy Fitzhardinge
Date: Wed Jul 09 2008 - 13:27:44 EST


Mike Travis wrote:
> This patchset provides the following:
>
> * Cleanup: Fix early references to cpumask_of_cpu(0)
>
> Provides an early cpumask_of_cpu(0) usable before the cpumask_of_cpu_map
> is allocated and initialized.
>
> * Generic: Percpu infrastructure to rebase the per cpu area to zero
>
> This provides for the capability of accessing the percpu variables
> using a local register instead of having to go through a table
> on node 0 to find the cpu-specific offsets. It also would allow
> atomic operations on percpu variables to reduce required locking.
> Uses a new config var HAVE_ZERO_BASED_PER_CPU to indicate to the
> generic code that the arch has this new basing.
>
> (Note: split into two patches, one to rebase percpu variables at 0,
> and the second to actually use %gs as the base for percpu variables.)
>
> * x86_64: Fold pda into per cpu area
>
> Declare the pda as a per cpu variable. This will move the pda
> area to an address accessible by the x86_64 per cpu macros.
> Subtraction of __per_cpu_start will make the offset based from
> the beginning of the per cpu area. Since %gs is pointing to the
> pda, it will then also point to the per cpu variables and can be
> accessed thusly:
>
> %gs:[&per_cpu_xxxx - __per_cpu_start]
>
> * x86_64: Rebase per cpu variables to zero
>
> Take advantage of the zero-based per cpu area provided above.
> Then we can directly use the x86_32 percpu operations. x86_32
> offsets %fs by __per_cpu_start. x86_64 has %gs pointing directly
> to the pda and the per cpu area thereby allowing access to the
> pda with the x86_64 pda operations and access to the per cpu
> variables using x86_32 percpu operations.
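
For anyone following the description of the zero-based layout, here is a rough user-space model of the addressing. It is only a sketch under stated assumptions, not code from the patchset: the array areas[] stands in for the per-CPU memory that %gs would point at, and gs_base(), PERCPU_OFFSET() and the irq_count variable are names made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define NR_CPUS 4

/*
 * Hypothetical per-cpu layout: the pda is folded in at offset 0 and
 * ordinary per-cpu variables follow it in the same area.
 */
struct percpu_area {
	struct {
		int cpunumber;
		unsigned long kernelstack;
	} pda;
	int irq_count;		/* made-up example per-cpu variable */
};

/* One copy of the area per CPU; in the kernel these are set up at boot. */
static struct percpu_area areas[NR_CPUS];

/* Stand-in for the %gs base of a given CPU (an assumption of this model). */
static char *gs_base(int cpu)
{
	return (char *)&areas[cpu];
}

/* Zero-based offset, playing the role of &per_cpu_xxxx - __per_cpu_start. */
#define PERCPU_OFFSET(member) offsetof(struct percpu_area, member)

/* Models a load from %gs:[offset] on the given CPU. */
static int read_irq_count(int cpu)
{
	int v;

	memcpy(&v, gs_base(cpu) + PERCPU_OFFSET(irq_count), sizeof(v));
	return v;
}

/* Models a store to %gs:[offset] on the given CPU. */
static void write_irq_count(int cpu, int v)
{
	memcpy(gs_base(cpu) + PERCPU_OFFSET(irq_count), &v, sizeof(v));
}
```

Because the pda sits at offset 0 of the same area, one base register reaches both the pda fields and the other per-cpu variables; no per-CPU offset table lookup is needed in this model.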

The bulk of this series is pda_X to x86_X_percpu conversion. This seems like pointless churn to me; there's nothing inherently wrong with the pda_X interfaces, and doing this transformation doesn't get us any closer to unifying 32-bit and 64-bit.

I think we should start devolving things out of the pda in the other direction: make a series where each patch takes a member of struct x8664_pda, converts it to a per-cpu variable (where possible, the same one that 32-bit uses), and updates all the references accordingly. When the pda is as empty as it can be, we can look at removing the pda-specific interfaces.
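
As a rough illustration of what one patch in such a series might look like, here is a user-space model of moving a single field out of the pda. Everything prefixed model_ is hypothetical and exists only for this sketch; the real kernel uses its own pda and per-cpu accessors, and the two-field struct is a cut-down stand-in for struct x8664_pda, not its actual layout.

```c
#define NR_CPUS 4

/* Before: the cpu number lives in the (reduced, hypothetical) pda. */
struct model_pda {
	int cpunumber;
	int irqcount;
};

static struct model_pda model_pda_before[NR_CPUS];

/* Models a pda_X style accessor; takes the cpu explicitly in user space. */
#define model_read_pda(cpu, field) (model_pda_before[(cpu)].field)

/*
 * After: cpunumber becomes a standalone per-cpu variable, ideally the
 * same one 32-bit already uses, and the pda loses that member.
 */
struct model_pda_after {
	int irqcount;
};

static int model_percpu_cpu_number[NR_CPUS];

/* Models a generic per-cpu accessor shared by 32-bit and 64-bit. */
#define model_read_percpu(cpu, var) (model_percpu_##var[(cpu)])

/* Fill in both representations so callers can be compared. */
static void model_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		model_pda_before[cpu].cpunumber = cpu;
		model_percpu_cpu_number[cpu] = cpu;
	}
}

/* A caller before the conversion ... */
static int cpu_id_before(int cpu)
{
	return model_read_pda(cpu, cpunumber);
}

/* ... and the same caller after it, now free of any pda reference. */
static int cpu_id_after(int cpu)
{
	return model_read_percpu(cpu, cpu_number);
}
```

Each such patch shrinks the pda by one member and updates only that member's users, so the pda-specific accessors lose callers gradually instead of being renamed wholesale.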

J