Re: [PATCH RESEND percpu#for-next] percpu: align percpu readmostly subsection to cacheline

From: Sam Ravnborg
Date: Mon Dec 27 2010 - 15:43:25 EST


On Mon, Dec 27, 2010 at 02:37:19PM +0100, Tejun Heo wrote:
> Currently percpu readmostly subsection may share cachelines with other
> percpu subsections which may result in unnecessary cacheline bounce
> and performance degradation.
>
> This patch adds a @cacheline parameter to the PERCPU() and PERCPU_VADDR()
> linker macros, makes each arch's linker script specify its cacheline
> size, and uses it to align the percpu subsections.
>
> This is based on Shaohua's x86-only patch.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Shaohua Li <shaohua.li@xxxxxxxxx>
> ---
> Shaohua, can you please verify this achieves the same result? If no
> one objects, I'll route it through the percpu tree.
>
> Thanks.
>
> (arch ML address was wrong, resending with the correct one)
>
> arch/alpha/kernel/vmlinux.lds.S    |    2 +-
> arch/arm/kernel/vmlinux.lds.S      |    2 +-
> arch/blackfin/kernel/vmlinux.lds.S |    2 +-
> arch/cris/kernel/vmlinux.lds.S     |    2 +-
> arch/frv/kernel/vmlinux.lds.S      |    2 +-
> arch/ia64/kernel/vmlinux.lds.S     |    2 +-
> arch/m32r/kernel/vmlinux.lds.S     |    2 +-
> arch/mips/kernel/vmlinux.lds.S     |    2 +-
> arch/mn10300/kernel/vmlinux.lds.S  |    2 +-
> arch/parisc/kernel/vmlinux.lds.S   |    2 +-
> arch/powerpc/kernel/vmlinux.lds.S  |    2 +-
> arch/s390/kernel/vmlinux.lds.S     |    2 +-
> arch/sh/kernel/vmlinux.lds.S       |    2 +-
> arch/sparc/kernel/vmlinux.lds.S    |    2 +-
> arch/tile/kernel/vmlinux.lds.S     |    2 +-
> arch/um/include/asm/common.lds.S   |    2 +-
> arch/x86/kernel/vmlinux.lds.S      |    4 ++--
> arch/xtensa/kernel/vmlinux.lds.S   |    2 +-
> include/asm-generic/vmlinux.lds.h  |   35 ++++++++++++++++++++++-------------
> 19 files changed, 41 insertions(+), 32 deletions(-)
>
> diff --git a/arch/alpha/kernel/vmlinux.lds.S b/arch/alpha/kernel/vmlinux.lds.S
> index 003ef4c..173518f 100644
> --- a/arch/alpha/kernel/vmlinux.lds.S
> +++ b/arch/alpha/kernel/vmlinux.lds.S
> @@ -38,7 +38,7 @@ SECTIONS
>  	__init_begin = ALIGN(PAGE_SIZE);
>  	INIT_TEXT_SECTION(PAGE_SIZE)
>  	INIT_DATA_SECTION(16)
> -	PERCPU(PAGE_SIZE)
> +	PERCPU(64, PAGE_SIZE)
>  	/* Align to THREAD_SIZE rather than PAGE_SIZE here so any padding page
>  	   needed for the THREAD_SIZE aligned init_task gets freed after init */
>  	. = ALIGN(THREAD_SIZE);
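
For context, the core of the change lives in include/asm-generic/vmlinux.lds.h,
which is not quoted above. Reconstructed from the patch description, the new
PERCPU() macro should look roughly like this (a sketch, so details may differ
from the actual hunk):

#define PERCPU(cacheline, align)					\
	. = ALIGN(align);						\
	.data..percpu : AT(ADDR(.data..percpu) - LOAD_OFFSET) {		\
		VMLINUX_SYMBOL(__per_cpu_load) = .;			\
		VMLINUX_SYMBOL(__per_cpu_start) = .;			\
		*(.data..percpu..first)					\
		. = ALIGN(PAGE_SIZE);					\
		*(.data..percpu..page_aligned)				\
		/* give ..readmostly its own cacheline(s) */		\
		. = ALIGN(cacheline);					\
		*(.data..percpu..readmostly)				\
		/* keep the next subsection off its tail */		\
		. = ALIGN(cacheline);					\
		*(.data..percpu)					\
		*(.data..percpu..shared_aligned)			\
		VMLINUX_SYMBOL(__per_cpu_end) = .;			\
	}

The two ALIGN(cacheline) bumps are what stop ..readmostly from sharing a
cacheline with the subsections on either side of it.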

It would have been better to include asm/cache.h and then use L1_CACHE_BYTES,
as the value differs for EV4.
Hardcoding 64 still works, since that is the larger of the two.
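
Untested sketch of what I mean, assuming the usual alpha asm/cache.h values
(L1_CACHE_BYTES is 32 on EV4/EV5 and 64 on EV6; context lines from memory):

--- a/arch/alpha/kernel/vmlinux.lds.S
+++ b/arch/alpha/kernel/vmlinux.lds.S
@@ ... @@
 #include <asm/thread_info.h>
+#include <asm/cache.h>
@@ ... @@
-	PERCPU(64, PAGE_SIZE)
+	PERCPU(L1_CACHE_BYTES, PAGE_SIZE)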

It looks like we could do this for almost all archs.
But then I am not sure that L1_CACHE_BYTES matches the
relevant cacheline size on all the different archs.

Sam