Re: 2.6.6-rc3-mm2

From: Arnd Bergmann
Date: Wed May 05 2004 - 13:02:01 EST


On Wednesday 05 May 2004 17:33, Christoph Hellwig wrote:
> On Wed, May 05, 2004 at 01:31:35AM -0700, Andrew Morton wrote:
> >static-define_per_cpu-vs-modules-2.patch
> >
> > Work around relative address displacement problems on s390.
>
> I'm not happy with this one. It prevents perfectly valid optimizations
> that become more important with modern compilers because of s390 brokenness.
>
> Of the options listed in the patch description (why the heck didn't it ever
> make it to a mailinglist??) the options to avoid the static effect in s390
> code looks most appealing but hard to implement to me, and if it doesn't
> work out we'll probably have to do the STATIC_DEFINE_PER_CPU variant.

When I prepared that patch, I was certain that the bug was not s390
specific, as it now turned out to be. The reason is that only on s390, the
kernel virtual address space for builtin code is the same as the physical
address space while modules are loaded to the vmalloc area, which may be >32
bit apart.

For exported symbols, we work around this by compiling modules with -fpic,
which does not work for per_cpu relocations.

Martin was now able to do a similar workaround for these by loading the
address through inline assembly without affecting the other architectures,
following a suggestion by Richard Henderson. This is his patch:

Force the use of a 64 bit relocation to access the per_cpu__##var
variables and fix a problem in the module loader regarding GOTENT and
GOTPLTENT relocs.

diff -urN linux-2.6/arch/s390/kernel/module.c linux-2.6-s390/arch/s390/kernel/module.c
--- linux-2.6/arch/s390/kernel/module.c Sun Apr 4 05:38:13 2004
+++ linux-2.6-s390/arch/s390/kernel/module.c Wed May 5 19:40:22 2004
@@ -277,7 +277,8 @@
*(unsigned int *) loc = val;
else if (r_type == R_390_GOTENT ||
r_type == R_390_GOTPLTENT)
- *(unsigned int *) loc = val >> 1;
+ *(unsigned int *) loc =
+ (val + (Elf_Addr) me->module_core - loc) >> 1;
else if (r_type == R_390_GOT64 ||
r_type == R_390_GOTPLT64)
*(unsigned long *) loc = val;
diff -urN linux-2.6/include/asm-s390/percpu.h linux-2.6-s390/include/asm-s390/percpu.h
--- linux-2.6/include/asm-s390/percpu.h Sun Apr 4 05:38:20 2004
+++ linux-2.6-s390/include/asm-s390/percpu.h Wed May 5 19:40:22 2004
@@ -5,10 +5,26 @@
#include <asm/lowcore.h>

/*
- * s390 uses the generic implementation for per cpu data, with the exception that
- * the offset of the cpu local data area is cached in the cpu's lowcore memory
+ * For builtin kernel code s390 uses the generic implementation for
+ * per cpu data, with the exception that the offset of the cpu local
+ * data area is cached in the cpu's lowcore memory
+ * For 64 bit module code s390 forces the use of a GOT slot for the
+ * address of the per cpu variable. This is needed because the module
+ * may be more than 4G above the per cpu area.
*/
+#if defined(__s390x__) && defined(MODULE)
+#define __get_got_cpu_var(var,offset) \
+ (*({ unsigned long *__ptr; \
+ asm ( "larl %0,per_cpu__"#var"@GOTENT" : "=a" (__ptr) ); \
+ ((typeof(&per_cpu__##var))((*__ptr) + offset)); \
+ }))
+#undef __get_cpu_var
+#define __get_cpu_var(var) __get_got_cpu_var(var,S390_lowcore.percpu_offset)
+#undef per_cpu
+#define per_cpu(var,cpu) __get_got_cpu_var(var,__per_cpu_offset[cpu])
+#else
#undef __get_cpu_var
#define __get_cpu_var(var) (*RELOC_HIDE(&per_cpu__##var, S390_lowcore.percpu_offset))
+#endif

#endif /* __ARCH_S390_PERCPU__ */

Attachment: pgp00000.pgp
Description: signature