Re: [PATCH] ARM: Do not allow unaligned accesses whenCONFIG_ALIGNMENT_TRAP

From: Will Deacon
Date: Thu May 26 2011 - 13:15:49 EST


Right... [adding bunch of people to CC],


On Wed, 2011-05-25 at 15:50 +0100, Catalin Marinas wrote:
> 2011/5/25 MÃns RullgÃrd <mans@xxxxxxxxx>:
> > Dave Martin <dave.martin@xxxxxxxxxx> writes:
> >
> >> On Wed, May 25, 2011 at 12:14:08PM +0100, Catalin Marinas wrote:
> >>> On Tue, May 24, 2011 at 06:13:31PM +0100, Dave Martin wrote:
> >>> > On Tue, May 24, 2011 at 04:26:35PM +0100, Catalin Marinas wrote:
> >>> > > BTW, are we sure that the code generated with unaligned accesses is
> >>> > > better? AFAIK, while processors support unaligned accesses, they may
> >>> > > not always be optimal.
> >>> >
> >>> > The code gcc generates to synthesise an unaligned access using aligned
> >>> > accesses is pretty simplistic:
> >>> ...
> >>> > For code which natively needs to read unaligned fields from data structures,
> >>> > I sincerely doubt that the CPU will not beat the above code for efficiency...
> >>> >
> >>> > So if there's code doing unaligned access to data structures for a good
> >>> > reason, building with unaligned access support turned on in the compiler
> >>> > seems a good idea, if that code might performance-critical for anything.
> >>>
> >>> gcc generates unaligned accesses in the the pcpu_dump_alloc_info()
> >>> function. We have a local variable like below (9 bytes):
> >>>
> >>> char empty_str[] = "--------";
> >>>
> >>> and it looks like other stack accesses are unaligned:
> >>>
> >>> c0082ba0 <pcpu_dump_alloc_info>:
> >>> c0082ba0: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
> >>> c0082ba4: e3074118 movw r4, #28952 ; 0x7118
> >>> c0082ba8: e24dd04c sub sp, sp, #76 ; 0x4c
> >>> c0082bac: e34c402a movt r4, #49194 ; 0xc02a
> >>> c0082bb0: e58d1014 str r1, [sp, #20]
> >>> c0082bb4: e58d0020 str r0, [sp, #32]
> >>> c0082bb8: e8b40003 ldm r4!, {r0, r1}
> >>> c0082bbc: e58d003f str r0, [sp, #63] <----------- !!!!!
> >>> c0082bc0: e59d0014 ldr r0, [sp, #20]
> >>> c0082bc4: e5d43000 ldrb r3, [r4]
> >>>
> >>> I haven't tried with -mno-unaligned-access, I suspect the variables on
> >>> the stack would be aligned.
> >>
> >> So, it looks like empty_str may be misaligned on the stack, and the compiler
> >> is doing a misaligned store when initialising it.
> >
> > empty_str has type char[] so there are no alignment requirements.
>
> I think the local variables after char empty_str[] are unaligned (int
> alloc). Changing the array size to 16 solves the issue.
>
> The gcc guys here in ARM will have a look and I'll get back to you.

This issue seems to be caused by passing -fconserve-stack to GCC. This
was added in 8f7f5c9f ("kbuild: set -fconserve-stack option for gcc
4.5") and as you can see from the archive:

http://lkml.org/lkml/2009/9/20/39

it was thought to only have an impact on inlining decisions. Looking at
the documentation for GCC 4.6:

-fconserve-stack
Attempt to minimize stack usage. The compiler will attempt to
use less stack space, even if that makes the program slower. This option
implies setting the âlarge-stack-frameâ parameter to 100 and the
âlarge-stack-frame-growthâ parameter to 400.

So it sounds like we might not want to enable this blindly across all
architectures. Indeed, on ARM, it encourages the compiler to pack
variables on the stack which leads to the weird and wonderful alignment
situation that has been encountered in this thread.

Can we remove -fconserve-stack from the top-level Makefile (or at least
make it conditional by architecture)?

Will

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/