Re: [PATCH v5 04/27] x86/fpu/xstate: Add XSAVES system states for shadow stack

From: Matthew Wilcox
Date: Thu Nov 08 2018 - 19:32:37 EST


On Thu, Nov 08, 2018 at 03:35:02PM -0800, Dave Hansen wrote:
> On 11/8/18 2:00 PM, Matthew Wilcox wrote:
> > struct a {
> > 	char c;
> > 	struct b b;
> > };
> >
> > we want struct b to start at offset 8, but with __packed, it will start
> > at offset 1.
>
> You're talking about how we want the struct laid out in memory if we
> have control over the layout. I'm talking about what happens if
> something *else* tells us the layout, like a hardware specification
> which is what is in play with the XSAVE instruction dictated layout
> that's in question here.
>
> What I'm concerned about is a structure like this:
>
> struct foo {
> 	u32 i1;
> 	u64 i2;
> };
>
> If we leave that to natural alignment, we end up with a 16-byte
> structure laid out like this:
>
> 0-3 i1
> 3-8 alignment gap
> 8-15 i2

I know you actually meant:

0-3 i1
4-7 pad
8-15 i2
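
For anyone who wants to check that layout, here is a minimal user-space
sketch. It assumes an ABI where a 64-bit integer is 8-byte aligned
(x86-64, for instance; 32-bit x86 aligns it to 4 inside a struct and
would give different numbers). uint32_t/uint64_t stand in for the
kernel's u32/u64, and the foo_natural names are made up for the
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the unpacked "struct foo" quoted above. */
struct foo_natural {
	uint32_t i1;	/* bytes 0-3 */
			/* bytes 4-7: padding inserted by the compiler */
	uint64_t i2;	/* bytes 8-15, assuming 8-byte alignment of uint64_t */
};

/* Offset of i2 within the struct. */
static size_t foo_natural_i2_offset(void)
{
	return offsetof(struct foo_natural, i2);
}

/* Total size, including the padding after i1. */
static size_t foo_natural_size(void)
{
	return sizeof(struct foo_natural);
}
```

On such an ABI the two helpers should report an offset of 8 and a size
of 16, which matches the corrected table.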

> Which isn't what we want. We want a 12-byte structure, laid out like this:
>
> 0-3 i1
> 4-11 i2
>
> Which we get with:
>
> struct foo {
> 	u32 i1;
> 	u64 i2;
> } __packed;

But we _also_ get pessimised accesses to i1 and i2, because gcc can't
rely on struct foo being aligned to a 4- or even 8-byte boundary (it
might be embedded in "struct a" from above).
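
A rough sketch of why the compiler has to be pessimistic, assuming gcc
or clang, where the kernel's __packed is spelled
__attribute__((packed)), and a C11 compiler for _Alignof. The struct
and helper names are hypothetical; the container is modelled on
"struct a" above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packed variant: no padding, and the alignment requirement drops to 1. */
struct foo_packed {
	uint32_t i1;	/* bytes 0-3 */
	uint64_t i2;	/* bytes 4-11 */
} __attribute__((packed));

/* Hypothetical container, modelled on "struct a" from earlier. */
struct a_with_foo {
	char c;
	struct foo_packed foo;	/* can start at offset 1 */
};

/* Size of the packed struct with the padding removed. */
static size_t foo_packed_size(void)
{
	return sizeof(struct foo_packed);
}

/* Alignment the compiler is allowed to assume for the packed struct. */
static size_t foo_packed_align(void)
{
	return _Alignof(struct foo_packed);
}

/* Where the packed struct lands when it follows a single char. */
static size_t foo_offset_in_a(void)
{
	return offsetof(struct a_with_foo, foo);
}
```

With an alignment of 1, foo may legitimately sit at an odd address
inside the container, so the compiler cannot assume that i1 or i2 is
naturally aligned. On x86 that mostly costs nothing visible; on
architectures without cheap unaligned loads it may turn into
byte-by-byte accesses.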

> Now, looking at Yu-cheng's specific example, it doesn't matter. We've
> got 64-bit types and natural 64-bit alignment. Without __packed, we
> need to look out for natural alignment screwing us up. With __packed,
> it just does what it *looks* like it does.

The question is whether Yu-cheng's struct is ever embedded in another
struct. And if so, what does the hardware do?