x86: xsave/xrstor support; ucontext_t extensions

From: H. Peter Anvin
Date: Thu Jun 05 2008 - 20:39:50 EST


Sorry for not getting back on this for so long.

I have looked at the XSAVE architecture, and it is pretty darn hideous, mostly because it doesn't describe itself in the absence of CPUID information. Given that, it would have been much better if there had been separate invocations of XSAVE for each substate region. On the other hand, normalizing to the current CPU format is probably desirable anyway.

I would like to make this proposal for the signal frames (again, flagged with a uc_flags bit for RT frames):

- The SW-reserved areas at the bottom of the FXSAVE region will be used as follows:
  - A magic number (M1)
  - A length pointer (L1), giving the length of the entire XSAVE region.

- At the end of the XSAVE region, i.e. at the offset given by the length pointer, we create a secondary structure looking something like this:

  - Magic number (M2)
  - Descriptor count (DC)
  - DC * <EBX, EAX> from CPUID leaf 0Dh
  - Possibly a checksum or CRC of this structure
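
As a rough illustration of the layout described above, here is a minimal C sketch. All names (xsig_head, xsig_desc, xsig_tail_hdr, xsig_build_tail), both magic values, and the byte-sum checksum are hypothetical; the proposal does not fix any of them. The only outside facts assumed are that bytes 464..511 of the FXSAVE image are available to software, and that CPUID leaf 0Dh sub-leaves report a region's size in EAX and its offset in EBX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values; the proposal does not pick actual magic numbers. */
#define XSIG_MAGIC1 0x58535631u
#define XSIG_MAGIC2 0x58535632u

/* Bytes 464..511 of the 512-byte FXSAVE image are available to software. */
#define FXSAVE_SW_OFFSET 464u

/* Head record (M1, L1) stored in the FXSAVE software-reserved bytes. */
struct xsig_head {
    uint32_t magic1;     /* M1 */
    uint32_t xsave_len;  /* L1: length of the whole XSAVE region */
};

/* One <EBX, EAX> pair from a CPUID leaf 0Dh sub-leaf. */
struct xsig_desc {
    uint32_t offset;     /* EBX: offset of the region in the XSAVE area */
    uint32_t size;       /* EAX: size of the region in bytes */
};

/* Fixed part of the tail placed at offset L1. */
struct xsig_tail_hdr {
    uint32_t magic2;     /* M2 */
    uint32_t desc_count; /* DC */
};

static_assert(sizeof(struct xsig_head) <= 512 - FXSAVE_SW_OFFSET,
              "head must fit in the software-reserved bytes");
static_assert(sizeof(struct xsig_desc) == 8,
              "descriptor is two 32-bit words");

/* Bytes needed for a tail with dc descriptors plus a 32-bit checksum. */
static size_t xsig_tail_size(uint32_t dc)
{
    return sizeof(struct xsig_tail_hdr)
         + (size_t)dc * sizeof(struct xsig_desc)
         + sizeof(uint32_t);
}

/* Placeholder checksum: 32-bit sum of bytes. A CRC could be used instead. */
static uint32_t xsig_sum(const uint8_t *p, size_t n)
{
    uint32_t s = 0;

    while (n--)
        s += *p++;
    return s;
}

/* Serialize the tail into out; returns bytes written, 0 if it does not fit. */
static size_t xsig_build_tail(uint8_t *out, size_t cap,
                              const struct xsig_desc *d, uint32_t dc)
{
    struct xsig_tail_hdr hdr = { XSIG_MAGIC2, dc };
    size_t body = sizeof hdr + (size_t)dc * sizeof *d;
    uint32_t sum;

    if (xsig_tail_size(dc) > cap)
        return 0;
    memcpy(out, &hdr, sizeof hdr);
    if (dc)
        memcpy(out + sizeof hdr, d, (size_t)dc * sizeof *d);
    sum = xsig_sum(out, body);
    memcpy(out + body, &sum, sizeof sum);
    return body + sizeof sum;
}
```

Since the descriptor list depends only on the CPU and the kernel, a function like xsig_build_tail would be called once and its output reused for every signal frame.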

Note that this tail structure will always be the same on a given kernel, so it can be pre-canned at boot time. This tail structure serves two purposes:

- It can be used to verify against truncation of the state.
  (I.e. if an XSAVE-unaware application tries to copy and save away
  a state and later restore it, but only copies the first 512 bytes
  and later just puts a pointer to it.)
- It can be used to verify against an alien state (saved and restored
  from another CPU, or even just another kernel version with different
  support.)

If there is a mismatch, we can then take appropriate action:

- No M1 or M2 signature, or L1 or DC are insane:
  -> Reinitialize any non-FXSAVE state.

- M1, M2, L1, and DC make sense, but mismatch on DC or descriptor
  offsets:
  -> Do a region-by-region copy in of the state; reinitialize any
     regions not provided.

- Mismatch on descriptor sizes:
  -> Consider that region corrupt and reinitialize?
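
The frame-level part of this decision could be sketched in C roughly as follows. Everything here is an assumption made for illustration: the names (xsig_classify, xsig_action), the magic values, the XSIG_MAX_DESC sanity limit, and the use of frame_len as "number of bytes we can actually read". A size-only mismatch is reported as a region-by-region case here, on the theory that the per-region copy is where a single region would be declared corrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values; the proposal does not pick actual magic numbers. */
#define XSIG_MAGIC1      0x58535631u
#define XSIG_MAGIC2      0x58535632u
#define FXSAVE_SIZE      512u
#define FXSAVE_SW_OFFSET 464u   /* software-available bytes 464..511 */
#define XSIG_MAX_DESC    62u    /* hypothetical sanity limit on DC */

struct xsig_desc {
    uint32_t offset;  /* EBX from CPUID leaf 0Dh */
    uint32_t size;    /* EAX from CPUID leaf 0Dh */
};

static_assert(sizeof(struct xsig_desc) == 8,
              "descriptor is two 32-bit words");

enum xsig_action {
    XSIG_RESTORE_WHOLE,    /* frame matches this kernel: whole-block copy */
    XSIG_COPY_BY_REGION,   /* signatures sane, descriptors differ */
    XSIG_REINIT_EXTENDED,  /* reinitialize any non-FXSAVE state */
};

/* Unaligned-safe 32-bit read in host byte order. */
static uint32_t rd32(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/*
 * frame/frame_len: the candidate state image and how much of it is readable.
 * kd/kdc: the descriptor list this kernel would have written itself.
 */
static enum xsig_action xsig_classify(const uint8_t *frame, size_t frame_len,
                                      const struct xsig_desc *kd, uint32_t kdc)
{
    const uint8_t *tail;
    uint32_t l1, dc, i;

    /* No M1 signature. */
    if (frame_len < FXSAVE_SIZE ||
        rd32(frame + FXSAVE_SW_OFFSET) != XSIG_MAGIC1)
        return XSIG_REINIT_EXTENDED;

    /* L1 insane, or the tail header would lie outside the readable bytes. */
    l1 = rd32(frame + FXSAVE_SW_OFFSET + 4);
    if (l1 < FXSAVE_SIZE || l1 > frame_len || frame_len - l1 < 8)
        return XSIG_REINIT_EXTENDED;

    /* No M2 signature, e.g. only the first 512 bytes were preserved. */
    tail = frame + l1;
    if (rd32(tail) != XSIG_MAGIC2)
        return XSIG_REINIT_EXTENDED;

    /* DC insane, or the descriptors do not fit in the readable bytes. */
    dc = rd32(tail + 4);
    if (dc > XSIG_MAX_DESC || (frame_len - l1 - 8) / 8 < dc)
        return XSIG_REINIT_EXTENDED;

    /* Everything is sane; any descriptor difference means an alien state. */
    if (dc != kdc)
        return XSIG_COPY_BY_REGION;
    for (i = 0; i < dc; i++) {
        if (rd32(tail + 8 + 8 * i) != kd[i].offset ||
            rd32(tail + 12 + 8 * i) != kd[i].size)
            return XSIG_COPY_BY_REGION;
    }
    return XSIG_RESTORE_WHOLE;
}
```

Checksum verification is left out of this sketch; if the tail carried one, a failed check would presumably fall into the same case as a missing M2.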

The region-by-region copy could of course be used even in the same-CPU case, if there turns out to be a negligible performance difference over whole-block copy.
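
A minimal sketch of such a region-by-region copy, under several assumptions that the proposal does not spell out: descriptor index i refers to the same state component on both sides, "reinitialize" is modelled as zero-filling the region, and the destination buffer is trusted to be large enough for every kernel-side region. The function and type names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct xsig_desc {
    uint32_t offset;  /* EBX from CPUID leaf 0Dh */
    uint32_t size;    /* EAX from CPUID leaf 0Dh */
};

static_assert(sizeof(struct xsig_desc) == 8,
              "descriptor is two 32-bit words");

/*
 * Copy each region from the saved image (src, laid out per sd[]) into the
 * current-CPU image (dst, laid out per kd[]).
 *
 * A region is copied only if the saved image provides it, its size matches
 * the kernel's size, and it lies entirely inside src. Otherwise the region
 * is zero-filled, which stands in for "reinitialize" in this sketch.
 *
 * Returns the number of regions actually copied.
 */
static unsigned int xsig_copy_regions(uint8_t *dst,
                                      const struct xsig_desc *kd, uint32_t kdc,
                                      const uint8_t *src, size_t src_len,
                                      const struct xsig_desc *sd, uint32_t sdc)
{
    unsigned int copied = 0;
    uint32_t i;

    for (i = 0; i < kdc; i++) {
        uint8_t *out = dst + kd[i].offset;

        if (kd[i].size == 0)
            continue;  /* component not present on this CPU */

        if (i < sdc &&
            sd[i].size == kd[i].size &&          /* size mismatch: corrupt */
            sd[i].offset <= src_len &&
            src_len - sd[i].offset >= sd[i].size) {
            memcpy(out, src + sd[i].offset, kd[i].size);
            copied++;
        } else {
            memset(out, 0, kd[i].size);          /* region not usable */
        }
    }
    return copied;
}
```

When sd[] and kd[] are identical this degenerates into a sequence of per-region copies over the same layout, which is the same-CPU case whose cost would need to be measured against a single whole-block copy.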

Thoughts?

-hpa
