Re: [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering

From: Chang S. Bae
Date: Fri Apr 01 2022 - 18:14:20 EST


On 4/1/2022 11:16 AM, Dave Hansen wrote:
On 4/1/22 10:58, Dave Hansen wrote:
On 3/24/22 19:22, Chang S. Bae wrote:
Some old GCC versions (4.9.x and 5.x) have an issue that incorrectly
reordering volatile asm statements with each other [1]. While this bug was
fixed on later versions (8.1, 7.3, and 6.5), and the kernel's current XCR0
read/write do not appear to be impacted, it is preventive to ensure them on
the program order.
I thought you *actually* saw xgetbv() be reordered, though. Was that on
a buggy compiler?

No, sorry, my earlier analysis was naive. The #UD was raised on the non-XGETBV1 system because Objtool didn't process fpu_state_size_dynamic() at build time. It was only when the TILERELEASE opcode was not given. Here is a bit more detail:
https://lore.kernel.org/lkml/aa3ff0f4-d74c-2c84-37c0-0880cabc0dd4@xxxxxxxxx/


Actually, reading the gcc bug[1] a bit more, it says:

However, correctness here depends on the compiler honouring the
ordering of volatile asm statements with respect to other volatile
code, and this patch fixes that.
In your case, there was presumably no volatile code, just a
fpu_state_size_dynamic() call. The compiler thought the xgetbv() was
safe to reorder, and did so.

So, I'm not quite sure how that bug is relevant. It just dealt with
multiple "asm volatile" statements that don't have memory clobbers. We
only have one "asm volatile" in play. Adding the memory clobber should
make that bug totally irrelevant.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82602

Yeah, now this patch looks to have created more confusion than fixing any real problem. Let me drop this on v4.

Thanks,
Chang