Re: Why does glibc use AVX-512?

From: Andrew Cooper
Date: Fri Mar 26 2021 - 16:14:59 EST


On 26/03/2021 19:47, Andy Lutomirski wrote:
> On Fri, Mar 26, 2021 at 12:34 PM Florian Weimer <fw@xxxxxxxxxxxxx> wrote:
>> * Andy Lutomirski:
>>
>>>>> AVX-512 cleared, and programs need to explicitly request enablement.
>>>>> This would allow programs to opt into not saving/restoring across
>>>>> signals or to save/restore in buffers supplied when the feature is
>>>>> enabled.
>>>> Isn't XSAVEOPT already able to handle that?
>>>>
>>> Yes, but we need a place to put the data, and we need to acknowledge
>>> that, with the current save-everything-on-signal model, the amount of
>>> time and memory used is essentially unbounded. This isn't great.
>> The size has to have a known upper bound, but the save amount can be
>> dynamic, right?
>>
>> How was the old lazy FPU initialization support for i386 implemented?
>>
>>>> Assuming you can make XSAVEOPT work for you on the kernel side, my
>>>> instincts tell me that we should have markup for RTM, not for AVX-512.
>>>> This way, we could avoid use of the AVX-512 registers and keep using
>>>> VZEROUPPER, without run-time transaction checks, and deal with other
>>>> idiosyncrasies needed for transaction support that users might
>>>> encounter once this feature sees more use. But the VZEROUPPER vs RTM
>>>> issue is currently stuck in some internal process issue on my end (or
>>>> two, come to think of it), which I hope to untangle next month.
>>> Can you elaborate on the issue?
>> This is the bug:
>>
>> vzeroupper use in AVX2 multiarch string functions cause HTM aborts
>> <https://sourceware.org/bugzilla/show_bug.cgi?id=27457>
>>
>> Unfortunately we have a bug (outside of glibc) that makes me wonder if
>> we can actually roll out RTM transaction checks (or any RTM
>> instruction) on a large scale:
>>
>> x86: Sporadic failures in tst-cpu-features-cpuinfo
>> <https://sourceware.org/bugzilla/show_bug.cgi?id=27398#c3>
> It's worth noting that recent microcode updates have made RTM
> considerably less likely to actually work on many parts. It's
> possible you should just disable it. :(

For a variety of reasons (errata and speculative-execution security
issues), hypervisors can now hide or show the HLE/RTM CPUID bits
independently of whether TSX actually works.

For migration compatibility reasons, you may well find yourself in a VM
which advertises the HLE/RTM bits but unconditionally aborts every
transaction.

Honestly, if I were you, I'd just leave it to the user to explicitly opt
in if they want transactions.
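
As a rough sketch of what such an opt-in could look like (not what glibc
does): gate any RTM use on an explicit user request, and treat the CPUID
bit only as a necessary condition. The environment variable APP_USE_RTM
and the helper names below are made up for illustration; the CPUID
leaf/bit (leaf 7, subleaf 0, EBX bit 11) is the architectural RTM flag.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */
#endif

/* CPUID.(EAX=7,ECX=0):EBX bit 11 is the RTM feature flag. */
#define CPUID7_EBX_RTM (1u << 11)

/* True if CPUID advertises RTM.  This alone does not prove that a
 * transaction can ever commit: a hypervisor may expose the bit while
 * aborting every transaction. */
static bool cpu_advertises_rtm(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;              /* leaf 7 not available */
    return (ebx & CPUID7_EBX_RTM) != 0;
#else
    return false;                  /* not an x86 build */
#endif
}

/* Hypothetical opt-in knob: the user sets APP_USE_RTM=1. */
static bool user_opted_into_rtm(void)
{
    const char *v = getenv("APP_USE_RTM");

    return v != NULL && strcmp(v, "1") == 0;
}

/* Use transactions only if the user asked for them AND the CPU
 * advertises them.  Off by default. */
static bool may_use_rtm(void)
{
    return user_opted_into_rtm() && cpu_advertises_rtm();
}
```

With this shape, the default is no transactions at all, and a user who
knows their environment really does have working TSX can turn it on.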

~Andrew