Re: [PATCH v2 1/5] perf/x86/intel/lbr: use setup_clear_cpu_cap instead of clear_cpu_cap

From: Paolo Bonzini
Date: Wed Nov 02 2022 - 09:41:28 EST


On 10/20/22 10:59, Borislav Petkov wrote:
> On Wed, Sep 28, 2022 at 01:49:34PM +0300, Maxim Levitsky wrote:
>> Patch 5 is the main fix - it makes the kernel to be tolerant to a
>> broken CPUID config (coming hopefully from hypervisor), where you have
>> a feature (AVX2 in my case) but not a feature on which this feature
>> depends (AVX).
>
> I really really don't like it when people are fixing the wrong thing.
>
> Why does the kernel need to get fixed when something else can't get its
> CPUID dependencies straight? I don't even want to know why something
> would set AVX2 without AVX?!?!

Users end up with this configuration because they just "disable AVX" (e.g. with QEMU's -cpu host,-avx), and that removes only the AVX bit. Userspace didn't bother to implement the whole set of CPUID bit dependencies for AVX because:

1) Intel is adding AVX features every other week and probably half the time people would forget to add the dependency

2) anyway you absolutely need to check XCR0 before using AVX, which in the kernel is done using cpu_has_xfeatures(XFEATURE_MASK_YMM), and userspace *does* remove the XSAVE state from CPUID leaf 0Dh if you remove AVX.

(2) in particular holds even on bare metal. The kernel bug here is that X86_FEATURE_AVX only tells you if the instructions are _present_, not if they are _usable_. Indeed, the XCR0 check is present in all the other files in arch/x86/crypto, either instead of or in addition to boot_cpu_has(X86_FEATURE_AVX).
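
To make the present/usable distinction concrete, here is a minimal userspace sketch (not kernel code). It assumes the caller has already read CPUID.01H:ECX and XCR0; the helper names avx_present() and avx_usable() are made up for illustration. The bit positions are the architectural ones (OSXSAVE is ECX bit 27, AVX is ECX bit 28, XCR0 bit 1 is SSE state, XCR0 bit 2 is YMM state).

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX feature bits. */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* AVX instructions exist */

/* XCR0 state-component bits. */
#define XCR0_SSE (1ull << 1)
#define XCR0_YMM (1ull << 2)

/* Hypothetical helper: "present" is only the CPUID bit. */
static bool avx_present(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_AVX) != 0;
}

/*
 * Hypothetical helper: "usable" additionally requires that XCR0 can be
 * read (OSXSAVE) and that both SSE and YMM state are enabled in it.
 */
static bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    const uint64_t need = XCR0_SSE | XCR0_YMM;

    if (!avx_present(cpuid1_ecx))
        return false;
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return false;
    return (xcr0 & need) == need;
}
```

With the AVX and OSXSAVE bits set but YMM state missing from XCR0, avx_present() returns true while avx_usable() returns false. That is the case a check on the CPUID feature bit alone does not catch.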

Maxim sent a patch about a year ago to do this in aesni-intel_glue.c, but Dave told him to fix the dependencies instead (https://lore.kernel.org/all/20211103124614.499580-1-mlevitsk@xxxxxxxxxx/). What do you think of applying that patch instead?

Thanks,

Paolo