Re: [RESEND PATCH] perf/x86/intel: Fix unchecked MSR access error for Alder Lake N
From: Andi Kleen
Date: Mon Aug 22 2022 - 10:31:51 EST
On 8/22/2022 3:48 PM, Peter Zijlstra wrote:
> On Mon, Aug 22, 2022 at 09:28:31AM -0400, Liang, Kan wrote:
>> On 2022-08-19 10:38 a.m., Peter Zijlstra wrote:
>>> On Thu, Aug 18, 2022 at 11:15:30AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
>>>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>> For some Alder Lake N machine, the below unchecked MSR access error may be
>>>> triggered.
>>>> [ 0.088017] rcu: Hierarchical SRCU implementation.
>>>> [ 0.088017] unchecked MSR access error: WRMSR to 0x38f (tried to write
>>>> 0x0001000f0000003f) at rIP: 0xffffffffb5684de8 (native_write_msr+0x8/0x30)
>>>> [ 0.088017] Call Trace:
>>>> [ 0.088017] <TASK>
>>>> [ 0.088017] __intel_pmu_enable_all.constprop.46+0x4a/0xa0
>>> FWIW, I seem to get the same error when booting KVM on my ADL. I'm
>>> fairly sure the whole CPUID vs vCPU thing is a trainwreck.
>> We will enhance the CPUID to address the issues. Hopefully, we can have
>> them supported in the next generation.
> How!? A vCPU can readily migrate between a big and small CPU. There is
> no way the guest can sanely program the (v)MSRs and expect it to work.
In principle this can be fixed by affinitizing the vCPUs to their
respective core type and reporting the right type to the guest, and I
thought qemu was supposed to support this. But it would certainly need a
much more complex command line.
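As a rough illustration of the affinitizing step, the sketch below groups host CPUs by core type so that vCPU threads of one type can be pinned to one group. It assumes the per-CPU core type has already been read on each host CPU from CPUID leaf 0x1A (EAX bits 31:24, 0x20 for Atom and 0x40 for Core). The helper names and the 64-CPU limit are made up for this example and are not taken from qemu or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Core type values reported in CPUID.1AH:EAX[31:24] on hybrid parts. */
enum {
    CORE_TYPE_ATOM = 0x20, /* small (E) core */
    CORE_TYPE_CORE = 0x40  /* big (P) core */
};

/* Extract the core type field from a raw CPUID.1AH:EAX value. */
static uint8_t core_type_from_leaf1a(uint32_t eax)
{
    return (uint8_t)(eax >> 24);
}

/*
 * Hypothetical helper: build a bitmask of the host CPUs whose core type
 * equals `want`. types[i] is the core type of host CPU i. Only the first
 * 64 CPUs are considered, to keep the sketch to a single word.
 */
static uint64_t cpu_mask_for_type(const uint8_t *types, size_t ncpus,
                                  uint8_t want)
{
    uint64_t mask = 0;

    assert(types != NULL || ncpus == 0);
    for (size_t cpu = 0; cpu < ncpus && cpu < 64; cpu++) {
        if (types[cpu] == want)
            mask |= UINT64_C(1) << cpu;
    }
    return mask;
}
```

Each resulting mask could then be turned into a cpu_set_t and applied to the matching vCPU threads with sched_setaffinity(2), or passed to an external pinning tool. How the matching core type gets reported to the guest is a separate step that this sketch does not cover.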
If you don't do this, architectural events should work, but yes, any
non-architectural events will not count correctly.
I guess one way to detect this situation would be to check whether the
CPUID says Alder Lake but reports no hybrid support. If so, it's likely
a situation like this, and it could be special-cased in the perf tools
so that they only show a limited event list.
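A minimal sketch of that check, written as pure functions over raw CPUID register values so it does not depend on how CPUID is read. It assumes the Alder Lake family 6 model numbers 0x97, 0x9A and 0xBE (Alder Lake, Alder Lake L, Alder Lake N) and the hybrid flag in CPUID.(EAX=7,ECX=0):EDX bit 15. The function names are hypothetical and this is not what the perf tools currently do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Display family from CPUID.1:EAX (extended family is added when base is 0xF). */
static unsigned int cpuid_family(uint32_t eax1)
{
    unsigned int fam = (eax1 >> 8) & 0xf;

    if (fam == 0xf)
        fam += (eax1 >> 20) & 0xff;
    return fam;
}

/* Display model from CPUID.1:EAX (extended model applies to family 6 and 0xF). */
static unsigned int cpuid_model(uint32_t eax1)
{
    unsigned int fam = (eax1 >> 8) & 0xf;
    unsigned int model = (eax1 >> 4) & 0xf;

    if (fam == 0x6 || fam == 0xf)
        model |= ((eax1 >> 16) & 0xf) << 4;
    return model;
}

/* Assumed Alder Lake model list: 0x97 (ADL), 0x9A (ADL-L), 0xBE (ADL-N). */
static bool is_alder_lake(uint32_t eax1)
{
    if (cpuid_family(eax1) != 6)
        return false;
    switch (cpuid_model(eax1)) {
    case 0x97:
    case 0x9a:
    case 0xbe:
        return true;
    default:
        return false;
    }
}

/* CPUID.(EAX=7,ECX=0):EDX bit 15 is the hybrid flag. */
static bool cpuid_reports_hybrid(uint32_t leaf7_edx)
{
    return (leaf7_edx & (UINT32_C(1) << 15)) != 0;
}

/* Hypothetical heuristic: Alder Lake signature, but no hybrid flag. */
static bool looks_like_non_hybrid_adl_guest(uint32_t eax1, uint32_t leaf7_edx)
{
    return is_alder_lake(eax1) && !cpuid_reports_hybrid(leaf7_edx);
}
```

With GCC or Clang the inputs can be obtained through `__get_cpuid_count()` from `<cpuid.h>`. A tool seeing the heuristic return true could fall back to architectural events only.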
-Andi