Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

From: David Coe
Date: Wed May 05 2021 - 06:24:45 EST


Hi, once more!

On 04/05/2021 07:52, Suravee Suthikulpanit wrote:
On certain AMD platforms, when the IOMMU performance counter source
(csource) field is zero, power-gating for the counter is enabled, which
prevents write access and returns zero for read access.

This can cause invalid perf results, especially when event multiplexing
is needed (i.e. more events than available counters), since the current
logic keeps track of the previously read counter value and subsequently
re-programs the counter to continue counting the event. With power-gating
enabled, we cannot guarantee successful re-programming of the counter.

Work around this issue by:

1. Modifying the ordering of setting/reading counters and enabling/
disabling csources to only access the counter when the csource
is set to non-zero.

2. Since the AMD IOMMU PMU does not support interrupt mode, the logic
can be simplified to always start counting from zero and to
accumulate the counter value when stopping, without the need
to keep track of the previously read counter value and reprogram
the counter with it.

This has been tested on systems with and without power-gating.
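
For anyone following the thread, the two workaround steps quoted above can be sketched with a small software model. This is only an illustration, not the patch itself: the struct and function names (sim_counter, sim_event_start, sim_event_stop and so on) are hypothetical, the "hardware" is a plain struct rather than the real IOMMU registers, and the assumption that a gated counter drops writes and reads back zero is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one IOMMU counter; NOT the real register layout. */
struct sim_counter {
    uint64_t hw_count; /* value held by the simulated counter */
    uint8_t  csource;  /* counter source; 0 means power-gated */
};

/* Hypothetical per-event software state: the accumulated total. */
struct sim_event {
    uint64_t total;
};

/* Assumption from the description: writes are dropped while gated. */
static void sim_write_count(struct sim_counter *c, uint64_t val)
{
    if (c->csource != 0)
        c->hw_count = val;
}

/* Assumption from the description: reads return zero while gated. */
static uint64_t sim_read_count(const struct sim_counter *c)
{
    return c->csource != 0 ? c->hw_count : 0;
}

/* Simulate n events occurring; a gated counter does not count. */
static void sim_tick(struct sim_counter *c, uint64_t n)
{
    if (c->csource != 0)
        c->hw_count += n;
}

/* Step 1: set a non-zero csource BEFORE touching the counter.
 * Step 2: always start counting from zero. */
static void sim_event_start(struct sim_counter *c, uint8_t csource)
{
    assert(csource != 0);
    c->csource = csource;
    sim_write_count(c, 0);
}

/* Read and accumulate while csource is still non-zero,
 * and only then clear csource (which gates the counter again). */
static void sim_event_stop(struct sim_counter *c, struct sim_event *ev)
{
    ev->total += sim_read_count(c);
    c->csource = 0;
}
```

In this model, starting and stopping the same event twice with 10 and then 7 simulated events in between leaves ev.total at 17, without ever writing a saved value back into the counter. That write-back is the step the description says cannot be relied on once power-gating is in effect.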


Just as a final sanity check, I've loaded the same patched kernel 5.11.0-16 onto an old AMD Athlon FX8350. So far, all seems in order: it loads IOMMUv1 and runs Ubuntu 21.04 without incident!

Much appreciate all your efforts, Suravee, Alex et al. Best regards.

--
David