Re: [PATCH v2] Subject: x86/PAT: Report PAT on CPUs that support PAT without MTRR

From: Juergen Gross
Date: Fri Jul 15 2022 - 00:23:03 EST


On 15.07.22 04:19, Chuck Zmudzinski wrote:
On 7/14/2022 1:40 AM, Juergen Gross wrote:
On 13.07.22 03:36, Chuck Zmudzinski wrote:
The commit 99c13b8c8896d7bcb92753bf
("x86/mm/pat: Don't report PAT on CPUs that don't support it")
incorrectly failed to account for the case in init_cache_modes() when
CPUs do support PAT and falsely reported PAT to be disabled when in
fact PAT is enabled. In some environments, notably in Xen PV domains,
MTRR is disabled but PAT is still enabled, and that is the case
that the aforementioned commit failed to account for.

As an unfortunate consequnce, the pat_enabled() function currently does
not correctly report that PAT is enabled in such environments. The fix
is implemented in init_cache_modes() by setting pat_bp_enabled to true
in init_cache_modes() for the case that commit 99c13b8c8896d7bcb92753bf
("x86/mm/pat: Don't report PAT on CPUs that don't support it") failed
to account for.

This approach arranges for pat_enabled() to return true in the Xen PV
environment without undermining the rest of PAT MSR management logic
that considers PAT to be disabled: Specifically, no writes to the PAT
MSR should occur.

This patch fixes a regression that some users are experiencing with
Linux as a Xen Dom0 driving particular Intel graphics devices by
correctly reporting to the Intel i915 driver that PAT is enabled where
previously it was falsely reporting that PAT is disabled. Some users
are experiencing system hangs in Xen PV Dom0 and all users on Xen PV
Dom0 are experiencing reduced graphics performance because the keying of
the use of WC mappings to pat_enabled() (see arch_can_pci_mmap_wc())
means that in particular graphics frame buffer accesses are quite a bit
less performant than possible without this patch.

Also, with the current code, in the Xen PV environment, PAT will not be
disabled if the administrator sets the "nopat" boot option. Introduce
a new boolean variable, pat_force_disable, to forcibly disable PAT
when the administrator sets the "nopat" option to override the default
behavior of using the PAT configuration that Xen has provided.

For the new boolean to live in .init.data, init_cache_modes() also needs
moving to .init.text (where it could/should have lived already before).

Fixes: 99c13b8c8896d7bcb92753bf ("x86/mm/pat: Don't report PAT on CPUs that don't support it")
Co-developed-by: Jan Beulich <jbeulich@xxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Chuck Zmudzinski <brchuckz@xxxxxxx>
---
v2: *Add force_pat_disabled variable to fix "nopat" on Xen PV (Jan Beulich)
*Add the necessary code to incorporate the "nopat" fix
*void init_cache_modes(void) -> void __init init_cache_modes(void)
*Add Jan Beulich as Co-developer (Jan has not signed off yet)
*Expand the commit message to include relevant parts of the commit
message of Jan Beulich's proposed patch for this problem
*Fix 'else if ... {' placement and indentation
*Remove indication the backport to stable branches is only back to 5.17.y

I think these changes address all the comments on the original patch

I added Jan Beulich as a Co-developer because Juergen Gross asked me to
include Jan's idea for fixing "nopat" that was missing from the first
version of the patch.

The patch has been tested, it works as expected with and without nopat
in the Xen PV Dom0 environment. That is, "nopat" causes the system to
exhibit the effects and problems that lack of PAT support causes.

arch/x86/mm/pat/memtype.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index d5ef64ddd35e..10a37d309d23 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -62,6 +62,7 @@
static bool __read_mostly pat_bp_initialized;
static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
+static bool __initdata pat_force_disabled = !IS_ENABLED(CONFIG_X86_PAT);
static bool __read_mostly pat_bp_enabled;
static bool __read_mostly pat_cm_initialized;
@@ -86,6 +87,7 @@ void pat_disable(const char *msg_reason)
static int __init nopat(char *str)
{
pat_disable("PAT support disabled via boot option.");
+ pat_force_disabled = true;
return 0;
}
early_param("nopat", nopat);
@@ -272,7 +274,7 @@ static void pat_ap_init(u64 pat)
wrmsrl(MSR_IA32_CR_PAT, pat);
}
-void init_cache_modes(void)
+void __init init_cache_modes(void)
{
u64 pat = 0;
@@ -292,7 +294,7 @@ void init_cache_modes(void)
rdmsrl(MSR_IA32_CR_PAT, pat);
}
- if (!pat) {
+ if (!pat || pat_force_disabled) {

Can we just remove this modification and ...

/*
* No PAT. Emulate the PAT table that corresponds to the two
* cache bits, PWT (Write Through) and PCD (Cache Disable).
@@ -313,6 +315,16 @@ void init_cache_modes(void)
*/
pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
+ } else if (!pat_bp_enabled) {

... use

+ } else if (!pat_bp_enabled && !pat_force_disabled) {

here?

This will result in the desired outcome in all cases IMO: If PAT wasn't
disabled via "nopat" and the PAT MSR has a non-zero value (from BIOS or
Hypervisor) and PAT has been disabled implicitly (e.g. due to lack of
MTRR), then PAT will be set to "enabled" again.

With that, you can also completely remove the new Boolean - it
will be a meaningless variable wasting memory. This will also make

No, it is making a difference with "nopat" having been specified.

In the Xen PV case we will have pat_bp_enabled == false due to the
lack of MTRR. We don't want to set it to true if "nopat" has been
specified on the command line, so pat_force_disabled should not be
true when we are setting pat_bp_enabled to true again.

my patch more or less do what Jan's patch does - the "nopat" option
will not cause the situation when the PAT MSR does not match the
software view. So you are basically proposing just going back to
my original patch, after fixing the style problems, of course. That
also would solve the problem of needing Jan's S-o-b. I am inclined,
however, to wait for a maintainer who has power to actually do the
commit, to make a comment. Your R-b to my v2 did not have much clout
with the actual maintainers, as far as I can tell. I am somewhat annoyed
that it was at your suggestion that my v2 ended up confusing the
main issue, the regression, with the red herring of the "nopat"
option.

I'm sorry for that.


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature