Re: [PATCH v2] Subject: x86/PAT: Report PAT on CPUs that support PAT without MTRR

From: Chuck Zmudzinski
Date: Wed Jul 13 2022 - 15:07:57 EST


On 7/13/2022 9:45 AM, Juergen Gross wrote:
> >> On 7/13/2022 6:36 AM, Chuck Zmudzinski wrote:
> >> And in addition, if we are going to backport this patch to
> >> all current stable branches, we better have a really, really,
> >> good reason for changing the behavior of "nopat" on Xen.
> >>
> >> Does such a reason exist?
> >
> > Well, the simple reason is: It doesn't work the same way under Xen
> > and non-Xen (in turn because, before my patch or whatever equivalent
> > work, things don't work properly anyway, PAT-wise). Yet it definitely
> > ought to behave the same everywhere, imo.
>
> There is Documentation/x86/pat.rst which rather clearly states, how
> "nopat" is meant to work. It should not change the contents of the
> PAT MSR and keep it just as it was set at boot time (the doc talks
> about the "BIOS" setting of the MSR, and I guess in the Xen case
> the hypervisor is kind of acting as the BIOS).
>
> The question is, whether "nopat" needs to be translated to
> pat_enabled() returning "false".

When I started re-factoring the logic surrounding pat_enabled(),
I noticed that the current code sets pat_disabled to true for five
different reasons. IMO pat_disabled ought to be a redundant variable
that is always equal to !pat_enabled() and !pat_bp_enabled, but
unfortunately that is not the case. The five reasons for setting
pat_disabled to true are given as message strings:

1. "MTRRs disabled, skipping PAT initialization too."
2. "PAT support disabled because CONFIG_MTRR is disabled in the kernel."
3. "PAT support disabled via boot option."
4. "PAT not supported by the CPU."
5. "PAT support disabled by the firmware."

The only effect of setting pat_disabled to true is to inhibit
the execution of pat_init(). It does not inhibit the execution
of init_cache_modes(), which exists to handle all the cases in
which pat_init() was skipped. The Xen case is one of those
cases, so under Xen pat_disabled will be true. Yet the only way
to fix the current regression and the five-year-old commit is to
set pat_bp_enabled to true so that pat_enabled() returns true.
So to fix the five-year-old commit, we must have

pat_enabled() == pat_disabled

that is, both are true. Something is wrong with this logic, which is
why I wanted to precede my fix with some re-factoring that changes
some variable and function names and modifies some comments before
trying to fix the five-year-old commit, so that we never have a
situation in which pat_enabled() == pat_disabled.
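
To make the inconsistency concrete, here is a minimal userspace sketch,
not kernel code. Only pat_disabled, pat_bp_enabled and pat_enabled() come
from the discussion above; struct pat_model and the model_* helpers are
hypothetical names made up for illustration, and the sketch assumes that
pat_enabled() simply reports pat_bp_enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: field names follow this mail, nothing here is kernel code. */
struct pat_model {
	bool pat_disabled;	/* set for any of the five reasons listed above */
	bool pat_bp_enabled;	/* value that pat_enabled() is assumed to report */
};

/* Models pat_enabled(): assumed to simply return pat_bp_enabled. */
static bool model_pat_enabled(const struct pat_model *m)
{
	return m->pat_bp_enabled;
}

/* The invariant the re-factoring aims for: pat_disabled == !pat_enabled(). */
static bool model_consistent(const struct pat_model *m)
{
	return m->pat_disabled == !model_pat_enabled(m);
}

/* Hypothetical walk through three states discussed in this thread. */
static void model_self_check(void)
{
	struct pat_model native  = { .pat_disabled = false, .pat_bp_enabled = true  };
	struct pat_model nopat   = { .pat_disabled = true,  .pat_bp_enabled = false };
	struct pat_model xen_fix = { .pat_disabled = true,  .pat_bp_enabled = true  };

	assert(model_consistent(&native));
	assert(model_consistent(&nopat));
	/* Xen with pat_bp_enabled forced to true: both flags are true. */
	assert(!model_consistent(&xen_fix));
	assert(model_pat_enabled(&xen_fix) == xen_fix.pat_disabled);
}
```

Calling model_self_check() passes all four assertions: the native and
"nopat" states satisfy the invariant, while the Xen state with
pat_bp_enabled forced to true does not.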

Chuck