Re: [PATCH v9 4/6] ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3

From: Jonathan Cameron
Date: Fri Aug 21 2020 - 12:33:08 EST


On Fri, 21 Aug 2020 09:59:23 -0500
Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:

> On Fri, Aug 21, 2020 at 08:46:22AM -0500, Bjorn Helgaas wrote:
> > On Fri, Aug 21, 2020 at 01:59:01PM +0100, Jonathan Cameron wrote:
> > > On Fri, 21 Aug 2020 07:13:56 -0500
> > > Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > >
> > > > [+cc Keith, author of 3accf7ae37a9 ("acpi/hmat: Parse and report
> > > > heterogeneous memory")]
> > > >
> > > > On Fri, Aug 21, 2020 at 09:42:58AM +0100, Jonathan Cameron wrote:
> > > > > On Thu, 20 Aug 2020 17:21:29 -0500
> > > > > Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > > >
> > > > > > On Wed, Aug 19, 2020 at 10:51:09PM +0800, Jonathan Cameron wrote:
> > > > > > > In ACPI 6.3, the Memory Proximity Domain Attributes Structure
> > > > > > > changed substantially. One of those changes was that the flag
> > > > > > > for "Memory Proximity Domain field is valid" was deprecated.
> > > > > > >
> > > > > > > This was because the field "Proximity Domain for the Memory"
> > > > > > > became a required field and hence having a validity flag makes
> > > > > > > no sense.
> > > > > > >
> > > > > > > So the correct logic is to always assume the field is there.
> > > > > > > Current code assumes it never is.
> > > > > > >
> > > > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > > > > > ---
> > > > > > > drivers/acpi/numa/hmat.c | 2 +-
> > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
> > > > > > > index 2c32cfb72370..07cfe50136e0 100644
> > > > > > > --- a/drivers/acpi/numa/hmat.c
> > > > > > > +++ b/drivers/acpi/numa/hmat.c
> > > > > > > @@ -424,7 +424,7 @@ static int __init hmat_parse_proximity_domain(union acpi_subtable_headers *heade
> > > > > > > pr_info("HMAT: Memory Flags:%04x Processor Domain:%u Memory Domain:%u\n",
> > > > > > > p->flags, p->processor_PD, p->memory_PD);
> > > > > > >
> > > > > > > - if (p->flags & ACPI_HMAT_MEMORY_PD_VALID && hmat_revision == 1) {
> > > > > > > + if ((p->flags & ACPI_HMAT_MEMORY_PD_VALID && hmat_revision == 1) || hmat_revision == 2) {
> > > > > >
> > > > > > I hope/assume the spec is written in such a way that p->memory_PD is
> > > > > > required for any revision > 1? So maybe this should be:
> > > > > >
> > > > > > if ((p->flags & ACPI_HMAT_MEMORY_PD_VALID && hmat_revision == 1) ||
> > > > > > hmat_revision > 1) {
> > > >
> > > > I should have said simply:
> > > >
> > > > if (hmat_revision == 1 && p->flags & ACPI_HMAT_MEMORY_PD_VALID)
> > > >
> > > > We shouldn't even test p->flags for ACPI_HMAT_MEMORY_PD_VALID unless
> > > > we already know it's revision 1.
> > > >
> > > > And unless there was a revision 0 of HMAT, there's no need to look for
> > > > hmat_revision > 1.
> > >
> > > It needs to stay as an "or" statement, as you had the first time.
> > > The field is always valid for hmat_revision > 1, and valid for
> > > hmat_revision == 1 with the flag set. You could express it as
> > >
> > > if ((p->flags & ACPI_HMAT_MEMORY_PD_VALID) || (hmat_revision != 1))
> > >
> > > but that seems more confusing to me.
> >
> > Oh, you're right, sorry! There are two questions here:
> >
> > 1) In what order should we test "p->flags & ACPI_HMAT_MEMORY_PD_VALID"
> > and "hmat_revision == 1"? ACPI_HMAT_MEMORY_PD_VALID is defined
> > only when "hmat_revision == 1", so I think we should test the
> > revision first.
> >
> > When "hmat_revision == 2", ACPI_HMAT_MEMORY_PD_VALID is reserved,
> > so we shouldn't test it, even if we later check the revision and
> > discard the result of the flag test. This is a tiny thing,
> > admittedly, but I think it follows the spec more clearly.
> >
> > 2) Do we need to test hmat_revision for anything other than 1? Yes,
> > you're right, see below.
> >
> > > > > Good point. We have existing protections elsewhere against
> > > > > hmat_revision being anything other than 1 or 2, so we should aim to
> > > > > keep that in only one place.
> > > >
> > > > I think the "Ignoring HMAT: Unknown revision" test in hmat_init(),
> > > > added by 3accf7ae37a9 ("acpi/hmat: Parse and report heterogeneous
> > > > memory"), is a mistake.
> > > >
> > > > And I think hmat_normalize() has a similar mistake in that it tests
> > > > explicitly for hmat_revision == 2 when it should accept 2 AND anything
> > > > later.
> > > >
> > > > We should assume that future spec revisions will be backwards
> > > > compatible. Otherwise we're forced to make kernel changes when we
> > > > otherwise would not have to.
> > >
> > > I disagree with this. There is no rule in ACPI about maintaining
> > > backwards compatibility. The assumption is that the version number
> > > will always be checked. The meaning of fields changed between
> > > version 1 and version 2 so it would be bold to assume that won't
> > > happen in the future!
> >
> > There *is* a rule about maintaining backwards compatibility. ACPI
> > v6.3, sec 5.2.2, says:
> >
> > All versions of the ACPI tables must maintain backward
> > compatibility. To accomplish this, modifications of the tables
> > consist of redefinition of previously reserved fields and values
> > plus appending data to the 1.0 tables. Modifications of the ACPI
> > tables require that the version numbers of the modified tables be
> > incremented.
> >
> > > HMAT is an optional table, so if someone boots up an old kernel
> > > they are probably better off failing to use it at all than
> > > misinterpreting it.
> >
> > An old kernel tests:
> >
> > if (p->flags & ACPI_HMAT_MEMORY_PD_VALID && hmat_revision == 1)
> > target = find_mem_target(p->memory_PD);
> >
> > which is fine on old firmware. On new firmware (hmat_revision == 2),
> > it will ignore p->memory_PD. That is probably a problem, but I think
> > we should check for that at the place where we need a memory_PD and
> > don't find one. That's more general than sanity checking a revision.
> >
> > A new kernel that tests:
> >
> > if ((hmat_revision == 1 && p->flags & ACPI_HMAT_MEMORY_PD_VALID) ||
> > hmat_revision > 1)
> > target = find_mem_target(p->memory_PD);
> >
> > will do the right thing on both old and new firmware.
>
> Actually, I think this part of the spec was done incorrectly.
>
> ACPI v6.3 could have made the p->memory_PD field required without
> changing the definition of ACPI_HMAT_MEMORY_PD_VALID. What value was
> gained by making ACPI_HMAT_MEMORY_PD_VALID a reserved bit in v6.3?
>
> If they had left ACPI_HMAT_MEMORY_PD_VALID alone, the Linux code could
> have been simply this, which would work with old firmware and new
> firmware, and we wouldn't have to touch this at all:
>
> if (p->flags & ACPI_HMAT_MEMORY_PD_VALID)
> target = find_mem_target(p->memory_PD);

I have a slight recollection that this might have been my fault :) Oops.

Jonathan

>
> Bjorn
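
For reference, the revision-first test discussed above can be sketched as a
small standalone helper. This is only an illustration, not the code in
drivers/acpi/numa/hmat.c: the helper name memory_pd_usable() and the macro
SKETCH_HMAT_MEMORY_PD_VALID are hypothetical, and the bit position (bit 1 of
the flags field) is an assumption made for the sketch. It checks the revision
before looking at the flag, so the bit is never consulted on revisions where
it is reserved.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption for this sketch: the "Memory Proximity Domain field is valid"
 * flag is bit 1 of the flags field. The name is hypothetical and stands in
 * for the kernel's ACPI_HMAT_MEMORY_PD_VALID.
 */
#define SKETCH_HMAT_MEMORY_PD_VALID (1u << 1)

/*
 * Return true when the memory proximity domain field may be used.
 *
 * Revision 1: the field is usable only if the validity flag is set.
 * Revision 2 and later: the field is required, and the flag bit is
 * reserved, so it is not examined at all.
 * Any other revision (that is, 0): treated as unusable.
 */
static bool memory_pd_usable(uint8_t hmat_revision, uint16_t flags)
{
	if (hmat_revision == 1)
		return (flags & SKETCH_HMAT_MEMORY_PD_VALID) != 0;

	return hmat_revision > 1;
}
```

A caller would use it along the lines of "if (memory_pd_usable(hmat_revision,
p->flags)) target = find_mem_target(p->memory_PD);". Treating every revision
above 1 like revision 2 follows the backward-compatibility argument quoted
from section 5.2.2; a stricter caller could still reject unknown revisions
before reaching this point.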