RE: [PATCH v2 5/7] EDAC/ghes: Prevent chipset-specific edac from loading after ghes_edac is registered

From: Kani, Toshi
Date: Fri Aug 19 2022 - 14:04:42 EST


On Friday, August 19, 2022 4:35 AM, Justin He wrote:
> > > @@ -1382,6 +1395,7 @@ static int ghes_probe(struct platform_device
> > > *ghes_dev)
> > > platform_set_drvdata(ghes_dev, ghes);
> > >
> > > ghes->dev = &ghes_dev->dev;
> > > + set_ghes_devs_registered(false);
> >
> > This does not look right to me.
> >
> > The condition of using ghes_edac is (ghes-present) && (ghes-preferred),
> > where:
> > - ghes-present is latched on ghes_probe()
> > - ghes-preferred is true if the platform-check passes.
> >
> > ghes_get_device() introduced in the previous patch works as the
> > ghes-preferred check.
> >
> > We cannot use ghes_edac registration as the ghes-present check in this
> > scheme since it is deferred to module_init().
>
> What is the logic for ghes-present check? In this patch, I assumed it is equal to
> "ghes_edac devices have been registered". Seems it is not correct.

Using (ghes_edac-registered) is a wrong check in this scheme
since ghes_edac registration is deferred. This check is fine in
the current scheme since ghes_edac gets registered before
any other chipset-specific edac drivers.

> But should we consider the case as follows:
> What if sbridge_init () is invoked before ghes_edac_init()? i.e. Should we get
> sb_edac driver selected when ghes_edac is not loaded yet (e.g. on HPE)?

No. The point is that ghes_edac driver needs to be selected,
"regardless of the module ordering", on a system with GHES
present & preferred.

Note that this new scheme leads to the following condition
change:
- Current: (ghes-present) && (ghes-preferred) && (ghes_edac registered)
- New: (ghes-present) && (ghes-preferred)

The option I suggested previously keeps the current condition,
but this new scheme does not for better modularity.

What this means is that if ghes_edac is not enabled (but ghes
is enabled) on a system with GHES present & preferred, no edac
driver gets registered. This change is fine from my (HPE) perspective
and should be fine for other GHES systems. GHES systems have
chipset-specific edac driver in FW. OS-based chipset-specific edac
driver is not necessary and may lead to a conflict of chipset register
ownership.

> > I'd suggest the following changes:
> > - Update ghes_get_device() to check a flag latched on ghes_probe().
>
> Still need your elaborating about the details of the flag. i.e. When is this flag
> latched? When is it unlatched?

This flag is a static variable, say ghes_present, which is set to false initially.
ghes_probe() sets it to true. ghes_edac_preferred() (aka. ghes_get_device)
checks this flag at beginning and returns false if this flag is false. It does not
get unlatched since ACPI GHES table is static.

Toshi