Re: [PATCH v5] ACPI: Move sdei_init and ghes_init ahead to handle platform errors earlier

From: Bjorn Helgaas
Date: Wed Jan 19 2022 - 15:43:07 EST


On Wed, Jan 19, 2022 at 02:40:11PM +0800, Shuai Xue wrote:
> [+to Rafael, question about HEST/GHES/SDEI init]
>
> Hi, Bjorn,
>
> Thank you for your comments and quick reply.
>
> 在 2022/1/19 AM6:49, Bjorn Helgaas 写道:
> > On Sun, Jan 16, 2022 at 04:43:10PM +0800, Shuai Xue wrote:
> >> On an ACPI system, ACPI is initialised very early from a
> >> subsys_initcall(), while SDEI is not ready until a
> >> subsys_initcall_sync(). This patch is to reduce the time before GHES
> >> initialization.
> >>
> >> The SDEI driver provides functions (e.g. apei_sdei_register_ghes(),
> >> apei_sdei_unregister_ghes()) to register or unregister event callback
> >> for dispatcher in firmware. When the GHES driver probing, it registers
> >> the corresponding callback according to the notification type specified
> >> by GHES. If the GHES notification type is SDEI, the GHES driver will
> >> call apei_sdei_register_ghes() to register event call.
> >>
> >> When the firmware emits an event, it migrates the handling of the event
> >> into the kernel at the registered entry-point __sdei_asm_handler. And
> >> finally, the kernel will call the registered event callback and return
> >> status_code to indicate the status of event handling. SDEI_EV_FAILED
> >> indicates that the kernel failed to handle the event.
> >>
> >> Consequently, when an error occurs during kernel booting, the kernel is
> >> unable to handle and report errors until the GHES driver is initialized
> >> by device_initcall(), in which the event callback is registered. For
> >> example, when the kernel booting, the console logs many times from
> >> firmware before GHES drivers init in our platform:
> >>
> >> Trip in MM PCIe RAS handle(Intr:910)
> >> Clean PE[1.1.1] ERR_STS:0x4000100 -> 0 INT_STS:F0000000
> >> Find RP(98:1.0)
> >> --Walk dev(98:1.0) CE:0 UCE:4000
> >> ...
> >> ERROR: sdei_dispatch_event(32a) ret:-1
> >> --handler(910) end
> >
> > If I understand correctly, the firmware noticed an error, tried to
> > report it to the kernel, and is complaining because the kernel isn't
> > ready to handle it yet. And the reason for this patch is to reduce
> > these complaints from the firmware.
>
> My thoughts exactly :)
>
> > That doesn't seem like a very good reason for this patch. There is
> > *always* a window before the kernel is ready to handle events from the
> > firmware.
>
> Yes, there is always a window. But if we could do better in kernel that
> reduces the window by 90% (from 33 seconds to 3 second), why not?
>
> > Why is the firmware noticing these errors in the first place? If
> > you're seeing these complaints regularly, my guess is that either you
> > have some terrible hardware or (more likely) the firmware isn't
> > clearing some expected error condition correctly. For example, maybe
> > the Unsupported Request errors that happen while enumerating PCIe
> > devices are being reported.
> >
> > If you register the callback function, the kernel will now start
> > seeing these error reports. What happens then? Does the kernel log
> > the errors somewhere? Is that better than the current situation where
> > the firmware logs them?
>
> Yep, it is a hardware issue. The firmware only logs in console
> (ttyAMA0) and we can not see it in kernel side. After the kernel
> starts seeing these error reports, we could see EDAC/ghes and
> efi/cper detailed logs in dmesg. We did not notice the problem until
> we check the console log, which inspired us to reduce the window
> when kernel startup, so that we can see the message clearly and
> properly. I think the intuition is to check the log of dmesg, not
> the console.

> > However, I DO think that:
> >
> > - Removing acpi_hest_init() from acpi_pci_root_init(), and
> >
> > - Converting ghes_init() and sdei_init() from initcalls to explicit
> > calls
> >
> > are very good reasons to do something like this patch because HEST is
> > not PCI-specific, and IMO, explicit calls are better than initcalls
> > because initcall ordering is implicit and not well-defined within a
> > level.
>
> Haha, if the above reasons still don't convince you, I would like to
> accept yours :) Should we do it in one patch or separate it into two
> patches?

IMO, this can be done in one patch, but this would probably go via
Rafael.

> >> -static int __init ghes_init(void)
> >> +void __init ghes_init(void)
> >> {
> >> int rc;
> >>
> >> if (acpi_disabled)
> >> - return -ENODEV;
> >> + return;
> >>
> >> switch (hest_disable) {
> >> case HEST_NOT_FOUND:
> >> - return -ENODEV;
> >> + pr_info(GHES_PFX "HEST is not found!\n");
> >
> > I don't know whether this "HEST is not found" message is
> > worthwhile or not. I don't think lack of an HEST is an error, and
> > users may be alarmed. But this is an ACPI thing, so up to you and
> > Rafael.
>
> If we explicit call ghes_init(), we can't tell if ghes is
> initialized successfully based on the return value of initcall. So I
> add a info message.

When ghes_init() is an initcall and you return -ENODEV for the
HEST_NOT_FOUND case, I don't think we log any message about that, do
we? do_one_initcall() will capture and return the -ENODEV, but the
caller (do_initcall_level()) just ignores it.

> >> @@ -1495,8 +1494,7 @@ static int __init ghes_init(void)
> >> else
> >> pr_info(GHES_PFX "Failed to enable APEI firmware first mode.\n");
> >>
> >> - return 0;
> >> + return;
> >> err:
> >> - return rc;
> >> + ghes_disable = 1;
> >
> > Why do you set ghes_disable here? As far as I can tell, we will never
> > look at it again. The places we do look at it are:
> >
> > - ghes_init(): earlier in this function, so we've already done that,
> >
> > - acpi_hest_init(): we've already called that, too, and
> >
> > - acpi_bus_osc_negotiate_platform_control(): called from
> > acpi_bus_init(), which we've already called.
>
> I add it for future potential usage. Thank you for pointing it out.
> If you think it is not necessary, I will delete it in next version.

I think it is not necessary to save information that will never be
used. If you need it in the future, you can add it and the reason
will be obvious.

Bjorn