Re: [PATCH 0/2] PCI/AER: Consistently use _OSC to determine who owns AER

From: Sinan Kaya
Date: Mon Nov 19 2018 - 14:32:13 EST


The UEFI specification for the HEST table also claims that HEST should be the
ultimate authority on whether PCI firmware-first is enabled or disabled.

IIRC, EFI absorbed ACPI before FFS was a thing. Could you point me to the UEFI chapter that says HEST is authoritative?
(not being a smartie, just that my free software PDF readers can't search within a file that large)


ACPI 6.2:

18.3.2.4 PCI Express Root Port AER Structure

Flags:

Bit [0] - FIRMWARE_FIRST: If set, this bit indicates to the OSPM that system firmware will handle errors from this source first.
Bit [1] - GLOBAL: If set, indicates that the settings contained in this structure apply globally to all PCI Express Devices.
All other bits must be set to zero.

It doesn't say shall, may or might. It says will.
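
For reference, the two flag bits quoted above could be tested like this. This is only a sketch in plain C, not the kernel's code: the macro and helper names are made up for illustration, and the one-byte width of the Flags field is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the ACPI 6.2 text quoted above. */
#define HEST_FIRMWARE_FIRST (1u << 0)
#define HEST_GLOBAL         (1u << 1)
/* "All other bits must be set to zero" (assuming a one-byte Flags field). */
#define HEST_RESERVED_MASK  (0xffu & ~(HEST_FIRMWARE_FIRST | HEST_GLOBAL))

static_assert((HEST_FIRMWARE_FIRST & HEST_GLOBAL) == 0,
	      "flag bits must not overlap");

/* Hypothetical helper: firmware handles errors from this source first. */
static inline bool hest_firmware_first(uint8_t flags)
{
	return (flags & HEST_FIRMWARE_FIRST) != 0;
}

/* Hypothetical helper: settings apply to all PCI Express devices. */
static inline bool hest_global(uint8_t flags)
{
	return (flags & HEST_GLOBAL) != 0;
}

/* Hypothetical helper: reject entries with reserved bits set. */
static inline bool hest_flags_valid(uint8_t flags)
{
	return (flags & HEST_RESERVED_MASK) == 0;
}
```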


I think somebody needs to fix these. I saw an email from Harb Abdulhamid
sent to the ASWG this morning.

That's why I suggested circulating this idea in UEFI forums first.
Let's see what everybody thinks. We can go from there.

However you look at it, we have a glaring inconsistency in how we handle AER control in Linux. I'm surprised we didn't see huge issues because of mixing HEST/_OSC.

What systems rely on the HEST definition as opposed to _OSC? It doesn't make sense to me that you could have a system with mixed FFS and native AER on the same root port. The granularity of HEST shouldn't matter here because of how AER works.

I think it depends on your PCI topology.

For topologies with multiple PCI root complexes, I can see this being used as
a per-root-complex flag to indicate which root complex needs firmware first
and which one doesn't.


I'd like to see how exactly we break one of those elusive systems with _OSC. I suspect _OSC and HEST end up having the same information, and that's why we didn't see any real-life issue with mixing the approaches.

I'm already aware of two systems that rely on the HEST table to tell the OS
that firmware first is enabled. Neither system changes its _OSC bits when it
does so, on the assumption that the HEST table has priority over _OSC for
firmware first.

If we add this patch, the OS will try to claim the AER address space while
firmware wants exclusive access.
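
The failure case on those two systems can be sketched roughly as below. This is an illustration only, not the patch and not kernel code: the function names are hypothetical, and the bit positions (bit 3 of the _OSC control field for AER, bit 0 of the HEST flags for firmware first) are stated as my assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define OSC_AER_CONTROL     (1u << 3)  /* _OSC: OS granted native AER control */
#define HEST_FIRMWARE_FIRST (1u << 0)  /* HEST: firmware handles errors first */

/* Ownership as seen when only _OSC is consulted. */
static inline bool os_owns_aer_per_osc(uint32_t osc_control)
{
	return (osc_control & OSC_AER_CONTROL) != 0;
}

/* Ownership as seen when only the HEST flag is consulted. */
static inline bool os_owns_aer_per_hest(uint8_t hest_flags)
{
	return (hest_flags & HEST_FIRMWARE_FIRST) == 0;
}

/* True when _OSC grants AER to the OS but HEST still says firmware first. */
static inline bool aer_ownership_conflict(uint32_t osc_control,
					  uint8_t hest_flags)
{
	return os_owns_aer_per_osc(osc_control) &&
	       !os_owns_aer_per_hest(hest_flags);
}
```

On a system where both tables agree, the conflict check is false either way; it only fires in the case described above, where firmware sets FIRMWARE_FIRST in HEST but leaves the _OSC AER bit granted.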

As I said in my previous email, the right place to talk about this is the UEFI
forum.


Alex


P.S.
(SARCASM ALERT) Isn't UEFI a pile of stuff that didn't stick to the wall?

P.P.S.
I remember someone asking why we don't disable FFS in the BIOS. I think it would be next to impossible to get certain platform vendors to relinquish FFS control, unless the specs required things to be that way -- and had a "standard" way to do so.

Then getting the specs to change in that direction is also a battle.