Re: [PATCH v3] i2c: i801: Safely share SMBus with BIOS/ACPI

From: Jean Delvare
Date: Sun Dec 18 2022 - 18:17:23 EST


On Sat, 17 Dec 2022 22:28:25 +0900, Hector Martin wrote:
> On 15/12/2022 23.26, Jean Delvare wrote:
> > Question: why do you check that the first action of ACPI AML is to
> > acquire the hardware lock *only* if that action was attempted while the
> > Linux side was performing a transfer? That event is quite unlikely to
> > happen. Can't we perform that test unconditionally the very first time
> > i801_acpi_io_handler() is called? Unless the i2c-i801 driver gets> initialized in the middle of an ACPI-side SMBus transfer [...]
>
> I wonder if there's some way to close this race? This doesn't sound all
> that unlikely (consider: ACPI backlight is also a module, device init
> happens in parallel, so we could well end up probing i2c-i801 in the
> middle of an ACPI SMBus transfer more often than you'd expect at boot time).
>
> How about this: instead of checking for the lock access on the *first*
> call to i801_acpi_io_handler, we add an acpi_must_lock flag. This is
> initially false, but it is set on completion of every Linux-side i2c
> transfer, and cleared once we see ACPI take the lock properly. The ACPI
> handler then checks it and expects a lock access if it is true.
>
> So as soon as Linux does an i2c transfer, we expect ACPI to be taking
> the lock next time it touches the hardware, and we know to bail on the
> Linux side in the future if it does not.
>
> * If the i2c driver probes in the middle of a well-behaved ACPI SMBus
> transfer, nothing bad happens, since if we try to do a Linux-side
> transfer it will block on the lock until ACPI is done. Further ACPI
> SMBus transfers after a Linux transfer will pass the locking check.
> * If the driver probes in the middle of a misbehaved ACPI transfer but
> is otherwise idle, nothing happens until the first Linux transfer, then
> the next ACPI access after that will have us bail and disable driver access.
> * If we probe *and* try to make a transfer all in the middle of a
> misbehaved ACPI transfer, then we are going to step on its toes and
> break it, but at least we will notice as soon as the Linux side is done
> (or possibly failed due to the collision) and ACPI tries to touch the
> controller again, so we will get out of its way in the future and
> there's at least a chance it will recover for future accesses.

I see what you want to do and I think it should work.

> Further closing that last edge case to avoid ever conflicting with
> broken ACPI implementations would require some sort of mechanism to know
> whether ACPI AML started running an i2c transfer method before the
> i2c-i801 driver loaded, which might be too intrusive a change to be wrth
> it for such a corner case. Though maybe there's an easy way? If there's
> something like a global AML lock we could just have the probe sequence be:
>
> 1. Register the ACPI IO handler
> 2. Take the AML lock
> 3. Set acpi_must_lock = true
> 4. Release the AML lock
> 5. Finally register the i2c controller
>
> That makes sure we serialize on any ongoing ACPI shenanigans at probe
> time and allows us to truly check that the first access from ACPI after
> that is a lock, before Linux has a chance to do anything itself. But I
> don't know off the top of my head whether there is such a lock.

I don't think there is, but that would be a question for the ACPI list.
Anyway, for it to be useful, it would have a to be a high-level lock
(taken before starting an ACPI-side SMBus transfer, released after the
ACPI-side SMBus transfer has been fully processes). If it is taken for
every I/O, or even if it isn't held while waiting for the transfer to
complete, it won't solve the problem. And I suspect that if such a
high-level lock existed, we would have been using it in the first place
to guarantee exclusive access to the SMBus controller.

The most important thing is to get exclusive access to work properly
for well-behaved ACPI implementations. I know that the Linux driver
hasn't been a good citizen with the hardware lock until you partially
fixed that 1.5 year ago, but I hope that other operating systems did
that earlier, which would have encouraged well-behaved ACPI
implementations. So hopefully misbehaving ACPI implementations aren't
many on recent systems.

--
Jean Delvare
SUSE L3 Support