Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size

From: Alexander Shishkin
Date: Tue Jan 24 2023 - 06:59:42 EST


Marc Zyngier <maz@xxxxxxxxxx> writes:

> On Sun, 22 Jan 2023 09:00:04 +0000,
> Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>>
>> On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
>> > A malicious device can change its MSIX table size between the table
>> > ioremap() and subsequent accesses, resulting in a kernel page fault in
>> > pci_write_msg_msix().
>> >
>> > To avoid this, cache the table size observed at the moment of table
>> > ioremap() and use the cached value. This, however, does not help drivers
>> > that peek at the PCIE_MSIX_FLAGS register directly.
>> >
>> > Signed-off-by: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
>> > Reviewed-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
>> > Cc: stable@xxxxxxxxxxxxxxx
>> > ---
>> > drivers/pci/msi/api.c | 7 ++++++-
>> > drivers/pci/msi/msi.c | 2 +-
>> > include/linux/pci.h | 1 +
>> > 3 files changed, 8 insertions(+), 2 deletions(-)
>>
>> I'm not security expert here, but not sure that this protects from anything.
>> 1. Kernel relies on working and not-malicious HW. There are gazillion ways
>> to cause crashes other than changing MSI-X.
>> 2. Device can report large table size, kernel will cache it and
>> malicious device will reduce it back. It is not handled and will cause
>> to kernel crash too.
>>
>
> Indeed, this was my exact reaction reading this patch. This only makes
> sure the same (potentially wrong) value is used at all times. So while
> this results in a consistent use, this doesn't give much guarantee.

It guarantees that the MSIX table is big enough to fit all the vectors,
so it should prevent the page faults from out-of-bounds accesses.

> The only way to deal with this is to actually handle the resulting
> fault, similar to what the kernel does when accessing userspace. Not
> sure how possible this is with something like PCIe.

Do you mean replacing MMIO accesses with exception handling accessors?
That seems like a monumental effort. And then we'd have to figure out
how to handle errors in the __pci_write_msi_msg() path.

Preventing page faults from happening in the first place seems like a
more reasonable solution, or what do you think?

Thanks,
--
Alex