Re: [PATCH v2 2/2] Intel-IOMMU, intr-remap: source-id checking

From: Eric W. Biederman
Date: Tue May 19 2009 - 15:28:54 EST


Ingo Molnar <mingo@xxxxxxx> writes:

> * Weidong Han <weidong.han@xxxxxxxxx> wrote:
>
>> To support domain-isolation usages, the platform hardware must be
>> capable of uniquely identifying the requestor (source-id) for each
>> interrupt message. Without source-id checking for interrupt
>> remapping, a rogue guest/VM with assigned devices can launch
>> interrupt attacks to bring down another guest/VM or the VMM itself.
>>
>> This patch adds source-id checking for interrupt remapping, and
>> then really isolates interrupts for guests/VMs with assigned
>> devices.
>>
>> Because the PCI subsystem is not yet initialized when IOAPIC
>> entries are set up, use read_pci_config_byte to access PCI config
>> space directly.
>>
>> Signed-off-by: Weidong Han <weidong.han@xxxxxxxxx>
>> ---
>> arch/x86/kernel/apic/io_apic.c | 6 +++
>> drivers/pci/intr_remapping.c | 90 ++++++++++++++++++++++++++++++++++++++-
>> drivers/pci/intr_remapping.h | 2 +
>> include/linux/dmar.h | 11 +++++
>> 4 files changed, 106 insertions(+), 3 deletions(-)
>
> Code structure looks nice now. (and i suspect you have tested this on
> real and relevant hardware?) I've Cc:-ed Eric too ... does this
> direction look good to you too Eric?

Being a major nitpicker, I have to point out that the code is not
structured to support other iommus, and I think AMD has one that can
do this as well.

The early pci reading of the bus is just wrong. What happens if the
pci layer decided to renumber things? It looks like we have a real
dependency on pci there and are avoiding sorting it out with this.

Hmm. But that is what we use in setup_ioapic_sid....
I expect the right solution is to delay enabling ioapic entries
until drivers enable them. That could also reduce screaming
irqs during bootup in the kdump case.

set_msi_sid looks wrong. The comments are unhelpful. irte->svt should
get an enum value or a define (removing the repeated explanations of
the magic value) and then we could have room to explain why we
are doing what we are doing.
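
For illustration, named values for irte->svt could look roughly like
the sketch below. The names (SVT_NO_VERIFY and friends, SQ_ALL_16), the
struct irte_sid stand-in and the helper set_irte_sid are hypothetical
and not taken from the patch; the numeric encodings are assumed from
the VT-d interrupt remapping table entry format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; values are assumed from the VT-d IRTE format. */
enum irte_svt {
	SVT_NO_VERIFY     = 0x0, /* no source-id verification */
	SVT_VERIFY_SID_SQ = 0x1, /* compare requester id using SID and SQ */
	SVT_VERIFY_BUS    = 0x2, /* check requester bus against a bus range */
};

/* Only the "compare all 16 bits" qualifier is named here; the
 * remaining SQ encodings are left out of this sketch. */
#define SQ_ALL_16 0x0

/* Stand-in for the source-id fields of an IRTE, not the real layout. */
struct irte_sid {
	uint16_t sid;
	uint8_t sq;
	uint8_t svt;
};

/* One place that fills in the fields, so callers pass names instead
 * of repeating magic numbers with a comment each time. */
static void set_irte_sid(struct irte_sid *irte, enum irte_svt svt,
			 uint8_t sq, uint16_t sid)
{
	assert(irte != NULL);
	irte->svt = (uint8_t)svt;
	irte->sq = sq;
	irte->sid = sid;
}
```

With names like these, the comments at each call site can explain why
a given verification mode was picked rather than what the number means.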

Not finding an upstream pcie_bridge and then concluding we are a pcie
device seems bogus.

Why, if we do have an upstream pcie bridge, do we only want to do a bus
range verification instead of checking for the exact bus:devfn?
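
To make the difference concrete, here is a small sketch of the two
checks. It assumes the usual requester-id layout (bus in bits 15:8,
devfn in bits 7:0) and assumes that in bus-range mode the SID field
holds the start bus in its upper byte and the end bus in its lower
byte. The helper names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Requester id: bus number in bits 15:8, devfn in bits 7:0. */
static uint16_t make_rid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((unsigned int)bus << 8) | devfn);
}

/* Exact check: only a single bus:devfn is accepted. */
static bool rid_matches_exact(uint16_t rid, uint16_t sid)
{
	return rid == sid;
}

/* Bus-range check: any devfn on any bus in [start, end] is accepted. */
static bool rid_matches_bus_range(uint16_t rid, uint16_t sid)
{
	uint8_t bus = (uint8_t)(rid >> 8);
	uint8_t start = (uint8_t)(sid >> 8);
	uint8_t end = (uint8_t)(sid & 0xff);

	return bus >= start && bus <= end;
}
```

The range check lets through every function on every bus in the range,
so it is strictly weaker than the exact match.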

The legacy PCI case seems even stranger.

....

The table of apic information by apic_id also seems wrong. Don't
we have chip_data or something similar that we can get from the irq
and that could point to it?

Eric