Re: [PATCH v2] iommu/amd: Set translation valid bit only when IO page tables are in used

From: Suravee Suthikulpanit
Date: Mon May 16 2022 - 08:28:07 EST


Joerg,

On 5/13/22 8:07 PM, Joerg Roedel wrote:
On Mon, May 09, 2022 at 02:48:15AM -0500, Suravee Suthikulpanit wrote:
On AMD systems with SNP enabled, the IOMMU hardware checks the host translation
valid (TV) and guest translation valid (GV) bits in the device
table entry (DTE) before accessing the corresponding page tables.

However, the current IOMMU driver sets the TV bit for all devices
regardless of whether the host page table is in use.
This results in an ILLEGAL_DEV_TABLE_ENTRY event for devices which
do not have the host page table root pointer set up.

Hmm, this sounds weird. In the early AMD IOMMUs it was recommended to set
TV=1 and V=1 and the rest to 0 to block all DMA from a device.

I wonder how this triggers ILLEGAL_DEV_TABLE_ENTRY errors now. It is
(was?) legal to set V=1 TV=1, mode=0 and leave the page-table empty.

This is due to a new restriction (please see the IOMMU spec Rev 3.06-PUB - Apr 2021,
https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf): the use of
DTE[Mode]=0 is not supported on systems that are SNP-enabled (i.e. EFR[SNPSup]=1),
and the IOMMU HW looks at the DTE[TV] bit to determine whether it needs to handle
the v1 page table. When the HW encounters a DTE with TV=1, V=1, Mode=0, it generates
an ILLEGAL_DEV_TABLE_ENTRY event.

Note: I am following up with the HW folks on an updated document covering this
specific detail.
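
To make the condition concrete, here is a minimal sketch of the check as described
above. The helper name dte_is_illegal_on_snp() and its snp_sup parameter are
hypothetical and not part of the driver; the bit positions (V = bit 0, TV = bit 1,
Mode = bits 11:9 of DTE[0]) are assumed to match the driver's definitions. It models
only the TV=1, V=1, Mode=0 case and is not a statement of what the HW checks in full.

```c
#include <stdbool.h>
#include <stdint.h>

/* DTE[0] bit layout (assumed to match the driver's definitions). */
#define DTE_FLAG_V      (1ULL << 0)   /* DTE valid */
#define DTE_FLAG_TV     (1ULL << 1)   /* host translation valid */
#define DTE_MODE_SHIFT  9             /* Mode lives in bits 11:9 */
#define DTE_MODE_MASK   0x7ULL

/*
 * Hypothetical helper: returns true when DTE[0] matches the combination
 * described above (V=1, TV=1, Mode=0) on a system with EFR[SNPSup]=1.
 */
static bool dte_is_illegal_on_snp(uint64_t dte0, bool snp_sup)
{
	uint64_t mode = (dte0 >> DTE_MODE_SHIFT) & DTE_MODE_MASK;
	bool v = (dte0 & DTE_FLAG_V) != 0;
	bool tv = (dte0 & DTE_FLAG_TV) != 0;

	return snp_sup && v && tv && mode == 0;
}
```

With dte0 = 0x3 and snp_sup = true this returns true; the same entry with
snp_sup = false, or with TV cleared, returns false.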

Therefore, we need to modify the IOMMU driver as follows:

- For non-DMA devices (e.g. the IOAPIC devices), the driver needs to
default to DTE[TV]=0. For Linux, this is equivalent to a DTE with
domain ID 0.

- I am still trying to work out the best way to force Linux to not allow
Mode=0 (i.e. iommu=pt mode). Any thoughts?

- Also, it seems that the current IOMMU v2 page table use case, where GVA->GPA=SPA,
will no longer be supported on systems with SNPSup=1. Any thoughts?
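
As a rough illustration of the first item, the sketch below sets TV only when a host
page table is actually in use. build_dte0() and DTE_ROOT_MASK are hypothetical names
for this example, the bit layout is assumed to match the driver's definitions, and
the real change would live in the driver's DTE setup path rather than in a helper
like this.

```c
#include <stdint.h>

/* DTE[0] bit layout (assumed to match the driver's definitions). */
#define DTE_FLAG_V      (1ULL << 0)
#define DTE_FLAG_TV     (1ULL << 1)
#define DTE_FLAG_IR     (1ULL << 61)  /* I/O read permission */
#define DTE_FLAG_IW     (1ULL << 62)  /* I/O write permission */
#define DTE_MODE_SHIFT  9
#define DTE_MODE_MASK   0x7ULL
/* Hypothetical mask for the page table root pointer, bits 51:12. */
#define DTE_ROOT_MASK   0x000ffffffffff000ULL

/*
 * Hypothetical helper: build DTE[0] so that TV (and the fields that only
 * matter with a host page table) is set only when mode != 0, i.e. when a
 * host page table is in use. Otherwise only V is set.
 */
static uint64_t build_dte0(uint64_t root_phys, unsigned int mode)
{
	uint64_t dte0 = DTE_FLAG_V;

	if (mode != 0) {
		dte0 |= root_phys & DTE_ROOT_MASK;
		dte0 |= ((uint64_t)mode & DTE_MODE_MASK) << DTE_MODE_SHIFT;
		dte0 |= DTE_FLAG_TV | DTE_FLAG_IR | DTE_FLAG_IW;
	}

	return dte0;
}
```

For a device without a host page table, build_dte0(0, 0) yields 0x1 (V=1, TV=0)
instead of the 0x3 seen in the log for the IOAPIC device.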

When IW=0 and IR=0, DMA is blocked. From what I remember this is a
valid setting in a DTE.

Correct.

Do you have an example DTE which triggers this error message?

This is specifically from the device representing an IOAPIC.

[ +0.000108] iommu ivhd0: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=c0:00.1 pasid=0x00000 address=0xfffffffdf8140000 flags=0x0008]
[ +0.000011] AMD-Vi: DTE[0]: 0000000000000003
[ +0.000003] AMD-Vi: DTE[1]: 0000000000000000
[ +0.000002] AMD-Vi: DTE[2]: 2008000100258013
[ +0.000001] AMD-Vi: DTE[3]: 0000000000000000
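
For reference, DTE[0] from the dump above can be decoded with a small sketch like
the one below. The struct and function names are hypothetical, and the bit positions
are assumed to match the driver's definitions; only DTE[0] is decoded here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the DTE[0] fields discussed in this thread. */
struct dte0_fields {
	bool v;             /* bit 0: DTE valid */
	bool tv;            /* bit 1: host translation valid */
	unsigned int mode;  /* bits 11:9: paging mode */
	bool ir;            /* bit 61: I/O read permission */
	bool iw;            /* bit 62: I/O write permission */
};

/* Hypothetical helper: split DTE[0] into the fields above. */
static struct dte0_fields decode_dte0(uint64_t dte0)
{
	struct dte0_fields f;

	f.v = (dte0 >> 0) & 1;
	f.tv = (dte0 >> 1) & 1;
	f.mode = (unsigned int)((dte0 >> 9) & 0x7);
	f.ir = (dte0 >> 61) & 1;
	f.iw = (dte0 >> 62) & 1;

	return f;
}
```

For DTE[0] = 0x0000000000000003 this gives V=1, TV=1, Mode=0, IR=0, IW=0, which is
the combination discussed above.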

Best Regards,
Suravee