Re: [PATCH 1/4] iommu/amd: Introduce Protection-domain flag VFIO

From: Kalra, Ashish
Date: Fri Jan 20 2023 - 10:12:35 EST


On 1/19/2023 11:44 AM, Jason Gunthorpe wrote:
On Thu, Jan 19, 2023 at 02:54:43AM -0600, Kalra, Ashish wrote:
Hello Jason,

On 1/13/2023 9:33 AM, Jason Gunthorpe wrote:
On Tue, Jan 10, 2023 at 08:31:34AM -0600, Suravee Suthikulpanit wrote:
Currently, to detect if a domain is enabled with VFIO support, the driver
checks if the domain has devices attached and checks if the domain type is
IOMMU_DOMAIN_UNMANAGED.
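
Roughly, the heuristic being described is along these lines (an illustrative
sketch of the check, not the code from the patch; it assumes the amd_iommu
driver internals from drivers/iommu/amd/amd_iommu_types.h):

/* Sketch only: the detection the commit message describes. */
static bool domain_looks_like_vfio(struct protection_domain *pdom)
{
	return pdom->domain.type == IOMMU_DOMAIN_UNMANAGED &&
	       !list_empty(&pdom->dev_list);
}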

NAK

If you need weird HW specific stuff like this then please implement it
properly in iommufd, not try and randomly guess what things need from
the domain type.

All this confidential computing stuff needs a comprehensive solution,
not some piecemeal mess. How can you even use a CC guest with VFIO in
the upstream kernel? Hmm?


Currently all guest devices are untrusted - whether they are emulated,
virtio or passthrough. In the current use case of VFIO device passthrough to
an SNP guest, the pass-through device will perform DMA to unencrypted or
shared guest memory, in the same way as virtio or emulated devices.

This fix is prompted by an issue reported by Nvidia: they are trying to do
PCIe device passthrough to an SNP guest. The memory for DMA is allocated
through dma_alloc_coherent() in the SNP guest, and during DMA I/O an
RMP_PAGE_FAULT is observed on the host.

These dma_alloc_coherent() calls translate into page-state-change hypercalls
to the host, which change the guest page state from encrypted to shared in
the RMP table.

Following is a link to the issue discussed above:
https://github.com/AMDESE/AMDSEV/issues/109
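
For context, the guest side here is just the normal DMA API; a minimal
sketch of that path (assuming the mainline SNP guest support, with error
handling omitted):

/*
 * Minimal sketch of the guest-side allocation (illustrative only).  In an
 * SNP guest, dma-direct sees force_dma_unencrypted() == true, so the buffer
 * is converted from private to shared via set_memory_decrypted(), which
 * issues the page-state-change request that marks the pages shared in the
 * host RMP table.
 */
#include <linux/device.h>
#include <linux/dma-mapping.h>

static void *snp_guest_alloc_shared_buf(struct device *dev, size_t size,
					dma_addr_t *dma_handle)
{
	/* Returns decrypted (shared) memory when memory encryption is active. */
	return dma_alloc_coherent(dev, size, dma_handle, GFP_KERNEL);
}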

Wow, you should really write all of this in the commit message

Now, to set individual 4K entries to different shared/private states within
what are otherwise large-page mappings in the NPT or host page tables, the
RMP and NPT/host page table large-page entries are split into 4K PTEs.
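
A rough model of the resulting mismatch (illustration only, not the real
hardware walker; the authoritative description of the RMP checks is in the
AMD SNP/IOMMU specifications):

#include <stdbool.h>

enum rmp_pg_size { RMP_4K, RMP_2M };

struct rmp_state_model {
	bool private_page;		/* assigned to a guest in the RMP */
	enum rmp_pg_size page_size;	/* granularity of the RMP entry */
};

static bool dma_translation_ok(struct rmp_state_model rmp,
			       enum rmp_pg_size iopte_size)
{
	if (rmp.private_page)
		return false;	/* DMA must never target private memory */

	/*
	 * The failure mode from the bug report: a 2M IOPTE translating into
	 * a region whose RMP entries were smashed to 4K by a page-state
	 * change is reported as RMP_PAGE_FAULT, even though every page it
	 * maps is shared.
	 */
	if (iopte_size == RMP_2M && rmp.page_size == RMP_4K)
		return false;

	return true;
}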

Why are mappings to private pages even in the iommu in the first
place - and how did they even get there?


You seem to be confusing host/NPT page tables with IOMMU page tables.

There are no private page mappings in the IOMMU page tables; as I mentioned above, currently all DMA to an SNP guest is to/from shared memory.

I thought the design for the private memory was walling it off in a
memfd and making it un-gup'able?

This seems to be your actual problem: somehow the iommu is being
loaded with private memory PFNs instead of only being loaded with
shared PFNs when shared mappings are created?


The IOMMU page tables are loaded with shared PFNs and not private memory PFNs.

If the IOMMU mappings actually only extend to the legitimate shared
pages then you don't have a problem with large IOPTEs spanning a
mixture of page types.

The fix is to force 4K page size for IOMMU page tables for SNP guests.

But even if you want to pursue this as the fix, it should not be done
in this way.

This patch-set adds support to detect if a domain belongs to an SNP-enabled
guest. This way it can set the default page size of a domain to 4K only for
SNP-enabled guests and allow non-SNP guests to use larger page sizes.
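
In driver terms, the 4K-only restriction amounts to advertising a reduced
page-size bitmap for such domains; a sketch of the idea (not the actual
patch; the domain_is_snp_guest flag is a hypothetical stand-in for whatever
detection the series settles on, and the code assumes the amd_iommu driver
internals):

/* Sketch only, not the actual patch. */
static void amd_iommu_set_domain_pgsizes(struct protection_domain *pdom,
					 bool domain_is_snp_guest)
{
	if (domain_is_snp_guest)
		/* 4K only: iommu_map() will never build large IOPTEs. */
		pdom->domain.pgsize_bitmap = PAGE_SIZE;
	else
		/* The driver's usual set of supported page sizes. */
		pdom->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES;
}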

As I said, the KVM has nothing to do with the iommu and I want to
largely keep it that way.

If the VMM needs to request a 4k-page-size-only iommu_domain because
it is somehow mapping mixtures of private and public pages,

Again, there is no mixture of private and public pages; the IOMMU only has shared page mappings.

Thanks,
Ashish

then the
VMM knows it is doing this crazy thing and it needs to ask iommufd
directly for a customized iommu_domain from the driver.

No KVM interconnection.

In fact, we already have a way to do this in iommufd generically, have
the VMM set IOMMU_OPTION_HUGE_PAGES = 0.

Jason
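
For reference, setting that option from the VMM side would look roughly
like this (a sketch against the iommufd uAPI in
include/uapi/linux/iommufd.h; error handling omitted):

/* Sketch: ask iommufd to stop using huge-page IOPTEs for an IOAS, so all
 * mappings in that IOAS are built with 4K pages. */
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/iommufd.h>

static int ioas_disable_huge_pages(int iommufd_fd, uint32_t ioas_id)
{
	struct iommu_option opt = {
		.size = sizeof(opt),
		.option_id = IOMMU_OPTION_HUGE_PAGES,
		.op = IOMMU_OPTION_OP_SET,
		.object_id = ioas_id,
		.val64 = 0,		/* 0 = PAGE_SIZE mappings only */
	};

	return ioctl(iommufd_fd, IOMMU_OPTION, &opt);
}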