RE: [PATCH 00/17] Add Intel VT-d nested translation

From: Shameerali Kolothum Thodi
Date: Thu Feb 09 2023 - 05:11:54 EST




> -----Original Message-----
> From: Yi Liu [mailto:yi.l.liu@xxxxxxxxx]
> Sent: 09 February 2023 04:32
> To: joro@xxxxxxxxxx; alex.williamson@xxxxxxxxxx; jgg@xxxxxxxxxx;
> kevin.tian@xxxxxxxxx; robin.murphy@xxxxxxx
> Cc: cohuck@xxxxxxxxxx; eric.auger@xxxxxxxxxx; nicolinc@xxxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; mjrosato@xxxxxxxxxxxxx;
> chao.p.peng@xxxxxxxxxxxxxxx; yi.l.liu@xxxxxxxxx; yi.y.sun@xxxxxxxxxxxxxxx;
> peterx@xxxxxxxxxx; jasowang@xxxxxxxxxx; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@xxxxxxxxxx>; lulu@xxxxxxxxxx;
> suravee.suthikulpanit@xxxxxxx; iommu@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-kselftest@xxxxxxxxxxxxxxx;
> baolu.lu@xxxxxxxxxxxxxxx
> Subject: [PATCH 00/17] Add Intel VT-d nested translation
>
> Nested translation has two stage address translations to get the final
> physical addresses. Take Intel VT-d as an example, the first stage translation
> structure is I/O page table. As the below diagram shows, guest I/O page
> table pointer in GPA (guest physical address) is passed to host to do the
> first stage translation. Along with it, guest modifications to present
> mappings in the first stage page should be followed with an iotlb invalidation
> to sync host iotlb.
>
>    .-------------.  .---------------------------.
>    |   vIOMMU    |  | Guest I/O page table      |
>    |             |  '---------------------------'
>    .----------------/
>    | PASID Entry |--- PASID cache flush --+
>    '-------------'                        |
>    |             |                        V
>    |             |           I/O page table pointer in GPA
>    '-------------'
> Guest
> ------| Shadow |--------------------------|--------
>       v        v                          v
> Host
>    .-------------.  .------------------------.
>    |   pIOMMU    |  | FS for GIOVA->GPA      |
>    |             |  '------------------------'
>    .----------------/  |
>    | PASID Entry |     V (Nested xlate)
>    '----------------\.----------------------------------.
>    |             |   | SS for GPA->HPA, unmanaged domain|
>    |             |   '----------------------------------'
>    '-------------'
>    Where:
>     - FS = First stage page tables
>     - SS = Second stage page tables
>    <Intel VT-d Nested translation>
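
For readers following along, the two-stage walk in the diagram can be
sketched as a toy model. This is only an illustration under loud
assumptions: the flat single-level tables and every toy_* name below are
hypothetical, nothing here comes from the series or from the VT-d driver,
and real hardware also translates the first stage table pointers
themselves through the second stage, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)
#define TOY_NPAGES     8
#define TOY_INVALID    UINT64_MAX

/* Hypothetical flat table: index is the input PFN, value the output PFN. */
struct toy_table {
	uint64_t out_pfn[TOY_NPAGES];
};

/* Mark every entry as not present. */
static void toy_init(struct toy_table *t)
{
	for (unsigned int i = 0; i < TOY_NPAGES; i++)
		t->out_pfn[i] = TOY_INVALID;
}

/* Install one page mapping: in_pfn -> out_pfn. */
static void toy_map(struct toy_table *t, uint64_t in_pfn, uint64_t out_pfn)
{
	assert(in_pfn < TOY_NPAGES);
	t->out_pfn[in_pfn] = out_pfn;
}

/* Single-stage lookup; returns TOY_INVALID when the page is not mapped. */
static uint64_t toy_walk(const struct toy_table *t, uint64_t addr)
{
	uint64_t pfn = addr >> TOY_PAGE_SHIFT;

	if (pfn >= TOY_NPAGES || t->out_pfn[pfn] == TOY_INVALID)
		return TOY_INVALID;
	return (t->out_pfn[pfn] << TOY_PAGE_SHIFT) | (addr & TOY_PAGE_MASK);
}

/*
 * Nested translation: FS (guest-owned) maps GIOVA->GPA, then SS
 * (kernel-owned) maps GPA->HPA. A miss in either stage fails the walk.
 */
static uint64_t toy_nested_translate(const struct toy_table *fs,
				     const struct toy_table *ss,
				     uint64_t giova)
{
	uint64_t gpa = toy_walk(fs, giova);

	if (gpa == TOY_INVALID)
		return TOY_INVALID;
	return toy_walk(ss, gpa);
}
```

With GIOVA page 1 mapped to GPA page 3 in the first stage table and GPA
page 3 mapped to HPA page 5 in the second stage table, GIOVA 0x1234
translates to HPA 0x5234. A first stage mapping whose GPA has no second
stage entry fails the whole walk, which is the reason the second stage
stays under kernel control.
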
>
> Different platform vendors have different first stage translation formats,
> so userspace should query the underlying iommu capability before setting
> first stage translation structures to host.[1]
>
> In iommufd subsystem, I/O page tables would be tracked by hw_pagetable
> objects. First stage page table is owned by userspace (guest), while second
> stage page table is owned by kernel for security. So First stage page tables
> are tracked by user-managed hw_pagetable, second stage page tables are
> tracked by kernel-managed hw_pagetable.
>
> This series first introduces new iommu op for allocating domains for
> iommufd, and op for syncing iotlb for first stage page table modifications,
> and then add the implementation of the new ops in intel-iommu driver. After
> this preparation, adds kernel-managed and user-managed hw_pagetable
> allocation for userspace. Last, add self-test for the new ioctls.
>
> This series is based on "[PATCH 0/6] iommufd: Add iommu capability
> reporting"[1] and Nicolin's "[PATCH v2 00/10] Add IO page table replacement
> support"[2]. Complete code can be found in[3]. Draft Qemu code can be found
> in[4].
>
> Basic test done with DSA device on VT-d. Where the guest has a vIOMMU built
> with nested translation.

Hi Yi Liu,

Thanks for sending this out. I will go through this one. As I mentioned before,
we keep an internal branch based on your work and rebase a few patches on top
to get ARM SMMUv3 nesting support. The most recent one is based on your
"iommufd-v6.2-rc4-nesting" branch and is here,

https://github.com/hisilicon/kernel-dev/commits/iommufd-v6.2-rc4-nesting-arm

Just wondering, is there any chance the latest "Add SMMUv3 nesting support"
series will be sent out soon? Please let me know if you need any help with that.

Thanks,
Shameer
>
> [1] https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.com/
> [2] https://lore.kernel.org/linux-iommu/cover.1675802050.git.nicolinc@nvidia.com/
> [3] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting_vtd_v1
> [4] https://github.com/yiliu1765/qemu/tree/wip/iommufd_rfcv3%2Bnesting
>
> Regards,
> Yi Liu
>
> Lu Baolu (5):
> iommu: Add new iommu op to create domains owned by userspace
> iommu: Add nested domain support
> iommu/vt-d: Extend dmar_domain to support nested domain
> iommu/vt-d: Add helper to setup pasid nested translation
> iommu/vt-d: Add nested domain support
>
> Nicolin Chen (6):
> iommufd: Add/del hwpt to IOAS at alloc/destroy()
> iommufd/device: Move IOAS attaching and detaching operations into helpers
> iommufd/selftest: Add IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE test op
> iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC ioctl
> iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
> iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
>
> Yi Liu (6):
> iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
> iommufd: Split iommufd_hw_pagetable_alloc()
> iommufd: Add kernel-managed hw_pagetable allocation for userspace
> iommufd: Add infrastructure for user-managed hw_pagetable allocation
> iommufd: Add user-managed hw_pagetable allocation
> iommufd/device: Report supported stage-1 page table types
>
> drivers/iommu/intel/Makefile | 2 +-
> drivers/iommu/intel/iommu.c | 38 ++-
> drivers/iommu/intel/iommu.h | 50 +++-
> drivers/iommu/intel/nested.c | 143 +++++++++
> drivers/iommu/intel/pasid.c | 142 +++++++++
> drivers/iommu/intel/pasid.h | 2 +
> drivers/iommu/iommufd/device.c | 117 ++++----
> drivers/iommu/iommufd/hw_pagetable.c | 280 +++++++++++++++++-
> drivers/iommu/iommufd/iommufd_private.h | 23 +-
> drivers/iommu/iommufd/iommufd_test.h | 35 +++
> drivers/iommu/iommufd/main.c | 11 +
> drivers/iommu/iommufd/selftest.c | 149 +++++++++-
> include/linux/iommu.h | 11 +
> include/uapi/linux/iommufd.h | 196 ++++++++++++
> tools/testing/selftests/iommu/iommufd.c | 124 +++++++-
> tools/testing/selftests/iommu/iommufd_utils.h | 106 +++++++
> 16 files changed, 1329 insertions(+), 100 deletions(-)
> create mode 100644 drivers/iommu/intel/nested.c
>
> --
> 2.34.1
>