Re: [PATCH v3 2/7] iommu/riscv: Add RISC-V IOMMU platform device driver

From: Tomasz Jeznach
Date: Wed May 01 2024 - 22:45:02 EST


On Wed, May 1, 2024 at 7:25 PM Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
>
> On 5/1/24 10:20 PM, Jason Gunthorpe wrote:
> > On Wed, May 01, 2024 at 06:26:20PM +0800, Baolu Lu wrote:
> >> On 2024/5/1 4:01, Tomasz Jeznach wrote:
> >>> +static int riscv_iommu_init_check(struct riscv_iommu_device *iommu)
> >>> +{
> >>> + u64 ddtp;
> >>> +
> >>> + /*
> >>> + * Make sure the IOMMU is switched off or in pass-through mode during regular
> >>> + * boot flow and disable translation when we boot into a kexec kernel and the
> >>> + * previous kernel left them enabled.
> >>> + */
> >>> + ddtp = riscv_iommu_readq(iommu, RISCV_IOMMU_REG_DDTP);
> >>> + if (ddtp & RISCV_IOMMU_DDTP_BUSY)
> >>> + return -EBUSY;
> >>> +
> >>> + if (FIELD_GET(RISCV_IOMMU_DDTP_MODE, ddtp) > RISCV_IOMMU_DDTP_MODE_BARE) {
> >>> + if (!is_kdump_kernel())
> >> Is kdump supported for RISC-V architectures? If so, the documentation
> >> in Documentation/admin-guide/kdump/kdump.rst might need an update.
> >>
> >> There is a possibility of ongoing DMAs during the boot process of the
> >> kdump capture kernel because there's a small chance of legacy DMA setups
> >> targeting any memory location. Kdump typically allows these ongoing DMA
> >> transfers to complete, assuming they were intended for valid memory
> >> regions.
> >>
> >> The IOMMU subsystem implements a default domain deferred attachment
> >> mechanism for this. In the kdump capture kernel, the whole device
> >> context tables are copied from the original kernel and will be
> >> overridden once the device driver calls the kernel DMA interface for the
> >> first time. This assumes that all old DMA transfers are completed after
> >> the driver's takeover.
> >>
> >> Will you consider this for RISC-V architecture as well?
> > It seems we decided not to do that mess in ARM..
> >
> > New architectures doing kdump should put the iommu in a full blocking
> > state before handing over the next kernel, and this implies that
> > devices drivers need to cleanly suspend their DMAs before going into
> > the next kernel.
>
> Glad to hear that. :-)
>
> With the above consideration, the driver should consider it an error
> case where the iommu is not in the blocking state, and it's in the kdump
> kernel, right?
>
> If so, probably the iommu driver should always return failure when the
> iommu is not in the blocking state. However, the RISC-V's logic is:
>
> - if this is a kdump kernel, just disable iommu;
> - otherwise, failure case.
>
> This logic seems problematic.
>

Initial implementation recognize this as an error and failed to
initialize IOMMU was in non-default state (disabled or pass-through).
Ideally we should have proper shutdown sequence for kdump case, and
quiesce IOMMU before kdump kernel runs. However, this path is
problematic enough that I'd prefer not to add any other complications
to this patch set. Definitely setting IOMMU in a full blocking state
before handling over to kdump kernel. Will get back to that at some
point.

Dropping all IOMMU config in kdump path is IMHO the compromise we can
accept, leaving kdump kernel an option to reconfigure IOMMU in later
stages of the IOMMU initialization.

> Best regards,
> baolu

Best,
- Tomasz