RE: [PATCH 5/5 v11] iommu/fsl: Freescale PAMU driver and iommu implementation.

From: Sethi Varun-B16395
Date: Wed Apr 03 2013 - 01:12:24 EST




> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Wednesday, April 03, 2013 7:23 AM
> To: Timur Tabi
> Cc: Joerg Roedel; Sethi Varun-B16395; lkml; Kumar Gala; Yoder Stuart-
> B08248; iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; Benjamin Herrenschmidt;
> linuxppc-dev@xxxxxxxxxxxxxxxx
> Subject: Re: [PATCH 5/5 v11] iommu/fsl: Freescale PAMU driver and iommu
> implementation.
>
> On 04/02/2013 08:35:54 PM, Timur Tabi wrote:
> > On Tue, Apr 2, 2013 at 11:18 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
> >
> > > > + panic("\n");
> > >
> > > A kernel panic seems like an over-reaction to an access violation.
> >
> > We have no way of determining what code caused the violation, so we
> > can't just kill the process. I agree it seems like overkill, but what
> > else should we do? Does the IOMMU layer have a way for the IOMMU
> > driver to stop the device that caused the problem?
>
> At a minimum, log a message and continue. Probably turn off the LIODN,
> at least if it continues to be noisy (otherwise we could get stuck in an
> interrupt storm as you note). Possibly let the user know somehow,
> especially if it's a VFIO domain.
[Sethi Varun-B16395] We can certainly log the message and disable the LIODN (to avoid an interrupt storm), but
we also need a mechanism to inform the VFIO subsystem about the error. Also, disabling the LIODN may not be a viable
option with the new LIODN allocation scheme (where an LIODN would be associated with a domain).
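
For illustration only, here is a rough user-space model of the "log, then
disable the LIODN if it stays noisy" policy discussed above. It is not taken
from the patch and touches no PAMU registers. The table, the threshold and
all names (liodn_table, MAX_LIODN, VIOLATION_LIMIT, handle_access_violation)
are hypothetical. It also assumes one state entry per LIODN, which may not
hold under the domain-based LIODN allocation scheme, and it leaves out the
VFIO notification entirely.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_LIODN       8   /* hypothetical table size */
#define VIOLATION_LIMIT 3   /* hypothetical "noisy" threshold */

/* Per-LIODN bookkeeping; a real driver would keep this elsewhere. */
struct liodn_state {
	unsigned int violations;
	bool enabled;
};

static struct liodn_state liodn_table[MAX_LIODN];

/* Mark every LIODN as enabled with no recorded violations. */
static void liodn_table_init(void)
{
	for (unsigned int i = 0; i < MAX_LIODN; i++) {
		liodn_table[i].violations = 0;
		liodn_table[i].enabled = true;
	}
}

/*
 * Log the violation and continue instead of panicking. Once an LIODN
 * reaches VIOLATION_LIMIT it is marked disabled so that it cannot keep
 * raising interrupts. Returns true if this call disabled the LIODN.
 */
static bool handle_access_violation(unsigned int liodn, uint64_t addr)
{
	struct liodn_state *s;

	if (liodn >= MAX_LIODN) {
		fprintf(stderr, "access violation from unknown liodn %u\n",
			liodn);
		return false;
	}

	s = &liodn_table[liodn];
	s->violations++;
	fprintf(stderr, "access violation: liodn=%u addr=0x%llx count=%u\n",
		liodn, (unsigned long long)addr, s->violations);

	if (s->enabled && s->violations >= VIOLATION_LIMIT) {
		s->enabled = false;
		fprintf(stderr, "disabling liodn %u\n", liodn);
		return true;
	}
	return false;
}
```

In this model the third violation from the same LIODN disables it, while
other LIODNs are left untouched. A real handler would additionally have to
report the fault to the owner of the domain.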

>
> Don't take down the whole kernel. It's not just overkill; it undermines
> VFIO's efforts to make it safe for users to control devices.
>
> > > Besides the device that caused the violation the system should still
> > > work, no?
> >
> > Not really. The PAMU was designed to add IOMMU support to legacy
> > devices, which have no concept of an MMU. If the PAMU detects an
> > access violation, there's no way for the device to recover, because it
> > has no idea that a violation has occurred. It's going to keep on
> > writing to bad data.
>
> I think that's only the case for posted writes (or devices which fail to
> take a hint and stop even after they see an I/O error).
>
[Sethi Varun-B16395] Even in the case where the guest driver detects a failure, it may not be able to fix the problem without intervention from the VFIO subsystem.

-Varun
