Re: [PATCH] NVMe: do not touch sq door bell if nvmeq has been suspended
From: Keith Busch
Date: Tue Feb 02 2016 - 09:47:14 EST
On Tue, Feb 02, 2016 at 04:33:10PM +0200, Sagi Grimberg wrote:
> Hey Keith,
>
> >>First of all, I think we need to cancel all
> >>inflight requests before nvme_dev_unmap.
> >
> >IO cancelling is where it is because it protects against host memory
> >corruption. If you're going to mess with the ordering, just make sure
> >the PCI device is disabled from bus mastering first.
>
> Little help? :)
>
> What corruption is the ordering protecting against?
Sure thing. :)
We free the transfer buffers when a command is cancelled. The controller,
however, may still own the command and may try to write to them. We
have to fence the controller off from being able to do that, so we can't
cancel inflight commands while the PCI device is still bus master enabled.
In a perfect world, we could trust that disabling through the NVMe
registers is enough, but sometimes we can't rely on that.
This was commit 07836e659c81ec6b0d683dfbf7958339a22a7b69, which might
explain the scenario a little better. It was reported by an end user.
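
To make the ordering concrete, here is a toy userspace model of it. This
is only a sketch, not driver code: struct model_dev, model_dma_write() and
every other name below are made up for illustration, and the bus_master
flag merely stands in for the PCI bus master enable bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: all names here are hypothetical, not kernel APIs. */
struct model_dev {
	bool bus_master;	/* stands in for the PCI bus master enable bit */
	unsigned char *buf;	/* transfer buffer of one inflight command */
	size_t len;
};

enum dma_result { DMA_BLOCKED, DMA_OK, DMA_USE_AFTER_FREE };

/* Allocate the transfer buffer and start with bus mastering enabled. */
static int model_dev_init(struct model_dev *dev, size_t len)
{
	dev->buf = malloc(len);
	if (!dev->buf)
		return -1;
	dev->len = len;
	dev->bus_master = true;
	return 0;
}

/* The controller tries to write to the buffer of a command it still owns. */
static enum dma_result model_dma_write(struct model_dev *dev)
{
	assert(dev != NULL);
	if (!dev->bus_master)
		return DMA_BLOCKED;		/* fenced off from host memory */
	if (!dev->buf)
		return DMA_USE_AFTER_FREE;	/* would corrupt freed memory */
	memset(dev->buf, 0xff, dev->len);
	return DMA_OK;
}

/* Fence the device off so it can no longer reach host memory. */
static void model_disable_bus_master(struct model_dev *dev)
{
	dev->bus_master = false;
}

/* Cancelling a command frees its transfer buffer. */
static void model_cancel_inflight(struct model_dev *dev)
{
	free(dev->buf);
	dev->buf = NULL;
}
```

Calling model_disable_bus_master() before model_cancel_inflight() means a
late write from the device is blocked. Cancelling first leaves a window
where the model reports DMA_USE_AFTER_FREE, which is the corruption case.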