Re: [PATCH 0/9] More virtio hardening

From: Michael S. Tsirkin
Date: Mon Oct 11 2021 - 08:36:30 EST


On Mon, Oct 11, 2021 at 03:36:51PM +0800, Jason Wang wrote:
> On Tue, Oct 5, 2021 at 3:42 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> >
> > On Mon, Sep 13, 2021 at 01:53:44PM +0800, Jason Wang wrote:
> > > Hi All:
> > >
> > > This series tries to do more hardening for virtio.
> > >
> > > patch 1 validates the num_queues for virtio-blk device.
> > > patch 2-4 validates max_nr_ports for virtio-console device.
> > > patch 5-7 harden virtio-pci interrupts to make sure no unexpected
> > > interrupt handler is triggered. If this makes sense we can do similar
> > > things in other transport drivers.
> > > patch 8-9 validate used ring length.
> > >
> > > Smoke tested on blk/net with packed=on/off and iommu_platform=on/off.
> > >
> > > Please review.
> > >
> > > Thanks
> >
> > So I poked at console at least, and I think I see
> > an issue: if interrupt handler queues a work/bh,
> > then it can still run while reset is in progress.
>
> Looks like a bug which is unrelated to the hardening?

Isn't preventing use-after-free relevant?
I frankly don't know what hardening means then.

> E.g. the driver should sync with work/bh before reset.

No, there's no way to fix it ATM without extra locks and state, which I
think we should strive to avoid, or at least make generic rather than
per-driver. A sync before reset is useless: new interrupts will just
arrive and queue more work. And a sync after reset is too late, since
the driver will try to add buffers.
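
To illustrate the ordering problem, here is a rough userspace model. It
is not kernel code; race_dev, race_irq, race_flush and race_reset are
all made-up names, and race_flush only stands in for whatever sync the
driver would do. It shows a sync before reset being undone by one more
interrupt:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device whose irq handler queues work. */
struct race_dev {
	bool irq_live;     /* device can still raise interrupts */
	bool resetting;    /* set once reset has started */
	int pending_work;  /* work items queued by the irq handler */
	int stale_runs;    /* work items that ran after reset started */
};

/* Interrupt handler: queue one work item while the device is live. */
static void race_irq(struct race_dev *d)
{
	if (d->irq_live)
		d->pending_work++;
}

/* Stand-in for a work/bh sync: run everything queued so far. */
static void race_flush(struct race_dev *d)
{
	while (d->pending_work > 0) {
		d->pending_work--;
		if (d->resetting)
			d->stale_runs++;
	}
}

static void race_reset(struct race_dev *d)
{
	d->resetting = true;
	d->irq_live = false;
}

/* Sync first, then reset; one interrupt lands in between. */
static int race_sync_then_reset(void)
{
	struct race_dev d = { .irq_live = true };

	race_irq(&d);
	race_flush(&d);               /* sync before reset: queue is empty */
	assert(d.pending_work == 0);
	race_irq(&d);                 /* device still live: more work queued */
	race_reset(&d);
	race_flush(&d);               /* this item runs against a reset device */
	return d.stale_runs;
}
```

In this model the second race_irq() is the whole problem: nothing stops
it between the sync and the reset unless callbacks are turned off first.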

Maybe we can break the device. Two issues with that:
- drivers might not be ready to handle add_buf failures
- restore then needs to unbreak it, and we don't have a way to do that yet
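
A minimal userspace sketch of what both issues amount to, assuming a
"broken" flag that makes every add fail. All names here (brk_vq,
brk_add_buf, brk_break, brk_unbreak, brk_refill) are hypothetical, and
brk_unbreak in particular is the piece we do not have yet:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a virtqueue with a "broken" flag. */
struct brk_vq {
	bool broken;
	int num_free;
};

/* Adding a buffer fails once the queue is marked broken. */
static int brk_add_buf(struct brk_vq *vq)
{
	if (vq->broken)
		return -EIO;
	if (vq->num_free == 0)
		return -ENOSPC;
	vq->num_free--;
	return 0;
}

static void brk_break(struct brk_vq *vq)
{
	vq->broken = true;
}

/* Assumed helper: restore would need something like this. */
static void brk_unbreak(struct brk_vq *vq)
{
	vq->broken = false;
}

/*
 * A refill path that tolerates failure: stop at the first error and
 * report how many buffers were actually added.
 */
static int brk_refill(struct brk_vq *vq, int want)
{
	int added = 0;

	while (added < want && brk_add_buf(vq) == 0)
		added++;
	return added;
}
```

A driver whose refill path ignores the return value, or treats any
failure as fatal, is the kind that would need auditing before breaking
the device during reset becomes safe.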

So .. careful reading of all device drivers and hoping we don't mess
things up even more ... here we come.

> >
> > I sent a patch to fix it for console removal specifically,
> > but I suspect it's not enough e.g. freeze is still broken.
> > And note this has been reported without any TDX things -
> > it's not a malicious device issue, can be triggered just
> > by module unload.
> >
> > I am vaguely thinking about new APIs to disable/enable callbacks.
> > An alternative:
> >
> > 1. adding new remove_nocb/freeze_nocb calls
> > 2. disabling/enabling interrupts automatically around these
> > 3. gradually moving devices to using these
> > 4. once/if all devices move, removing the old callbacks
> >
> > the advantage here is that we'll be sure calls are always
> > paired correctly.
>
> I'm not sure I get the idea, but my feeling is that it doesn't
> conflict with the interrupt hardening here (or at least the same
> method is required e.g NO_AUTO_EN).
>
> Thanks

Right. It's not that it conflicts, it's that I was hoping that
since you are working on hardening you can take up fixing that.
Let me know whether you have the time. Thanks!
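
In case it helps, here is roughly what I mean by the *_nocb idea quoted
above, as a userspace model rather than a patch. Everything here
(nocb_dev, nocb_hook, nocb_core_call, nocb_probe_hook) is a made-up
name, and re-enabling callbacks right after the hook returns is just
one possible reading of "around these":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct nocb_dev {
	bool cb_enabled;      /* are vq callbacks currently allowed? */
	int hooks_run;        /* how many driver hooks have run */
	int cb_seen_enabled;  /* hooks that ran with callbacks enabled */
};

typedef void (*nocb_hook)(struct nocb_dev *d);

/* Example driver hook: records whether callbacks were on when it ran. */
static void nocb_probe_hook(struct nocb_dev *d)
{
	d->hooks_run++;
	if (d->cb_enabled)
		d->cb_seen_enabled++;
}

/*
 * Core-side wrapper: prefer the new-style hook and bracket it with
 * disable/enable so the pairing cannot be forgotten by a driver.
 * Fall back to the old-style hook for drivers not yet converted.
 */
static void nocb_core_call(struct nocb_dev *d, nocb_hook nocb,
			   nocb_hook legacy)
{
	if (nocb) {
		d->cb_enabled = false;
		nocb(d);
		d->cb_enabled = true;
	} else if (legacy) {
		legacy(d);
	}
}
```

The point of putting the disable/enable in the core is that a converted
driver never sees a callback while its hook runs, and an unconverted
one keeps today's behaviour until it is moved over.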

> >
> >
> >
> >
> >
> > > Jason Wang (9):
> > >   virtio-blk: validate num_queues during probe
> > >   virtio: add doc for validate() method
> > >   virtio-console: switch to use .validate()
> > >   virtio_console: validate max_nr_ports before trying to use it
> > >   virtio_config: introduce a new ready method
> > >   virtio_pci: harden MSI-X interrupts
> > >   virtio-pci: harden INTX interrupts
> > >   virtio_ring: fix typos in vring_desc_extra
> > >   virtio_ring: validate used buffer length
> > >
> > >  drivers/block/virtio_blk.c         |  3 +-
> > >  drivers/char/virtio_console.c      | 51 +++++++++++++++++++++---------
> > >  drivers/virtio/virtio_pci_common.c | 43 +++++++++++++++++++++----
> > >  drivers/virtio/virtio_pci_common.h |  7 ++--
> > >  drivers/virtio/virtio_pci_legacy.c |  5 +--
> > >  drivers/virtio/virtio_pci_modern.c |  6 ++--
> > >  drivers/virtio/virtio_ring.c       | 27 ++++++++++++++--
> > >  include/linux/virtio.h             |  1 +
> > >  include/linux/virtio_config.h      |  6 ++++
> > >  9 files changed, 118 insertions(+), 31 deletions(-)
> > >
> > > --
> > > 2.25.1
> >