Re: [RFC PATCH V2 0/7] Do not read from descriptor ring

From: Michael S. Tsirkin
Date: Thu May 06 2021 - 04:12:38 EST


On Thu, May 06, 2021 at 11:20:30AM +0800, Jason Wang wrote:
>
> On 2021/4/23 4:09 PM, Jason Wang wrote:
> > Hi:
> >
> > Sometimes, the driver doesn't trust the device. This usually
> > happens for an encrypted VM or VDUSE[1]. In both cases, technology
> > like swiotlb is used to prevent the poking/mangling of memory from the
> > device. But this is not sufficient since the current virtio driver may
> > trust what is stored in the descriptor table (coherent mapping) for
> > performing DMA operations like unmap and bounce, so the device may
> > choose to exploit the behaviour of swiotlb to perform attacks[2].
> >
> > To protect against a malicious device, this series stores and uses the
> > descriptor metadata in an auxiliary structure which cannot be accessed
> > via swiotlb, instead of the copy in the descriptor table. This means
> > the descriptor table is write-only from the view of the driver.
> >
> > Actually, we've almost achieved that through packed virtqueue and we
> > just need to fix a corner case of handling mapping errors. For split
> > virtqueue we just follow what's done in the packed.
> >
> > Note that we don't duplicate descriptor metadata for indirect
> > descriptors since they use a streaming mapping which is read-only, so
> > it's safe if the metadata of non-indirect descriptors is correct.
> >
> > For split virtqueue, the change increases the footprint due to the
> > auxiliary metadata, but it's almost negligible in simple tests
> > like pktgen or netperf.
> >
> > Lightly tested with packed on/off, iommu on/off, swiotlb force/off in
> > the guest.
> >
> > Please review.
> >
> > Changes from V1:
> > - Always use auxiliary metadata for split virtqueue
> > - Don't read from descriptor when detaching indirect descriptor
>
>
> Hi Michael:
>
> Our QE sees no regression in the perf test on a 10G card but some
> regressions (5%-10%) on a 40G card.
>
> I think this is expected since we increase the footprint. Are you OK with
> this so we can try to optimize on top, or do you have other ideas?
>
> Thanks

Let's try for just a bit; this won't make this window anyway.

I have an old idea: add a way to find out that unmap is a nop
(or, more exactly, that it does not use the address/length).
In that case, even with the DMA API, we do not need
the extra data. Hmm?
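
A rough userspace sketch of that idea, under stated assumptions: the
dma_backend struct, its unmap_needs_addr flag, and the vq_* helpers are
all hypothetical names for illustration, not an existing DMA API or
virtio interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VQ_SIZE 4

/* Hypothetical: the DMA layer reports whether unmap reads addr/len. */
struct dma_backend {
	bool unmap_needs_addr;
	unsigned int unmap_calls;
};

struct extra {
	uint64_t addr;
	uint32_t len;
};

struct vq_model {
	struct dma_backend *dma;
	struct extra *extra;         /* NULL when unmap is a nop */
	struct extra storage[VQ_SIZE]; /* a real driver would skip allocating this */
	unsigned int extra_writes;   /* counts hot-path metadata stores */
};

static void vq_init(struct vq_model *vq, struct dma_backend *dma)
{
	vq->dma = dma;
	vq->extra = dma->unmap_needs_addr ? vq->storage : NULL;
	vq->extra_writes = 0;
}

/* Add path: only touch the extra metadata if unmap will need it. */
static void vq_add(struct vq_model *vq, unsigned int i, uint64_t addr,
		   uint32_t len)
{
	if (!vq->extra)
		return;
	vq->extra[i].addr = addr;
	vq->extra[i].len = len;
	vq->extra_writes++;
}

/* Detach path: returns true if an unmap using addr/len was performed. */
static bool vq_detach(struct vq_model *vq, unsigned int i)
{
	if (!vq->extra)
		return false;
	(void)vq->extra[i]; /* addr/len would be passed to the unmap call */
	vq->dma->unmap_calls++;
	return true;
}
```

With unmap_needs_addr false, neither the add path nor the detach path
touches the extra array, which is where the footprint saving would come
from.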


>
> >
> > [1]
> > https://lore.kernel.org/netdev/fab615ce-5e13-a3b3-3715-a4203b4ab010@xxxxxxxxxx/T/
> > [2]
> > https://yhbt.net/lore/all/c3629a27-3590-1d9f-211b-c0b7be152b32@xxxxxxxxxx/T/#mc6b6e2343cbeffca68ca7a97e0f473aaa871c95b
> >
> > Jason Wang (7):
> > virtio-ring: maintain next in extra state for packed virtqueue
> > virtio_ring: rename vring_desc_extra_packed
> > virtio-ring: factor out desc_extra allocation
> > virtio_ring: secure handling of mapping errors
> > virtio_ring: introduce virtqueue_desc_add_split()
> > virtio: use err label in __vring_new_virtqueue()
> > virtio-ring: store DMA metadata in desc_extra for split virtqueue
> >
> > drivers/virtio/virtio_ring.c | 201 +++++++++++++++++++++++++----------
> > 1 file changed, 144 insertions(+), 57 deletions(-)
> >