Re: [PATCH] virtio_mem: break device on remove

From: Michael S. Tsirkin
Date: Mon Jan 17 2022 - 16:44:59 EST


On Mon, Jan 17, 2022 at 11:25:12AM +0100, David Hildenbrand wrote:
> On 17.01.22 09:40, Michael S. Tsirkin wrote:
> > On Mon, Jan 17, 2022 at 09:31:56AM +0100, David Hildenbrand wrote:
> >> On 17.01.22 08:55, Michael S. Tsirkin wrote:
> >>> On Mon, Jan 17, 2022 at 02:40:11PM +0800, Jason Wang wrote:
> >>>>
> >>>> 在 2022/1/15 上午5:43, Michael S. Tsirkin 写道:
> >>>>> A common pattern for device reset is currently:
> >>>>> vdev->config->reset(vdev);
> >>>>> .. cleanup ..
> >>>>>
> >>>>> reset prevents new interrupts from arriving and waits for interrupt
> >>>>> handlers to finish.
> >>>>>
> >>>>> However if - as is common - the handler queues a work request which is
> >>>>> flushed during the cleanup stage, we have code adding buffers / trying
> >>>>> to get buffers while device is reset. Not good.
> >>>>>
> >>>>> This was reproduced by running
> >>>>> modprobe virtio_console
> >>>>> modprobe -r virtio_console
> >>>>> in a loop, and this reasoning seems to apply to virtio mem though
> >>>>> I could not reproduce it there.
> >>>>>
> >>>>> Fix this up by calling virtio_break_device + flush before reset.
> >>>>>
> >>>>> Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> >>>>> ---
> >>>>> drivers/virtio/virtio_mem.c | 2 ++
> >>>>> 1 file changed, 2 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> >>>>> index 38becd8d578c..33b8a118a3ae 100644
> >>>>> --- a/drivers/virtio/virtio_mem.c
> >>>>> +++ b/drivers/virtio/virtio_mem.c
> >>>>> @@ -2888,6 +2888,8 @@ static void virtio_mem_remove(struct virtio_device *vdev)
> >>>>> virtio_mem_deinit_hotplug(vm);
> >>>>> /* reset the device and cleanup the queues */
> >>>>> + virtio_break_device(vdev);
> >>>>> + flush_work(&vm->wq);
> >>>>
> >>>>
> >>>> We set vm->removing to true and call cancel_work_sync() in
> >>>> virtio_mem_deinit_hotplug(). Isn't is sufficient?
> >>>>
> >>>> Thanks
> >>>
> >>>
> >>> Hmm I think you are right. David, I will drop this for now.
> >>> Up to you to consider whether some central capability will be
> >>> helpful as a replacement for the virtio-mem specific "removing" flag.
> >>
> >> It's all a bit tricky because we also have to handle pending timers and
> >> pending memory onlining/offlining operations in a controlled way. Maybe
> >> we could convert to virtio_break_device() and use the
> >> &dev->vqs_list_lock as a replacement for the removal_lock. However, I'm
> >> not 100% sure if it's nice to use that lock from
> >> drivers/virtio/virtio_mem.c directly.
> >
> > We could add an API if you like. Or maybe it makes sense to add a
> > separate one that lets you find out that removal started. Need to figure
> > out how to handle suspend too then ...
> > Generally we have these checks that device is not going away
> > sprinkled over all drivers and I don't like it, but
> > it's not easy to build a sane API to handle it, especially
> > for high speed things when we can't take locks because performance.
>
> The interesting case might be how to handle virtio_mem_retry(), whereby
> we conditionally queue work if !removing.
>
> Having that said, in an ideal world we could deny removal requests for
> virtio_mem completely if there is still any memory added to the system ...
>
>
> --
> Thanks,
>
> David / dhildenb


removal requests might come from guest admin.

--
MST