Re: [Xen-devel] [PATCH 1/1] xen-blkback: stop blkback thread of every queue in xen_blkif_disconnect

From: Roger Pau Monné
Date: Fri Aug 18 2017 - 13:56:19 EST


On Fri, Aug 18, 2017 at 10:29:15AM -0400, annie li wrote:
>
> On 8/18/2017 5:14 AM, Roger Pau Monné wrote:
> > On Thu, Aug 17, 2017 at 06:43:46PM -0400, Annie Li wrote:
> > > > If there is inflight I/O in any queue other than the last one, blkback
> > > > returns -EBUSY directly and never stops the threads of the remaining
> > > > queues or processes them. When removing a vbd device under heavy disk
> > > > I/O load, some queues with inflight I/O still have a blkback thread
> > > > running even though the corresponding vbd device or guest is gone.
> > > > This can cause problems. For example, if the backend device type is
> > > > file, some loop devices and blkback threads linger forever after the
> > > > guest is destroyed, and unmounting repositories fails until dom0 is
> > > > rebooted. So stop all threads properly and return -EBUSY if any queue
> > > > has inflight I/O.
> > >
> > > Signed-off-by: Annie Li <annie.li@xxxxxxxxxx>
> > > Reviewed-by: Herbert van den Bergh <herbert.van.den.bergh@xxxxxxxxxx>
> > > Reviewed-by: Bhavesh Davda <bhavesh.davda@xxxxxxxxxx>
> > > Reviewed-by: Adnan Misherfi <adnan.misherfi@xxxxxxxxxx>
> > > ---
> > > drivers/block/xen-blkback/xenbus.c | 10 ++++++++--
> > > 1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> > > index 792da68..2adb859 100644
> > > --- a/drivers/block/xen-blkback/xenbus.c
> > > +++ b/drivers/block/xen-blkback/xenbus.c
> > > @@ -244,6 +244,7 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
> > > {
> > > struct pending_req *req, *n;
> > > unsigned int j, r;
> > > + bool busy = false;
> > > for (r = 0; r < blkif->nr_rings; r++) {
> > > struct xen_blkif_ring *ring = &blkif->rings[r];
> > > @@ -261,8 +262,10 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
> > > * don't have any discard_io or other_io requests. So, checking
> > > * for inflight IO is enough.
> > > */
> > > - if (atomic_read(&ring->inflight) > 0)
> > > - return -EBUSY;
> > > + if (atomic_read(&ring->inflight) > 0) {
> > > + busy = true;
> > > + continue;
> > > + }
> > I guess I'm missing something, but I don't see how this is solving the
> > problem described in the description.
> >
> > If the problem is that xen_blkif_disconnect returns without cleaning
> > all the queues, this patch keeps the current behavior, just that it
> > will try to remove more queues before returning, as opposed to
> > returning when finding the first busy queue.
> Before checking inflight, the following code stops the blkback thread:
>
>     if (ring->xenblkd) {
>             kthread_stop(ring->xenblkd);
>             wake_up(&ring->shutdown_wq);
>     }
>
> This patch gives the thread of every queue a chance to be stopped.
> Otherwise, only the threads of the queues up to and including the first
> busy one are stopped; the threads of the remaining queues keep running, and
> these blkback threads and the corresponding loop devices linger forever,
> even after the guest is destroyed.
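
To make the difference concrete, here is a minimal userspace sketch of the two
loop shapes discussed above. It is not kernel code: struct mock_ring and all
function names are hypothetical stand-ins, clearing thread_running only models
the kthread_stop() call, and the final "return busy ? -EBUSY : 0" is an
assumption, since the hunk quoted in this thread does not show how busy is
consumed after the loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xen_blkif_ring; not the kernel type. */
struct mock_ring {
	bool thread_running;	/* models ring->xenblkd being alive */
	int inflight;		/* models atomic_read(&ring->inflight) */
};

/* Pre-patch shape: return at the first ring that has inflight I/O. */
static int disconnect_early_return(struct mock_ring *rings, size_t nr_rings)
{
	for (size_t r = 0; r < nr_rings; r++) {
		rings[r].thread_running = false;	/* models kthread_stop() */
		if (rings[r].inflight > 0)
			return -EBUSY;	/* later rings are never visited */
	}
	return 0;
}

/* Patched shape: stop every thread, remember whether any ring was busy. */
static int disconnect_stop_all(struct mock_ring *rings, size_t nr_rings)
{
	bool busy = false;

	for (size_t r = 0; r < nr_rings; r++) {
		rings[r].thread_running = false;	/* models kthread_stop() */
		if (rings[r].inflight > 0) {
			busy = true;
			continue;	/* skip this ring's cleanup, keep going */
		}
		/* per-ring cleanup of an idle ring would happen here */
	}
	/* Assumed: the quoted hunk does not show the final return. */
	return busy ? -EBUSY : 0;
}

/* Helper for inspection: how many mock threads are still running. */
static size_t count_running(const struct mock_ring *rings, size_t nr_rings)
{
	size_t n = 0;

	assert(rings != NULL);
	for (size_t r = 0; r < nr_rings; r++)
		if (rings[r].thread_running)
			n++;
	return n;
}
```

With three rings where only the middle one has inflight I/O, both functions
return -EBUSY, but the early-return variant leaves the third ring's thread
running while the stop-all variant leaves none.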

Thanks for the explanation:

Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Roger.