Re: [PATCH] block: Check that queue is alive in blk_insert_cloned_request()

From: Mike Snitzer
Date: Mon Jul 11 2011 - 18:40:37 EST


[cc'ing dm-devel, vivek and tejun]

On Fri, Jul 8, 2011 at 7:04 PM, Roland Dreier <roland@xxxxxxxxxx> wrote:
> From: Roland Dreier <roland@xxxxxxxxxxxxxxx>
>
> This fixes crashes such as the below that I see when the storage
> underlying a dm-multipath device is hot-removed.  The problem is that
> dm requeues a request to a device whose block queue has already been
> cleaned up, and blk_insert_cloned_request() doesn't check if the queue
> is alive, but rather goes ahead and tries to queue the request.  This
> ends up dereferencing the elevator that was already freed in
> blk_cleanup_queue().

Your patch looks fine to me:
Acked-by: Mike Snitzer <snitzer@xxxxxxxxxx>
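
For anyone following along, here is a minimal userspace sketch of the
kind of guard the patch description calls for. It is not the patch
itself: struct model_queue, struct model_elevator and the model_*()
helpers are hypothetical stand-ins for request_queue, the elevator,
blk_cleanup_queue() and blk_insert_cloned_request(), and returning
-ENODEV for a dead queue is an assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct model_elevator {
	int nr_queued;               /* requests handed to the elevator */
};

struct model_queue {
	bool dead;                   /* set once cleanup has run */
	struct model_elevator *elv;  /* freed and cleared by cleanup */
};

static struct model_queue *model_alloc_queue(void)
{
	struct model_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->elv = calloc(1, sizeof(*q->elv));
	if (!q->elv) {
		free(q);
		return NULL;
	}
	return q;
}

/* Models the cleanup path: mark the queue dead, then free the elevator. */
static void model_cleanup_queue(struct model_queue *q)
{
	q->dead = true;
	free(q->elv);
	q->elv = NULL;
}

/*
 * Models the guarded insert: refuse a dead queue before touching the
 * elevator. Without the check, the increment below would dereference
 * an elevator that cleanup has already freed.
 */
static int model_insert_cloned_request(struct model_queue *q)
{
	if (q->dead)
		return -ENODEV;      /* assumed error code */

	q->elv->nr_queued++;
	return 0;
}
```

In this model, an insert before model_cleanup_queue() succeeds and an
insert afterwards fails cleanly instead of touching freed memory.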

I also traced the relevant code paths to see which references DM takes.

A reference is taken on the underlying devices' block_device via
drivers/md/dm-table.c:open_dev() with blkdev_get_by_dev(). open_dev()
also does bd_link_disk_holder(), resulting in the mpath device
becoming a holder of the underlying devices. e.g.:
/sys/block/sda/holders/dm-4

But at no point does DM-mpath get a reference to the underlying
devices' request_queue that gets assigned to clone->q (in
drivers/md/dm-mpath.c:map_io).

It seems we should, though AFAIK it won't help with the issue you've
pointed out (because the hot-removed device's driver has already called
blk_cleanup_queue() and nuked the elevator).

So I'm not sure getting the request_queue reference actually fixes
anything (maybe my imagination is lacking?). But getting the
request_queue reference is "the right thing" if we're going to be
setting pointers to it.
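
To illustrate why a reference alone would not cover the crash, here is
a small userspace sketch. struct refq and the refq_*() helpers are
hypothetical and only loosely modelled on the idea behind
blk_get_queue()/blk_put_queue(); the assumption that cleanup frees the
elevator and drops the owner's reference is a simplification, not a
description of the actual kernel code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted queue; NOT the kernel's request_queue. */
struct refq {
	int refcount;    /* owner's reference plus any holders */
	bool dead;       /* set by cleanup */
	int *elevator;   /* freed by cleanup, regardless of refcount */
};

static struct refq *refq_alloc(void)
{
	struct refq *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->elevator = calloc(1, sizeof(*q->elevator));
	if (!q->elevator) {
		free(q);
		return NULL;
	}
	q->refcount = 1;         /* the owning driver's reference */
	return q;
}

/* Take a reference; refuse if the queue has already been cleaned up. */
static bool refq_get(struct refq *q)
{
	if (q->dead)
		return false;
	q->refcount++;
	return true;
}

/* Drop a reference; returns true if this freed the queue. */
static bool refq_put(struct refq *q)
{
	if (--q->refcount > 0)
		return false;
	free(q->elevator);       /* free(NULL) is a no-op after cleanup */
	free(q);
	return true;
}

/*
 * Models hot-removal: the elevator goes away immediately, even while
 * other holders still have references to the queue itself.
 */
static void refq_cleanup(struct refq *q)
{
	q->dead = true;
	free(q->elevator);
	q->elevator = NULL;
	refq_put(q);             /* drop the owner's reference */
}
```

In this model, a holder that called refq_get() before refq_cleanup()
still has a valid struct refq afterwards, so its pointer does not
dangle, but the elevator is already gone. The dead-queue check is
therefore still needed on the insert path; the reference only
guarantees the pointer itself stays valid.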