Re: blk: improve order of bio handling in generic_make_request()

From: Mike Snitzer
Date: Tue Mar 07 2017 - 19:57:25 EST


On Tue, Mar 07 2017 at 3:29pm -0500,
NeilBrown <neilb@xxxxxxxx> wrote:

> On Tue, Mar 07 2017, Mike Snitzer wrote:
>
> > On Tue, Mar 07 2017 at 12:05pm -0500,
> > Jens Axboe <axboe@xxxxxxxxx> wrote:
> >
> >> On 03/07/2017 09:52 AM, Mike Snitzer wrote:
> >> >
> >> > In addition to Jack's MD raid test there is a DM snapshot deadlock test,
> >> > albeit unpolished/needy to get running, see:
> >> > https://www.redhat.com/archives/dm-devel/2017-January/msg00064.html
> >>
> >> Can you run this patch with that test, reverting your DM workaround?
> >
> > Yeap, will do. Last time Mikulas tried a similar patch it still
> > deadlocked. But I'll give it a go (likely tomorrow).
>
> I don't think this will fix the DM snapshot deadlock by itself.
> Rather, it makes it possible for some internal changes to DM to fix it.
> The DM change might be something vaguely like:
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 3086da5664f3..06ee0960e415 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1216,6 +1216,14 @@ static int __split_and_process_non_flush(struct clone_info *ci)
>
>  	len = min_t(sector_t, max_io_len(ci->sector, ti), ci->sector_count);
>  
> +	if (len < ci->sector_count) {
> +		struct bio *split = bio_split(bio, len, GFP_NOIO, fs_bio_set);
> +		bio_chain(split, bio);
> +		generic_make_request(bio);
> +		bio = split;
> +		ci->sector_count = len;
> +	}
> +
>  	r = __clone_and_map_data_bio(ci, ti, ci->sector, &len);
>  	if (r < 0)
>  		return r;
>
> Instead of looping inside DM, this change causes the remainder to be
> passed to generic_make_request() and DM only handles one region at a
> time. So there is only one loop, in the top generic_make_request().
> That loop will not reliable handle bios in the "right" order.

s/not reliable/now reliably/ ? ;)

But thanks for the suggestion Neil. Will dig in once I get through a
backlog of other DM target code I have queued for 4.12 review.

Mike
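
For readers following the thread, the "one loop" behaviour Neil describes
can be illustrated with a small user-space toy. This is only a sketch under
stated assumptions, not kernel code: every toy_* name and TOY_MAX_IO_LEN are
hypothetical, the queue is a plain FIFO, and the per-device ordering that the
patch under discussion is actually about is left out. It shows only that a
nested submission is queued rather than recursed into, so the driver handles
one region per call and the remainder comes back through the top-level loop.

```c
#include <assert.h>

#define TOY_MAX_IO_LEN 8	/* hypothetical per-target limit, in sectors */
#define TOY_QUEUE_CAP  16

struct toy_bio {
	unsigned int sector;	/* first sector of the request */
	unsigned int count;	/* number of sectors */
};

/* FIFO of pending bios, standing in for the list kept by the top-level loop. */
static struct toy_bio toy_queue[TOY_QUEUE_CAP];
static int toy_head, toy_tail;
static int toy_active;		/* nonzero while the top-level loop is running */

/* Regions the toy driver handled, in order, so the result can be inspected. */
static struct toy_bio toy_handled[TOY_QUEUE_CAP];
static int toy_nr_handled;

static void toy_make_request(struct toy_bio bio);

/* Toy stand-in for generic_make_request(): only the outermost call loops. */
static void toy_submit(struct toy_bio bio)
{
	assert(toy_tail < TOY_QUEUE_CAP);
	toy_queue[toy_tail++] = bio;
	if (toy_active)
		return;		/* nested call: queue only, no recursion */

	toy_active = 1;
	while (toy_head < toy_tail)
		toy_make_request(toy_queue[toy_head++]);
	toy_active = 0;
}

/* Toy driver: handle the first region, hand the remainder back to the loop. */
static void toy_make_request(struct toy_bio bio)
{
	unsigned int len = bio.count < TOY_MAX_IO_LEN ? bio.count : TOY_MAX_IO_LEN;

	if (len < bio.count) {
		struct toy_bio rest = { bio.sector + len, bio.count - len };

		toy_submit(rest);	/* queued for the top-level loop */
		bio.count = len;	/* this call handles the front part only */
	}
	assert(toy_nr_handled < TOY_QUEUE_CAP);
	toy_handled[toy_nr_handled++] = bio;
}
```

Submitting a single 20-sector toy_bio starting at sector 0 through
toy_submit() leaves three entries in toy_handled: sectors 0-7, 8-15 and
16-19, each produced by a separate pass of the one loop in toy_submit().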