Re: [PATCH v2] block: fix trace completion for chained bio

From: Edward Hsieh
Date: Fri Apr 23 2021 - 04:04:58 EST


On 3/23/2021 5:22 AM, NeilBrown wrote:
On Wed, Mar 03 2021, edwardh wrote:

From: Edward Hsieh <edwardh@xxxxxxxxxxxx>

For a chained bio, trace_block_bio_complete in bio_endio is currently called
only once, by the parent bio, after all chained bios have completed.
However, the sector and size for the parent bio are modified in bio_split.
Therefore, the size and sector of the complete events might not match the
queue events in blktrace.

The original fix of bio completion trace <fbbaf700e7b1> ("block: trace
completion of all bios.") wants multiple complete events to correspond
to one queue event but missed this.

An md/raid5 read with a bio that crosses chunks can reproduce this issue.

To fix, move the trace completion into the loop so that every chained bio calls it.

Thanks. I think this is correct as far as tracing goes.
However the code still looks a bit odd.

The comment for the handling of bio_chain_endio suggests that the *only*
purpose for that is to avoid deep recursion. That suggests it should be
at the end of the function.
As it is, blk_throtl_bio_endio() and bio_uninit() are only called on the
last bio in a chain.
That seems wrong.

I'd be more comfortable if the patch moved the bio_chain_endio()
handling to the end, after all of that.
So the function would end.

	if (bio->bi_end_io == bio_chain_endio) {
		bio = __bio_chain_endio(bio);
		goto again;
	} else if (bio->bi_end_io)
		bio->bi_end_io(bio);

Jens: can you see any reason why those functions must only be called on
the last bio in the chain?

Thanks,
NeilBrown


Hi Neil and Jens,

From the commit message, bio_uninit is placed here for bios allocated in special ways (e.g., on the stack) that will not be released by bio_free. For a chained bio, __bio_chain_endio invokes bio_put and releases the resources, so it seems that we don't need to call bio_uninit for chained bios.

The blk_throtl_bio_endio is used to update the latency for the throttle group. I think the latency should only be updated after the whole bio is finished?

To stay consistent with the "tail call optimization" described in the comment, I'd suggest wrapping the remaining statements in an else branch. What do you think?

	if (bio->bi_end_io == bio_chain_endio) {
		bio = __bio_chain_endio(bio);
		goto again;
	} else {
		blk_throtl_bio_endio(bio);
		/* release cgroup info */
		bio_uninit(bio);
		if (bio->bi_end_io)
			bio->bi_end_io(bio);
	}
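
To make the intended behaviour easier to check, here is a rough user-space model of that control flow. It is only a sketch, not kernel code: struct model_bio, its counters, and the model_* helpers are hypothetical stand-ins for struct bio, bio_remaining_done() and __bio_chain_endio(), and it assumes the trace call sits before the chain check, as in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; NOT the kernel definition. */
struct model_bio {
	const char *name;
	struct model_bio *parent;	/* models bi_private of a chained bio */
	int remaining;			/* models __bi_remaining */
	int chained;			/* models bi_end_io == bio_chain_endio */
	int traced;			/* times the completion trace fired */
	int throttled;			/* times blk_throtl_bio_endio would run */
	int uninited;			/* times bio_uninit would run */
	int ended;			/* times a real bi_end_io would run */
};

/* Models bio_remaining_done(): true once the last reference is dropped. */
static int model_remaining_done(struct model_bio *bio)
{
	return --bio->remaining == 0;
}

/* Models __bio_chain_endio(): the child is done, continue with its parent. */
static struct model_bio *model_chain_endio(struct model_bio *bio)
{
	return bio->parent;
}

/* Models bio_endio() with the trace inside the loop and the else wrapper. */
static void model_bio_endio(struct model_bio *bio)
{
again:
	assert(bio != NULL);
	if (!model_remaining_done(bio))
		return;

	/* Assumed placement: trace every bio of the chain, not just the last. */
	bio->traced++;

	if (bio->chained) {
		bio = model_chain_endio(bio);
		goto again;
	} else {
		/* Only reached for the last (non-chained) bio. */
		bio->throttled++;
		bio->uninited++;
		bio->ended++;
	}
}
```

With a chained child (remaining = 1, chained = 1) pointing at a parent (remaining = 2, chained = 0), completing the parent first and then the child leaves traced == 1 on both bios, while throttled, uninited and ended are 1 only on the parent.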

Thanks,
Edward Hsieh