Re: [Drbd-dev] FLUSH/FUA documentation & code discrepancy

From: Tejun Heo
Date: Mon Sep 10 2012 - 18:54:40 EST

Hello, Lars.

On Fri, Sep 07, 2012 at 10:42:21AM +0200, Lars Ellenberg wrote:
> We have a kernel thread that is receiving data blocks,
> and some "boundary" information (in the sense that between such
> boundaries, we have a reorder domain, where requests may reorder freely,
> but no requests may be reordered across such boundaries).

What purpose does this boundary serve? Why is it necessary? Which
driver is this?

> This same thread submits the assembled bios.
> With the old, stronger, BIO_RW_BARRIER implementation,
> if it was supported, we could just submit the first bio of a reorder
> domain (plus some special cases) with that flag,
> and could keep receiving -> assembling -> submitting.

Yes, but the actual request processing would continue to stall as
the block layer would have been draining requests continuously.

> Now, we assumed that with FLUSH/FUA, we can do the same.
> And we could, as long as it is supported through the whole stack.
> But if it is not supported at some level in the stack, we must first drain.
> And since it is all "transparent", we just cannot determine
> if the whole stack does or does not support it.
> So we have to drain always.

The driver was hitching on BARRIER for draining. As that's gone now,
if you want the same behavior, the driver would need to drain itself.
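
As a rough illustration of what "draining itself" could mean, here is a
toy user-space model in Python. It is only a sketch under assumptions,
not DRBD or block layer code: DrainingSubmitter, BOUNDARY, queue(),
boundary() and complete() are hypothetical names, and the submit
callable merely stands in for handing a bio to the lower-level device.

```python
from collections import deque

# Hypothetical marker for a reorder boundary in the hold list.
BOUNDARY = object()


class DrainingSubmitter:
    """Toy model: requests behind a boundary wait until earlier ones complete."""

    def __init__(self, submit):
        self._submit = submit   # stand-in for submitting to the lower device
        self._in_flight = 0     # submitted but not yet completed
        self._held = deque()    # boundaries and requests waiting for a drain

    def queue(self, req):
        """Submit right away unless an undrained boundary is pending."""
        if self._held:
            self._held.append(req)
        else:
            self._in_flight += 1
            self._submit(req)

    def boundary(self):
        """Mark a reorder boundary; a no-op if nothing is outstanding."""
        if self._in_flight or self._held:
            self._held.append(BOUNDARY)

    def complete(self, req):
        """Completion callback; releases the next reorder domain once drained."""
        self._in_flight -= 1
        if self._in_flight == 0:
            self._release()

    def _release(self):
        # Everything before the boundary has completed, so drop the boundary
        # marker(s) at the head and submit up to the next boundary.
        while self._held and self._held[0] is BOUNDARY:
            self._held.popleft()
        while self._held and self._held[0] is not BOUNDARY:
            self._in_flight += 1
            self._submit(self._held.popleft())


# Usage: "c" is held back until both "a" and "b" have completed.
log = []
sub = DrainingSubmitter(log.append)
sub.queue("a")
sub.queue("b")
sub.boundary()
sub.queue("c")
sub.complete("a")
sub.complete("b")
```

In this model the caller can keep queueing while earlier requests are
still in flight; only the hand-off to the lower device waits for the
drain.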

> We did not realize that.
> In certain cases, where we submitted in the right order, and even
> indicated what we thought would amount to at least a "soft barrier"
> (reorder boundary) for the elevator, we ended up with data corruption
> because the elevator never sees these indicators, and reorders.
> Fine, our mistake/misunderstanding of the drain requirement.
> That's fixed now, we do always drain
> (unless specifically configured not to, where the admin takes the blame
> if that does not work on his stack).
> To always drain is also a performance hit, as we would rather keep
> receiving data and assembling bios and submitting them.

Is the performance hit measurable? Block BARRIER support had some
optimizations but it still had to constantly drain all the same.

> We can possibly work around that by introducing an additional submitter thread,
> or at least our own list where we queue assembled bios until the lower
> level device queue drains.
> But we'd rather have the elevator see the FLUSH/FUA,
> and treat them as at least a soft barrier/reorder boundary.
> I may be wrong here, but all the necessary bits for this seem to be in
> place already, if the information would even reach the elevator in one
> way or other, and not be completely stripped away early.
> What would you rather see, the elevator recognizing reorder boundaries?
> Or additional higher level queueing and extra thread/work queue/whatever?
> Both are fine with me, I'm just asking for an opinion.

First of all, using FLUSH/FUA for such a purpose is an error-prone
abuse. You're trying to exploit an implementation detail which may
change at any time. I think what you want is to be able to specify
REQ_SOFTBARRIER on bio submission, which shouldn't be too hard but I'm
still lost why this is necessary. Can you please explain it a bit

