Re: [PATCH] block: flush writeback dwork before detaching a bdev inode from it

From: Dmitry Vyukov
Date: Tue Jun 21 2016 - 09:05:34 EST


On Mon, Jun 20, 2016 at 7:40 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hello,
>
> On Mon, Jun 20, 2016 at 03:38:41PM +0200, Dmitry Vyukov wrote:
>> > Sorry for the late reply but now when thinking about the patch I don't
>> > think it is quite right. Writeback can happen from other contexts than just
>> > the worker one (e.g. kswapd can do writeback, or in some out-of-memory
>> > situations we may punt to doing writeback directly instead of calling the
>> > worker, or sync(2) calls fdatawrite() for block device inode directly when
>> > iterating through blockdev superblock). So flushing the workqueue IMHO is
>> > not covering 100% of cases.
>
> Hmmm, yeah, the patch undoes what the cgroup writeback changes added
> but it looks like the addition was exposing the existing problem more
> rather than causing a new one.
>
>> > I wanted to suggest to use inode_wait_for_writeback() which will make sure
>> > writeback is done with the inode. However we effectively already do this
>> > by calling bdev_write_inode() which calls writeback_single_inode() in
>> > WB_SYNC_ALL mode. So by the time that call completes we are sure writeback
>> > code is not looking at the inode. *However* what I think is happening is
>> > that sync_blockdev() writes all the dirty pages, bdev_write_inode() writes
>> > the inode and clears all dirty bits, however the inode still stays in the
>> > b_dirty / b_io list of the wb because it has I_DIRTY_TIME set. Subsequently
>> > once flusher work runs, it finds the inode, looks at it and boom. And the
>> > problem seems to be that write_inode_now(inode, true) does not result in
>> > I_DIRTY_TIME being cleared.
>> >
>> > Attached patch should fix this issue - it is compile-tested only. Dmitry,
>> > can you check whether this patch fixes the issue for you as well?
>>
>> I can't directly test it because crash happened very infrequently.
>> If Tejun/Ted agree that it is the right way to fix it, then I can
>> patch it, restart the fuzzer and leave it running for a while.
>
> Yes, please try out the patch.

Done
At least it should catch any new introduced bugs.