Re: [PATCH AUTOSEL 5.4 299/350] btrfs: don't prematurely free work in end_workqueue_fn()

From: David Sterba
Date: Thu Dec 12 2019 - 07:11:06 EST


On Tue, Dec 10, 2019 at 04:06:44PM -0500, Sasha Levin wrote:
> From: Omar Sandoval <osandov@xxxxxx>
>
> [ Upstream commit 9be490f1e15c34193b1aae17da58e14dd9f55a95 ]
>
> Currently, end_workqueue_fn() frees the end_io_wq entry (which embeds
> the work item) and then calls bio_endio(). This is another potential
> instance of the bug in "btrfs: don't prematurely free work in
> run_ordered_work()".
>
> In particular, the endio call may depend on other work items. For
> example, btrfs_end_dio_bio() can call btrfs_subio_endio_read() ->
> __btrfs_correct_data_nocsum() -> dio_read_error() ->
> submit_dio_repair_bio(), which submits a bio that is also completed
> through a end_workqueue_fn() work item. However,
> __btrfs_correct_data_nocsum() waits for the newly submitted bio to
> complete, thus it depends on another work item.
>
> This example currently usually works because we use different workqueue
> helper functions for BTRFS_WQ_ENDIO_DATA and BTRFS_WQ_ENDIO_DIO_REPAIR.
> However, it may deadlock with stacked filesystems and is fragile
> overall. The proper fix is to free the work item at the very end of the
> work function, so let's do that.
>
> Reviewed-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
> Signed-off-by: Omar Sandoval <osandov@xxxxxx>
> Reviewed-by: David Sterba <dsterba@xxxxxxxx>
> Signed-off-by: David Sterba <dsterba@xxxxxxxx>
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

The were more patches in the series, all contain "don't prematurely free
work in" and were part of a rework of async work processing. They're
fixing a very uncommon usecase, so if there's desire to backport them
the whole series needs to go in.

In the autosel list, there are only 2 and without the important fix.

c495dcd6fbe1 btrfs: don't prematurely free work in run_ordered_work()
9be490f1e15c btrfs: don't prematurely free work in end_workqueue_fn()
e732fe95e4ca btrfs: don't prematurely free work in reada_start_machine_worker()
57d4f0b86327 btrfs: don't prematurely free work in scrub_missing_raid56_worker()

a0cac0ec961f btrfs: get rid of unique workqueue helper functions
- this is only a cleanup that removes code obsoleted by the fixes above,
probably out of scope of stable

I have intentionally not tagged the patches for stable, the usecase is
is specific to one user (FB), the known reproducer is only their
workload and the fixes are in their kernel already.

So if there's desire to add the patches to stable trees, then it has to
be the whole series, but I don't see a strong reason for it.