[PATCH 5.15 117/179] erofs: fix deadlock when shrink erofs slab

From: Greg Kroah-Hartman
Date: Mon Nov 29 2021 - 13:43:04 EST


From: Huang Jianan <huangjianan@xxxxxxxx>

[ Upstream commit 57bbeacdbee72a54eb97d56b876cf9c94059fc34 ]

We observed the following deadlock in the stress test under low
memory scenario:

Thread A Thread B
- erofs_shrink_scan
- erofs_try_to_release_workgroup
- erofs_workgroup_try_to_freeze -- A
- z_erofs_do_read_page
- z_erofs_collection_begin
- z_erofs_register_collection
- erofs_insert_workgroup
- xa_lock(&sbi->managed_pslots) -- B
- erofs_workgroup_get
- erofs_wait_on_workgroup_freezed -- A
- xa_erase
- xa_lock(&sbi->managed_pslots) -- B

To fix this, it needs to hold xa_lock before freezing the workgroup
since xarray will be touched then. So let's hold the lock before
accessing each workgroup, just like what we did with the radix tree
before.

[ Gao Xiang: Jianhua Hao also reports this issue at
https://lore.kernel.org/r/b10b85df30694bac8aadfe43537c897a@xxxxxxxxxx ]

Link: https://lore.kernel.org/r/20211118135844.3559-1-huangjianan@xxxxxxxx
Fixes: 64094a04414f ("erofs: convert workstn to XArray")
Reviewed-by: Chao Yu <chao@xxxxxxxxxx>
Reviewed-by: Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx>
Signed-off-by: Huang Jianan <huangjianan@xxxxxxxx>
Reported-by: Jianhua Hao <haojianhua1@xxxxxxxxxx>
Signed-off-by: Gao Xiang <xiang@xxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
fs/erofs/utils.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index bd86067a63f7f..3ca703cd5b24a 100644
--- a/fs/erofs/utils.c
+++ b/fs/erofs/utils.c
@@ -141,7 +141,7 @@ static bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
* however in order to avoid some race conditions, add a
* DBG_BUGON to observe this in advance.
*/
- DBG_BUGON(xa_erase(&sbi->managed_pslots, grp->index) != grp);
+ DBG_BUGON(__xa_erase(&sbi->managed_pslots, grp->index) != grp);

/* last refcount should be connected with its managed pslot. */
erofs_workgroup_unfreeze(grp, 0);
@@ -156,15 +156,19 @@ static unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
unsigned int freed = 0;
unsigned long index;

+ xa_lock(&sbi->managed_pslots);
xa_for_each(&sbi->managed_pslots, index, grp) {
/* try to shrink each valid workgroup */
if (!erofs_try_to_release_workgroup(sbi, grp))
continue;
+ xa_unlock(&sbi->managed_pslots);

++freed;
if (!--nr_shrink)
- break;
+ return freed;
+ xa_lock(&sbi->managed_pslots);
}
+ xa_unlock(&sbi->managed_pslots);
return freed;
}

--
2.33.0