Re: [PATCH] ext4: add a configurable parameter to prevent endless loop in ext4_mb_discard_group_p

From: Andreas Dilger
Date: Thu Apr 08 2021 - 18:11:42 EST


On Apr 8, 2021, at 12:50 PM, Wen Yang <wenyang@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi Ritesh and Andreas,
>
> Thank you for your reply. Since there is still a faulty machine, we have analyzed it again and found it is indeed a very special case:
>
>
> crash> struct ext4_group_info ffff8813bb5f72d0
> struct ext4_group_info {
> bb_state = 0,
> bb_free_root = {
> rb_node = 0x0
> },
> bb_first_free = 1681,
> bb_free = 0,
> bb_fragments = 0,
> bb_largest_free_order = -1,
> bb_prealloc_list = {
> next = 0xffff880268291d78,
> prev = 0xffff880268291d78 ---> *** The list is empty
> },
> alloc_sem = {
> count = {
> counter = 0
> },
> wait_list = {
> next = 0xffff8813bb5f7308,
> prev = 0xffff8813bb5f7308
> },
> wait_lock = {
> raw_lock = {
> {
> val = {
> counter = 0
> },
> {
> locked = 0 '\000',
> pending = 0 '\000'
> },
> {
> locked_pending = 0,
> tail = 0
> }
> }
> }
> },
> osq = {
> tail = {
> counter = 0
> }
> },
> owner = 0x0
> },
> bb_counters = 0xffff8813bb5f7328
> }
> crash>
>
>
> crash> list 0xffff880268291d78 -l ext4_prealloc_space.pa_group_list -s ext4_prealloc_space.pa_count
> ffff880268291d78
> pa_count = {
> counter = 1 ---> ****pa->pa_count
> }
> ffff8813bb5f72f0
> pa_count = {
> counter = -30701
> }
>
>
> crash> struct -xo ext4_prealloc_space
> struct ext4_prealloc_space {
> [0x0] struct list_head pa_inode_list;
> [0x10] struct list_head pa_group_list;
> union {
> struct list_head pa_tmp_list;
> struct callback_head pa_rcu;
> [0x20] } u;
> [0x30] spinlock_t pa_lock;
> [0x34] atomic_t pa_count;
> [0x38] unsigned int pa_deleted;
> [0x40] ext4_fsblk_t pa_pstart;
> [0x48] ext4_lblk_t pa_lstart;
> [0x4c] ext4_grpblk_t pa_len;
> [0x50] ext4_grpblk_t pa_free;
> [0x54] unsigned short pa_type;
> [0x58] spinlock_t *pa_obj_lock;
> [0x60] struct inode *pa_inode;
> }
> SIZE: 0x68
>
>
> Before "goto repeat", it is necessary to check whether grp->bb_prealloc_list is empty. This patch may fix it.
> Please kindly give us your comments and suggestions.
> Thanks.

This patch definitely looks more reasonable than the previous one, but
I don't think it is correct?

It isn't clear how the system could get into this state, or stay there.

If bb_prealloc_list is empty, then "busy" should be 0, since busy = 1
is only set inside "list_for_each_entry_safe(bb_prealloc_list)", and
that implies the list is *not* empty? The ext4_lock_group() is held
over this code, so the list should not be changing in this case, and
even if the list changed, it _should_ only affect the loop one time.

> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 99bf091..8082e2d 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4290,7 +4290,7 @@ static void ext4_mb_new_preallocation(struct ext4_allocation_context *ac)
> free_total += free;
>
> /* if we still need more blocks and some PAs were used, try again */
> - if (free_total < needed && busy) {
> + if (free_total < needed && busy && !list_empty(&grp->bb_prealloc_list)) {
> ext4_unlock_group(sb, group);
> cond_resched();
> busy = 0;
> goto repeat;

Minor suggestion - moving "busy = 0" right after "repeat:" and removing
the "int busy = 0" initializer would make this code a bit more clear.

Cheers, Andreas



Attachment: signature.asc
Description: Message signed with OpenPGP