Re: [PATCH 2/2] ext4: correct the judgment of BUG in ext4_mb_normalize_request

From: Jan Kara
Date: Tue May 24 2022 - 05:30:40 EST


On Mon 23-05-22 21:04:16, libaokun (A) wrote:
> 在 2022/5/23 17:40, Jan Kara 写道:
> > On Sat 21-05-22 21:42:17, Baokun Li wrote:
> > > When either of the "start + size <= ac->ac_o_ex.fe_logical" or
> > > "start > ac->ac_o_ex.fe_logical" conditions is met, it indicates
> > > that the fe_logical is not in the allocated range.
> > > In this case, it should be bug_ON.
> > >
> > > Fixes: dfe076c106f6 ("ext4: get rid of code duplication")
> > > Signed-off-by: Baokun Li<libaokun1@xxxxxxxxxx>
> > I think this is actually wrong. The original condition checks whether
> > start + size does not overflow the used integer type. Your condition is
> > much stronger and I don't think it always has to be true. E.g. allocation
> > goal block (start variable) can be pushed to larger values by existing
> > preallocation or so.
> >
> > Honza
> >
> I think there are two reasons for this:
>
> First of all, the code here is as follows.
> ```
>         size = end - start;
>         [...]
> if (start + size <= ac->ac_o_ex.fe_logical &&
>                         start > ac->ac_o_ex.fe_logical) {
>                 ext4_msg(ac->ac_sb, KERN_ERR,
>                          "start %lu, size %lu, fe_logical %lu",
>                          (unsigned long) start, (unsigned long) size,
>                          (unsigned long) ac->ac_o_ex.fe_logical);
> BUG();
> }
>         BUG_ON(size <= 0 || size > EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
> ```
> First of all, there is no need to compare with ac_o_ex.fe_logical if it is
> to determine whether there is an overflow.
> Because the previous logic guarantees start < = ac_o_ex.fe_logical, and

How does it guarantee that? The logic:

if (ar->pleft && start <= ar->lleft) {
size -= ar->lleft + 1 - start;
start = ar->lleft + 1;
}

can move 'start' to further blocks...

> limits the scope of size in
> "BUG_ON (size < = 0 | | size > EXT4_BLOCKS_PER_GROUP (ac- > ac_sb))"
> immediately following.

OK, but what guarantees that ac_o_ex.fe_logical < UINT_MAX - size?

> Secondly, the following code flow also reflects this logic.
>
>            ext4_mb_normalize_request
>             >>> start + size <= ac->ac_o_ex.fe_logical
>            ext4_mb_regular_allocator
>             ext4_mb_simple_scan_group
>              ext4_mb_use_best_found
>               ext4_mb_new_preallocation
>                ext4_mb_new_inode_pa
>                 ext4_mb_use_inode_pa
>                  >>> set ac->ac_b_ex.fe_len <= 0
>            ext4_mb_mark_diskspace_used
>             >>> BUG_ON(ac->ac_b_ex.fe_len <= 0);
>
> In ext4_mb_use_inode_pa, you have the following code.
> ```
> start = pa->pa_pstart + (ac->ac_o_ex.fe_logical - pa->pa_lstart);
> end = min(pa->pa_pstart + EXT4_C2B(sbi, pa->pa_len), start + EXT4_C2B(sbi,
> ac->ac_o_ex.fe_len));
> len = EXT4_NUM_B2C(sbi, end - start);
> ac->ac_b_ex.fe_len = len;
> ```
> The starting position in ext4_mb_mark_diskspace_used will be assert.
> BUG_ON(ac->ac_b_ex.fe_len <= 0);
>
> When end == start + EXT4_C2B(sbi, ac->ac_o_ex.fe_len) is used, the value of
> end - start must be greater than 0.
> However, when end == pa->pa_pstart + EXT4_C2B(sbi, pa->pa_len) occurs, this
> bug_ON may be triggered.
> When this bug_ON is triggered, that is,
>
> ac->ac_b_ex.fe_len <= 0
> end - start <= 0
> end <= start
> pa->pa_pstart + EXT4_C2B(sbi, pa->pa_len) <= pa->pa_pstart +
> (ac->ac_o_ex.fe_logical - pa->pa_lstart)
> pa->pa_len <= ac->ac_o_ex.fe_logical - pa->pa_lstart
> pa->pa_lstart + pa->pa_len <= ac->ac_o_ex.fe_logical
> start + size <= ac->ac_o_ex.fe_logical
>
> So I think that "&&" here should be changed to "||".

Sorry, I still disagree. After some more code reading I agree that
ac->ac_o_ex.fe_logical is the logical block where we want allocated blocks
to be placed in the inode so logical extent of allocated blocks should include
ac->ac_o_ex.fe_logical. But I would be reluctant to make assertion you
suggest unless we make sure ac->ac_o_ex.fe_logical in unallocated (in which
case we can also remove some other code from ext4_mb_normalize_request()).

Honza

--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR