Re: [PATCH v2 3/6] ext4: restart handle if credits are insufficient during allocating blocks

From: Zhang Yi
Date: Fri Jun 20 2025 - 01:00:51 EST


On 2025/6/20 0:33, Jan Kara wrote:
> On Wed 11-06-25 19:16:22, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>>
>> After large folios are supported on ext4, writing back a sufficiently
>> large and discontinuous folio may consume a significant number of
>> journal credits, placing considerable strain on the journal. For
>> example, in a 20GB filesystem with 1K block size and 1MB journal size,
>> writing back a 2MB folio could require thousands of credits in the
>> worst-case scenario (when each block is discontinuous and distributed
>> across different block groups), potentially exceeding the journal size.
>> This issue can also occur in ext4_write_begin() and ext4_page_mkwrite()
>> when delalloc is not enabled.
>>
>> Fix this by ensuring that there are sufficient journal credits before
>> allocating an extent in mpage_map_one_extent() and _ext4_get_block(). If
>> there are not enough credits, return -EAGAIN, exit the current mapping
>> loop, restart a new handle and a new transaction, and allocating blocks
>> on this folio again in the next iteration.
>>
>> Suggested-by: Jan Kara <jack@xxxxxxx>
>> Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx>
>
> ...
>
>> static int _ext4_get_block(struct inode *inode, sector_t iblock,
>> struct buffer_head *bh, int flags)
>> {
>> struct ext4_map_blocks map;
>> + handle_t *handle = ext4_journal_current_handle();
>> int ret = 0;
>>
>> if (ext4_has_inline_data(inode))
>> return -ERANGE;
>>
>> + /* Make sure transaction has enough credits for this extent */
>> + if (flags & EXT4_GET_BLOCKS_CREATE) {
>> + ret = ext4_journal_ensure_extent_credits(handle, inode);
>> + if (ret)
>> + return ret;
>> + }
>> +
>> map.m_lblk = iblock;
>> map.m_len = bh->b_size >> inode->i_blkbits;
>>
>> - ret = ext4_map_blocks(ext4_journal_current_handle(), inode, &map,
>> - flags);
>> + ret = ext4_map_blocks(handle, inode, &map, flags);
>
> Good spotting with ext4_page_mkwrite() and ext4_write_begin() also needing
> this treatment! But rather then hiding the transaction extension in
> _ext4_get_block() I'd do this in ext4_block_write_begin() where it is much
> more obvious (and also it is much more obvious who needs to be prepared for
> handling EAGAIN error). Otherwise the patch looks good!
>

Yes, I completely agree with you. However, unfortunately, do this in
ext4_block_write_begin() only works for ext4_write_begin().
ext4_page_mkwrite() does not call ext4_block_write_begin() to allocate
blocks, it call the vfs helper __block_write_begin_int() instead.

vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
{
...
if (!ext4_should_journal_data(inode)) {
err = block_page_mkwrite(vma, vmf, get_block);
...
}


So...

Thanks,
Yi.