Re: Question about LTS 4.19 patch "89047634f5ce NFS: Don't interrupt file writeout due to fatal errors"

From: Greg KH
Date: Mon Oct 30 2023 - 04:43:19 EST


On Mon, Oct 30, 2023 at 04:39:11PM +0800, ChenXiaoSong wrote:
> Hi Trond and Greg:
>
> LTS 4.19 reported a null-ptr-deref BUG as follows:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
> Call Trace:
>  nfs_inode_add_request+0x1cc/0x5b8
>  nfs_setup_write_request+0x1fa/0x1fc
>  nfs_writepage_setup+0x2d/0x7d
>  nfs_updatepage+0x8b8/0x936
>  nfs_write_end+0x61d/0xd45
>  generic_perform_write+0x19a/0x3f0
>  nfs_file_write+0x2cc/0x6e5
>  new_sync_write+0x442/0x560
>  __vfs_write+0xda/0xef
>  vfs_write+0x176/0x48b
>  ksys_write+0x10a/0x1e9
>  __se_sys_write+0x24/0x29
>  __x64_sys_write+0x79/0x93
>  do_syscall_64+0x16d/0x4bb
>  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
>
> The reason is that generic_error_remove_page() sets page->mapping to NULL
> when the NFS server reports a fatal error:
>
> nfs_updatepage
>   nfs_writepage_setup
>     nfs_setup_write_request
>       nfs_try_to_update_request // return NULL
>         nfs_wb_page // return 0
>           nfs_writepage_locked // return 0
>             nfs_do_writepage // return 0
>               nfs_page_async_flush // return 0
>                 nfs_error_is_fatal_on_server
>                 generic_error_remove_page
>                   truncate_inode_page
>                     delete_from_page_cache
>                       __delete_from_page_cache
>                         page_cache_tree_delete
>                           page->mapping = NULL // this is the key point
>       nfs_create_request
>         req->wb_page    = page // the page is freed
>       nfs_inode_add_request
>         mapping = page_file_mapping(req->wb_page)
>           return page->mapping
>         spin_lock(&mapping->private_lock) // mapping is NULL
>
> Is it reasonable to fix this bug by reverting the patch "89047634f5ce NFS:
> Don't interrupt file writeout due to fatal errors"?

Try it and see, but note that the commit came from the 4.19.99 release,
which was released years ago. Are you sure you are using the most recent
4.19.y release?

> This patch is one of the patches in the patchset [Fix up soft mounts for NFSv4.x](https://lore.kernel.org/all/20190407175912.23528-1-trond.myklebust@xxxxxxxxxxxxxxx/).
> The patchset replaces the custom error reporting mechanism. It seems that
> we should either merge the whole patchset into LTS 4.19 or merge none of
> its patches. Also, the "Fixes:" tag is not correct: this patch is a
> refactoring patch, not a bug fix.

If we missed some patches that should be added on top of the current
tree, please let us know their git commit ids after you have tested
that they work properly, and we will gladly apply them.

thanks,

greg k-h
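
For readers following the call chain quoted above, here is a minimal
user-space sketch of the failure mode. It is not kernel code and not the
upstream fix: the struct definitions are simplified stand-ins, and every
model_* name is hypothetical. It only illustrates that once the page has
been detached from its mapping, a later lookup of page->mapping returns
NULL and must be checked before the lock is taken.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel definitions. */
struct address_space {
    int private_lock;   /* models the spinlock as a plain flag */
    int nrequests;      /* models requests attached to the inode */
};

struct page {
    struct address_space *mapping;
};

/* Models the end of the generic_error_remove_page() chain:
 * the page is detached from the page cache. */
static void model_remove_page(struct page *page)
{
    assert(page != NULL);
    page->mapping = NULL;
}

/* Models page_file_mapping() for an ordinary (non-swap) page. */
static struct address_space *model_page_file_mapping(const struct page *page)
{
    return page->mapping;
}

/* Hypothetical guarded variant of the add-request step: return -EIO
 * instead of dereferencing a NULL mapping. */
static int model_inode_add_request(struct page *page)
{
    struct address_space *mapping = model_page_file_mapping(page);

    if (mapping == NULL)
        return -EIO;            /* page was removed underneath us */

    mapping->private_lock = 1;  /* "spin_lock" in this model */
    mapping->nrequests++;
    mapping->private_lock = 0;  /* "spin_unlock" in this model */
    return 0;
}
```

In this model, calling model_inode_add_request() after model_remove_page()
returns -EIO. Without the NULL check, the same sequence would dereference a
NULL pointer, which is the situation the trace describes.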