Re: [Kernel Bug] general protection fault in btrfs_lookup_csum

From: Qu Wenruo
Date: Fri Jun 06 2025 - 18:52:38 EST




On 2025/6/7 03:24, Zhiyu Zhang wrote:
Dear Developers and Maintainers,

We would like to report a Linux kernel bug titled "general protection
fault in btrfs_lookup_csum" on Linux-6.12.28. We also reproduced it with
the PoC on the latest 6.15 kernel. Here are the relevant attachments:

kernel config: https://drive.google.com/file/d/15zwNg6D0mF6eeFOw5zz4QkkH1bcK8xCl/view?usp=sharing
report: https://drive.google.com/file/d/1BPmRKH5Not1_y5briNsAcaYi0hXTe5um/view?usp=sharing
syz reproducer:
https://drive.google.com/file/d/1xvAUqtN1mu-49xfCObEYRn1eFc2Tmk8F/view?usp=sharing
C reproducer: https://drive.google.com/file/d/1cdDqjEqpqhoenhWzxF_GNc06kRkrrjxa/view?usp=sharing

It doesn't feel safe to just access files from some unknown source.

Can you let syzbot reproduce it and forward the report?


The crash happens on every read I/O against a broken btrfs image whose
checksum tree is missing/corrupted. Specifically,
fs/btrfs/file-item.c:search_csum_tree() calls "csum_root =
btrfs_csum_root(fs_info, disk_bytenr);", where csum_root can be NULL
under certain on-disk corruptions. Then btrfs_lookup_csum()
immediately dereferences root->fs_info, causing a general-protection
fault / KASAN report.


In most cases, if the csum tree root is corrupted, btrfs can only be mounted with rescue=ibadroots, and in that case btrfs should set FS_STATE_NO_DATA_CSUMS, so no one should trigger the csum tree search at all (btrfs_lookup_bio_sums() will exit early).
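
To make the expected behaviour concrete, here is a minimal userspace sketch of that gating. It is not kernel code: every model_* name and MODEL_STATE_NO_DATA_CSUMS is a hypothetical stand-in, and the -EIO returned for a refused mount is an arbitrary choice for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "no data csums" fs state bit. */
enum { MODEL_STATE_NO_DATA_CSUMS = 1 << 0 };

/* Hypothetical, heavily reduced stand-in for struct btrfs_fs_info. */
struct model_fs_info {
	unsigned long fs_state;
	void *csum_root;	/* NULL models a missing/corrupted csum tree */
};

/*
 * Models mount: a bad csum root is only tolerated when bad roots are
 * ignored (rescue=ibadroots), and in that case the state bit is set.
 */
static int model_mount(struct model_fs_info *fs, void *csum_root,
		       bool ignore_bad_roots)
{
	fs->fs_state = 0;
	fs->csum_root = csum_root;

	if (!csum_root) {
		if (!ignore_bad_roots)
			return -EIO;	/* assumed errno, illustration only */
		fs->fs_state |= MODEL_STATE_NO_DATA_CSUMS;
	}
	return 0;
}

/* Models the read path: exit early instead of searching the csum tree. */
static bool model_read_searches_csum_tree(const struct model_fs_info *fs)
{
	if (fs->fs_state & MODEL_STATE_NO_DATA_CSUMS)
		return false;
	return true;
}
```

In this model a read only reaches the csum tree search when the mount saw a valid csum root, which is the invariant the reported trace appears to break.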

The only known exception is scrub, which was already fixed by f95d186255b3 ("btrfs: avoid NULL pointer dereference if no valid csum tree").


The call trace just looks like a regular page read, and FS_STATE_NO_DATA_CSUMS was not set, which isn't correct.

I'd prefer to dig deeper to find out why.

Thanks,
Qu

--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -201,6 +201,8 @@ btrfs_lookup_csum(struct btrfs_trans_handle *trans,
 		  struct btrfs_path *path,
 		  u64 bytenr, int cow)
 {
+	if (unlikely(!root))
+		return ERR_PTR(-EINVAL); /* or -ENOENT, see below */
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	int ret;

With this draft patch the PoC no longer panics the kernel.
search_csum_tree() converts -ENOENT (and -EFBIG) to 0, treating the
range as “no checksum” and continuing safely. If we instead return
-EINVAL, the error propagates upward and aborts the read outright. I
am unsure which behaviour is preferred: (1) -ENOENT: continues
silently, consistent with the existing not-found handling, and avoids
spurious I/O errors; (2) -EINVAL: treats the situation as fatal
corruption.
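
The difference between the two options can be sketched in a small userspace model. This is an illustration under assumptions, not the kernel implementation: the ERR_PTR helpers are simplified re-creations, every model_* name is hypothetical, and returning 1 for "checksum found" is a simplification.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace re-creations of the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-ins for struct btrfs_root / struct btrfs_csum_item. */
struct model_root { int unused; };
struct model_csum_item { int unused; };

/* Models the draft patch: a NULL root yields ERR_PTR(-null_root_errno). */
static struct model_csum_item *
model_lookup_csum(struct model_root *root, int null_root_errno)
{
	static struct model_csum_item item;

	if (!root)
		return ERR_PTR(-null_root_errno);
	return &item;
}

/*
 * Models the caller as described above: -ENOENT and -EFBIG are turned
 * into 0 ("no checksum"), any other error propagates to the reader.
 */
static int model_search_csum_tree(struct model_root *root, int null_root_errno)
{
	struct model_csum_item *item;
	int ret;

	item = model_lookup_csum(root, null_root_errno);
	if (IS_ERR(item)) {
		ret = (int)PTR_ERR(item);
		if (ret == -ENOENT || ret == -EFBIG)
			ret = 0;
		return ret;
	}
	return 1;	/* simplified: one checksum found */
}
```

In this model, model_search_csum_tree(NULL, ENOENT) returns 0 and the read would continue without verification, while model_search_csum_tree(NULL, EINVAL) returns -EINVAL and the read would fail.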

Advice on the expected semantics would be appreciated before I submit
a formal patch.

If the issue receives a CVE, we would be grateful to be listed as reporters:
Reported-by: Zhiyu Zhang <zhiyuzhang999@xxxxxxxxx>
Reported-by: Longxing Li <coregee2000@xxxxxxxxx>

Please let us know if a different fix or additional diagnostics are
preferred. We will be happy to respin the patch accordingly.

Thank you for your time!

Best regards,
Zhiyu Zhang