null-ptr-deref due to "ext4: fix potential race between online resizing and write operations"

From: Qian Cai
Date: Fri Feb 21 2020 - 09:02:22 EST


Reverting the linux-next commit c20bac9bf82c ("ext4: fix potential race between
s_flex_groups online resizing and access") fixes the crash below (the trace is
annotated with source line numbers).

struct flex_groups *flex_group = sbi_array_rcu_deref(EXT4_SB(sb),
                                                 s_flex_groups, g);
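
One possible reading of the register dump in the trace below, offered as an
assumption rather than a confirmed diagnosis: the faulting instruction
(7f59482a) appears to decode as an indexed load, ldx r26,r25,r9, with
GPR25 = 0, GPR09 = 0x70 and GPR24 = 0xe. That would match loading slot 14 of
an array of 8-byte pointers whose base pointer is NULL, which agrees with
DAR = 0x70. The sketch below only reproduces that arithmetic; slot_address()
and PPC64_POINTER_SIZE are hypothetical names, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: pointers are 8 bytes wide on this ppc64le machine. */
#define PPC64_POINTER_SIZE 8u

/*
 * Hypothetical helper: address of slot `index` in an array of pointers
 * that starts at `base`, i.e. base + index * sizeof(pointer).
 */
static uint64_t slot_address(uint64_t base, uint64_t index)
{
	return base + index * PPC64_POINTER_SIZE;
}

/* Check the arithmetic against the values taken from the oops. */
static void check_against_oops(void)
{
	/* GPR25 (base) = 0, GPR24 (index) = 0xe, DAR = 0x70 */
	assert(slot_address(0x0, 0xe) == 0x70);
}
```

If this reading is right, the array pointer itself was NULL at the time of
the load, rather than one of the array's entries.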

[  575.924527][T13183] LTP: starting fanotify13
[  576.010554][T31835] /dev/zero: Can't open blockdev
[  576.867392][T31835] EXT4-fs (loop0): mounting ext3 file system using the ext4
subsystem
[  576.919604][T31835] EXT4-fs (loop0): mounted filesystem with ordered data
mode. Opts: (null)
[  576.920112][T31835] ext3 filesystem being mounted at /tmp/ltp-
ZMONVGlgwi/o0A0RE/mntpoint supports timestamps until 2038 (0x7fffffff)
[  576.948501][T31854] BUG: Kernel NULL pointer dereference on read at
0x00000070
[  576.948550][T31854] Faulting instruction address: 0xc008000010501bfc
[  576.948573][T31854] Oops: Kernel access of bad area, sig: 11 [#1]
[  576.948575][    C2] irq event stamp: 107073312
[  576.948583][    C2] hardirqs last  enabled at (107073312):
[<c00000000099a174>] _raw_spin_unlock_irqrestore+0x94/0xd0
[  576.948595][T31854] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256
DEBUG_PAGEALLOC NUMA PowerNV
[  576.948598][T31854] Modules linked in: brd ext4 crc16 mbcache jbd2 loop
ip_tables x_tables xfs sd_mod bnx2x ahci libahci mdio libata tg3 libphy
firmware_class dm_mirror dm_region_hash dm_log dm_mod
[  576.948614][    C2] hardirqs last disabled at (107073311):
[<c000000000999e0c>] _raw_spin_lock_irqsave+0x3c/0xa0
[  576.948646][T31854] CPU: 52 PID: 31854 Comm: fanotify13 Not tainted 5.6.0-
rc2-next-20200221 #7
[  576.948689][    C2] softirqs last  enabled at (107073296):
[<c000000000113b3c>] irq_enter+0x8c/0xc0
[  576.948693][    C2] softirqs last disabled at (107073297):
[<c000000000113cdc>] irq_exit+0x16c/0x1d0
[  576.948754][T31854] NIP:  c008000010501bfc LR: c008000010501d94 CTR:
c0000000001f1e30
[  576.948758][T31854] REGS: c00000129f56f700 TRAP: 0300   Not tainted  (5.6.0-
rc2-next-20200221)
[  576.948945][T31854] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:
24004224  XER: 20040000
[  576.948982][T31854] CFAR: c008000010501d9c DAR: 0000000000000070 DSISR:
40000000 IRQMASK: 0
[  576.948982][T31854] GPR00: c008000010501d94 c00000129f56f990 c0080000105c1600
0000000000000001
[  576.948982][T31854] GPR04: c000000001510808 0000000000000008 0000000005cf0ca2
fffffffe5ca98558
[  576.948982][T31854] GPR08: 0000000000000001 0000000000000070 0000000000000000
c00800001057b690
[  576.948982][T31854] GPR12: c0000000001f1e30 c000001ffffd5600 000000000000000e
00000000000007ff
[  576.948982][T31854] GPR16: c00000129f56fa20 000000000000fff5 0000000000000001
0000000000001dbc
[  576.948982][T31854] GPR20: 0000000000000000 000000000000002e 0000000000000800
0000000000000020
[  576.948982][T31854] GPR24: 000000000000000e 0000000000000000 0000000000000000
c000000001510808
[  576.948982][T31854] GPR28: c000001206b8d000 c0080000105d8227 c00000129f56fa20
0000000000000001
[  576.949200][T31854] NIP [c008000010501bfc] get_orlov_stats+0x114/0x390 [ext4]
get_orlov_stats at fs/ext4/ialloc.c:373 (discriminator 11)
[  576.949232][T31854] LR [c008000010501d94] get_orlov_stats+0x2ac/0x390 [ext4]
[  576.949243][T31854] Call Trace:
[  576.949260][T31854] [c00000129f56f990] [c008000010501d94]
get_orlov_stats+0x2ac/0x390 [ext4] (unreliable)
get_orlov_stats at fs/ext4/ialloc.c:373 (discriminator 11)
[  576.949301][T31854] [c00000129f56f9f0] [c00800001050231c]
find_group_orlov+0x4a4/0x6b0 [ext4]
find_group_orlov at fs/ext4/ialloc.c:467
[  576.949334][T31854] [c00000129f56fae0] [c0080000105055c8]
__ext4_new_inode+0x1450/0x23c0 [ext4]
[  576.949367][T31854] [c00000129f56fc50] [c008000010547f2c]
ext4_mkdir+0x104/0x590 [ext4]
[  576.949399][T31854] [c00000129f56fd60] [c0000000004cbc64]
vfs_mkdir+0x114/0x210
[  576.949432][T31854] [c00000129f56fda0] [c0000000004d1a70]
do_mkdirat+0xb0/0x1a0
[  576.949454][T31854] [c00000129f56fe20] [c00000000000b378]
system_call+0x5c/0x68
[  576.949465][T31854] Instruction dump:
[  576.949473][T31854] 3c620000 e8638730 7f44d378 38630068 48078ccd e8410018
60000000 60000000
[  576.949497][T31854] 60000000 73490001 4182019c 7b091f24 <7f59482a> 4807a0d1
e8410018 2fa30000
[  576.949522][T31854] ---[ end trace de4acb29e0d7791c ]---
[  577.200573][T31854]
[  578.200652][T31854] Kernel panic - not syncing: Fatal exception
[  579