[PATCH AUTOSEL 5.10 23/39] md/raid1: stop mdx_raid1 thread when raid1 array run failed

From: Sasha Levin
Date: Sun Dec 18 2022 - 11:49:10 EST

Next message: Sasha Levin: "[PATCH AUTOSEL 5.10 22/39] drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()"
Previous message: Sasha Levin: "[PATCH AUTOSEL 5.10 20/39] drm/rockchip: Use drm_mode_copy()"
In reply to: Sasha Levin: "[PATCH AUTOSEL 5.10 20/39] drm/rockchip: Use drm_mode_copy()"
Next in thread: Sasha Levin: "[PATCH AUTOSEL 5.10 22/39] drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

From: Jiang Li <jiang.li@xxxxxxxxxx>

[ Upstream commit b611ad14006e5be2170d9e8e611bf49dff288911 ]

fail run raid1 array when we assemble array with the inactive disk only,
but the mdx_raid1 thread were not stop, Even if the associated resources
have been released. it will caused a NULL dereference when we do poweroff.

This causes the following Oops:
[ 287.587787] BUG: kernel NULL pointer dereference, address: 0000000000000070
[ 287.594762] #PF: supervisor read access in kernel mode
[ 287.599912] #PF: error_code(0x0000) - not-present page
[ 287.605061] PGD 0 P4D 0
[ 287.607612] Oops: 0000 [#1] SMP NOPTI
[ 287.611287] CPU: 3 PID: 5265 Comm: md0_raid1 Tainted: G U 5.10.146 #0
[ 287.619029] Hardware name: xxxxxxx/To be filled by O.E.M, BIOS 5.19 06/16/2022
[ 287.626775] RIP: 0010:md_check_recovery+0x57/0x500 [md_mod]
[ 287.632357] Code: fe 01 00 00 48 83 bb 10 03 00 00 00 74 08 48 89 ......
[ 287.651118] RSP: 0018:ffffc90000433d78 EFLAGS: 00010202
[ 287.656347] RAX: 0000000000000000 RBX: ffff888105986800 RCX: 0000000000000000
[ 287.663491] RDX: ffffc90000433bb0 RSI: 00000000ffffefff RDI: ffff888105986800
[ 287.670634] RBP: ffffc90000433da0 R08: 0000000000000000 R09: c0000000ffffefff
[ 287.677771] R10: 0000000000000001 R11: ffffc90000433ba8 R12: ffff888105986800
[ 287.684907] R13: 0000000000000000 R14: fffffffffffffe00 R15: ffff888100b6b500
[ 287.692052] FS: 0000000000000000(0000) GS:ffff888277f80000(0000) knlGS:0000000000000000
[ 287.700149] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 287.705897] CR2: 0000000000000070 CR3: 000000000320a000 CR4: 0000000000350ee0
[ 287.713033] Call Trace:
[ 287.715498] raid1d+0x6c/0xbbb [raid1]
[ 287.719256] ? __schedule+0x1ff/0x760
[ 287.722930] ? schedule+0x3b/0xb0
[ 287.726260] ? schedule_timeout+0x1ed/0x290
[ 287.730456] ? __switch_to+0x11f/0x400
[ 287.734219] md_thread+0xe9/0x140 [md_mod]
[ 287.738328] ? md_thread+0xe9/0x140 [md_mod]
[ 287.742601] ? wait_woken+0x80/0x80
[ 287.746097] ? md_register_thread+0xe0/0xe0 [md_mod]
[ 287.751064] kthread+0x11a/0x140
[ 287.754300] ? kthread_park+0x90/0x90
[ 287.757974] ret_from_fork+0x1f/0x30

In fact, when raid1 array run fail, we need to do
md_unregister_thread() before raid1_free().

Signed-off-by: Jiang Li <jiang.li@xxxxxxxxxx>
Signed-off-by: Song Liu <song@xxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
drivers/md/raid1.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index fb31e5dd54a6..6b5cc3f59fb3 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3115,6 +3115,7 @@ static int raid1_run(struct mddev *mddev)
* RAID1 needs at least one disk in active
*/
if (conf->raid_disks - mddev->degraded < 1) {
+ md_unregister_thread(&conf->thread);
ret = -EINVAL;
goto abort;
}
--
2.35.1

Next message: Sasha Levin: "[PATCH AUTOSEL 5.10 22/39] drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()"
Previous message: Sasha Levin: "[PATCH AUTOSEL 5.10 20/39] drm/rockchip: Use drm_mode_copy()"
In reply to: Sasha Levin: "[PATCH AUTOSEL 5.10 20/39] drm/rockchip: Use drm_mode_copy()"
Next in thread: Sasha Levin: "[PATCH AUTOSEL 5.10 22/39] drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]