Re: [PATCH v2] md/raid1: Fix stack memory use after return in raid1_reshape

From: Wang Jinchao
Date: Thu Jun 12 2025 - 03:46:18 EST


On 2025/6/12 15:31, Yu Kuai wrote:
Hi,

On 2025/06/11 16:55, Wang Jinchao wrote:
In raid1_reshape(), newpool is allocated on the stack and then copied
into conf->r1bio_pool. As a result, conf->r1bio_pool.wait.head points
to an address inside raid1_reshape()'s stack frame. Accessing that
address after raid1_reshape() returns can lead to a kernel panic.

Example access path:

raid1_reshape()
{
    // newpool is on the stack
    mempool_t newpool, oldpool;
    // initialize newpool.wait.head to stack address
    mempool_init(&newpool, ...);
    conf->r1bio_pool = newpool;
}
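
The copy goes wrong because mempool_init() calls init_waitqueue_head(),
which makes the wait queue's list head point at itself. A minimal
illustration of the effect (not part of the patch, simplified):

mempool_init(&newpool, ...);
// now newpool.wait.head.next == &newpool.wait.head,
// i.e. a pointer into raid1_reshape()'s stack frame
conf->r1bio_pool = newpool;
// the struct copy keeps those pointers unchanged, so
// conf->r1bio_pool.wait.head.next still points into a stack
// frame that dies as soon as raid1_reshape() returns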

raid1_read_request() or raid1_write_request()
{
    alloc_r1bio()
    {
        mempool_alloc()
        {
            // if pool->alloc fails
            remove_element()
            {
                --pool->curr_nr;
            }
        }
    }
}

mempool_free()
{
    if (pool->curr_nr < pool->min_nr) {
        // pool->wait.head is a stack address
        // wake_up() will try to access this invalid address,
        // which leads to a kernel panic
        wake_up(&pool->wait);
        return;
    }
}
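
For reference, the access itself happens when wake_up() walks the list
anchored at pool->wait.head; roughly (paraphrased from
kernel/sched/wait.c, locking and other details elided):

__wake_up_common()
{
    ...
    // walks the list anchored at wq_head->head; here ->head.next
    // points into raid1_reshape()'s dead stack frame
    list_for_each_entry_safe(curr, next, &wq_head->head, entry) {
        ...
    }
}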

Fix:
Avoid the stack-based newpool entirely and initialize conf->r1bio_pool
in place with mempool_init() instead.

Signed-off-by: Wang Jinchao <wangjinchao600@xxxxxxxxx>
---
v1 -> v2:
- change subject
- use mempool_init(&conf->r1bio_pool) instead of reinitializing the list on the stack
---
  drivers/md/raid1.c | 34 +++++++++++++++++++---------------
  1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 19c5a0ce5a40..f2436262092a 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3366,7 +3366,7 @@ static int raid1_reshape(struct mddev *mddev)
       * At the same time, we "pack" the devices so that all the missing
       * devices have the higher raid_disk numbers.
       */
-    mempool_t newpool, oldpool;
+    mempool_t oldpool;
      struct pool_info *newpoolinfo;
      struct raid1_info *newmirrors;
      struct r1conf *conf = mddev->private;
@@ -3375,9 +3375,6 @@ static int raid1_reshape(struct mddev *mddev)
      int d, d2;
      int ret;
-    memset(&newpool, 0, sizeof(newpool));
-    memset(&oldpool, 0, sizeof(oldpool));
-
      /* Cannot change chunk_size, layout, or level */
      if (mddev->chunk_sectors != mddev->new_chunk_sectors ||
          mddev->layout != mddev->new_layout ||
@@ -3408,26 +3405,33 @@ static int raid1_reshape(struct mddev *mddev)
      newpoolinfo->mddev = mddev;
      newpoolinfo->raid_disks = raid_disks * 2;
-    ret = mempool_init(&newpool, NR_RAID_BIOS, r1bio_pool_alloc,
-               rbio_pool_free, newpoolinfo);
-    if (ret) {
-        kfree(newpoolinfo);
-        return ret;
-    }
      newmirrors = kzalloc(array3_size(sizeof(struct raid1_info),
-                     raid_disks, 2),
-                 GFP_KERNEL);
+    raid_disks, 2),
+    GFP_KERNEL);
      if (!newmirrors) {
          kfree(newpoolinfo);
-        mempool_exit(&newpool);
          return -ENOMEM;
      }
+    /* stop everything before switching the pool */
      freeze_array(conf, 0);
-    /* ok, everything is stopped */
+    /* backup old pool in case restore is needed */
      oldpool = conf->r1bio_pool;
-    conf->r1bio_pool = newpool;
+
+    ret = mempool_init(&conf->r1bio_pool, NR_RAID_BIOS, r1bio_pool_alloc,
+               rbio_pool_free, newpoolinfo);
+    if (ret) {
+        kfree(newpoolinfo);
+        kfree(newmirrors);
+        mempool_exit(&conf->r1bio_pool);
+        /* restore the old pool */
+        conf->r1bio_pool = oldpool;
+        unfreeze_array(conf);
+        pr_err("md/raid1:%s: cannot allocate r1bio_pool for reshape\n",
+            mdname(mddev));
+        return ret;
+    }
      for (d = d2 = 0; d < conf->raid_disks; d++) {
          struct md_rdev *rdev = conf->mirrors[d].rdev;


Any specific reason not to use mempool_resize() and krealloc() here, in
the case where the new raid_disks is greater than the old one?

The element size differs between the old pool and the new pool:
mempool_resize() only changes the pool size (i.e., the number of
elements in pool->elements); it does not handle a change in element
size, which is exactly what happens in raid1_reshape().
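
To make that concrete: the element size is derived from pool_info, so
it changes whenever raid_disks changes. r1bio_pool_alloc() looks
roughly like this (from drivers/md/raid1.c, quoted from memory):

static void *r1bio_pool_alloc(gfp_t gfp_flags, void *data)
{
    struct pool_info *pi = data;
    int size = offsetof(struct r1bio, bios[pi->raid_disks]);

    /* one r1bio with room for raid_disks bio pointers */
    return kzalloc(size, gfp_flags);
}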

Another reason may be to avoid modifying the old pool directly: if
initializing the new pool fails, the old one remains usable.

If we modify the old pool directly and the operation fails, not only will the reshaped RAID be unusable, but the original RAID may also be corrupted.

Thanks,
Kuai