Re: [PATCH] md/raid1,raid10: fix IO handle for REQ_NOWAIT

From: Paul Menzel
Date: Sun Jun 15 2025 - 05:21:41 EST

Next message: Daniel Golle: "Re: [PATCH v2] net: ethernet: mtk_eth_soc: support named IRQs"
Previous message: syzbot: "[syzbot] [kernfs?] possible deadlock in do_page_mkwrite (3)"
In reply to: Zheng Qixing: "Re: [PATCH] md/raid1,raid10: fix IO handle for REQ_NOWAIT"
Next in thread: Yu Kuai: "Re: [PATCH] md/raid1,raid10: fix IO handle for REQ_NOWAIT"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Dear Zheng,

Am 14.06.25 um 08:50 schrieb Zheng Qixing:

Please disregard the previous reply email, as it contained garbled text.

Thank you for noticing this, and resending.

在 2025/6/13 16:02, Paul Menzel 写道:

Am 12.06.25 um 15:21 schrieb Zheng Qixing:

From: Zheng Qixing <zhengqixing@xxxxxxxxxx>

IO with REQ_NOWAIT should not set R1BIO_Uptodate when it fails,
and bad blocks should also be cleared when REQ_NOWAIT IO succeeds.

It’d be great if you could add an explanation for the *should*. Why should it not be done?

Do you have a reproducer for this?

If we set R1BIO_Uptodate when IO with REQ_NOWAIT fails, the request will
return a success.

Understood. So no command to check for this automatically on a test system.

For the explanation, I guess my problem is, that I was not familiar with REQ_NOWAIT, which means that it fails for blocked IO. (If I am correct.)

But actually it should return BLK_STS_IOERR or BLK_STS_AGAIN, right?

Sorry, I do not know. Hopefully the maintainers can answer this.

Fixes: 9f346f7d4ea7 ("md/raid1,raid10: don't handle IO error for REQ_RAHEAD and REQ_NOWAIT")
Signed-off-by: Zheng Qixing <zhengqixing@xxxxxxxxxx>
---
drivers/md/raid1.c | 11 ++++++-----
drivers/md/raid10.c | 9 +++++----
2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 19c5a0ce5a40..a1cddd24b178 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -455,13 +455,13 @@ static void raid1_end_write_request(struct bio *bio)
      struct md_rdev *rdev = conf->mirrors[mirror].rdev;
      sector_t lo = r1_bio->sector;
      sector_t hi = r1_bio->sector + r1_bio->sectors;
-    bool ignore_error = !raid1_should_handle_error(bio) ||
-        (bio->bi_status && bio_op(bio) == REQ_OP_DISCARD);
+    bool discard_error = bio->bi_status && bio_op(bio) == REQ_OP_DISCARD;

Excuse my ignorance. What is the difference between ignore and discard?

REQ_OP_DISCARD is a operation type while REQ_NOWAIT is just a request flag.

These two can be combined together. IO with REQ_NOWAIT can fail early, even
though the storage medium is fine. So, we better handle this type of
error specially.

I hope this clarifies your doubts.

Sorry about being not clear enough. My question was more about changing the naming of the variable.

      /*
       * 'one mirror IO has finished' event handler:
       */
-    if (bio->bi_status && !ignore_error) {
+    if (bio->bi_status && !discard_error &&
+        raid1_should_handle_error(bio)) {
          set_bit(WriteErrorSeen,    &rdev->flags);
          if (!test_and_set_bit(WantReplacement, &rdev->flags))
              set_bit(MD_RECOVERY_NEEDED, &
@@ -507,12 +507,13 @@ static void raid1_end_write_request(struct bio *bio)
           * check this here.
           */
          if (test_bit(In_sync, &rdev->flags) &&
-            !test_bit(Faulty, &rdev->flags))
+            !test_bit(Faulty, &rdev->flags) &&
+            (!bio->bi_status || discard_error))
              set_bit(R1BIO_Uptodate, &r1_bio->state);
          /* Maybe we can clear some bad blocks. */
          if (rdev_has_badblock(rdev, r1_bio->sector, r1_bio->sectors) &&
-            !ignore_error) {
+            !bio->bi_status) {
              r1_bio->bios[mirror] = IO_MADE_GOOD;
              set_bit(R1BIO_MadeGood, &r1_bio->state);
          }
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b74780af4c22..1848947b0a6d 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -458,8 +458,8 @@ static void raid10_end_write_request(struct bio *bio)
      int slot, repl;
      struct md_rdev *rdev = NULL;
      struct bio *to_put = NULL;
-    bool ignore_error = !raid1_should_handle_error(bio) ||
-        (bio->bi_status && bio_op(bio) == REQ_OP_DISCARD);
+    bool discard_error = bio->bi_status && bio_op(bio) == REQ_OP_DISCARD;
+    bool ignore_error = !raid1_should_handle_error(bio) || discard_error;
      dev = find_bio_disk(conf, r10_bio, bio, &slot, &repl);
@@ -522,13 +522,14 @@ static void raid10_end_write_request(struct bio *bio)
           * check this here.
           */
          if (test_bit(In_sync, &rdev->flags) &&
-            !test_bit(Faulty, &rdev->flags))
+            !test_bit(Faulty, &rdev->flags) &&
+            (!bio->bi_status || discard_error))
              set_bit(R10BIO_Uptodate, &r10_bio->state);
          /* Maybe we can clear some bad blocks. */
          if (rdev_has_badblock(rdev, r10_bio->devs[slot].addr,
                        r10_bio->sectors) &&
-            !ignore_error) {
+            !bio->bi_status) {
              bio_put(bio);
              if (repl)
                  r10_bio->devs[slot].repl_bio = IO_MADE_GOOD;

Kind regards,

Paul

PS: As it’s two hunks only connected through REQ_NOWAIT, maybe make it two commits: one for raid1 and one for raid10? Feel free to ignore.

Next message: Daniel Golle: "Re: [PATCH v2] net: ethernet: mtk_eth_soc: support named IRQs"
Previous message: syzbot: "[syzbot] [kernfs?] possible deadlock in do_page_mkwrite (3)"
In reply to: Zheng Qixing: "Re: [PATCH] md/raid1,raid10: fix IO handle for REQ_NOWAIT"
Next in thread: Yu Kuai: "Re: [PATCH] md/raid1,raid10: fix IO handle for REQ_NOWAIT"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]