RE: 3.10.1: echo repair > sync_action causes hang on RAID-1 (2 x SSD)

From: Justin Piszcz
Date: Fri Jul 26 2013 - 05:56:59 EST




-----Original Message-----
From: NeilBrown [mailto:neilb@xxxxxxx]
Sent: Thursday, July 25, 2013 8:36 PM
To: Justin Piszcz
Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-raid@xxxxxxxxxxxxxxx
Subject: Re: 3.10.1: echo repair > sync_action causes hang on RAID-1 (2 x
SSD)

On Thu, 25 Jul 2013 19:10:50 -0400 "Justin Piszcz" <jpiszcz@xxxxxxxxxxxxxxx>
wrote:

> Did the fix by chance make it into 3.10.3?

No, it looks like it missed again. I gather there was a large inflow of
patches for -stable in the 3.11-rc1 merge window and Greg has been
processing
them in batches. Hopefully in 3.10.4.

The relevant patch is commit 30bc9b53878a9921b02e3 in mainline.

NeilBrown

--

Method to get patch via git and patch kernel:

$ git clone
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
$ git log |grep 30bc9b53878a9921b02e3
commit 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
$ git show 30bc9b53878a9921b02e3b5bc4283ac1c6de102a > /tmp/a
# patch -p1 < /tmp/a
patching file drivers/md/raid1.c
Hunk #1 succeeded at 1848 (offset -1 lines).
Hunk #2 succeeded at 1886 (offset -1 lines).
Hunk #3 succeeded at 1915 (offset -1 lines).

Reboot- tested, success, thanks..!

One follow-up question:
$ cat /sys/block/md1/md/mismatch_cnt
314112
-> On a live RAID-1 (root filesystem) without swap, is it normal to have
such a high mismatch_cnt even after a repair?

First repair:
Fri Jul 26 05:30:47 EDT 2013: The meta-device /dev/md1 has mismatch_cnt
314112 sectors.
Second repair:
Fri Jul 26 05:30:47 EDT 2013: The meta-device /dev/md1 has mismatch_cnt
313600 sectors.

Should I be concerned?


Testing the patch:

Personalities : [raid1]
md1 : active raid1 sdc2[0] sdb2[1]
233381376 blocks [2/2] [UU]
[>....................] check = 0.3% (838976/233381376)
finish=9.2min speed=419488K/sec

md0 : active raid1 sdc1[0] sdb1[1]
1048512 blocks [2/2] [UU]

Personalities : [raid1]
md1 : active raid1 sdc2[0] sdb2[1]
233381376 blocks [2/2] [UU]
[===============>.....] check = 77.5% (180889856/233381376)
finish=2.5min speed=342654K/sec

md0 : active raid1 sdc1[0] sdb1[1]
1048512 blocks [2/2] [UU]

Personalities : [raid1]
md1 : active raid1 sdc2[0] sdb2[1]
233381376 blocks [2/2] [UU]

md0 : active raid1 sdc1[0] sdb1[1]
1048512 blocks [2/2] [UU]


Justin.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/