Re: 2.6.23.1: mdadm/raid5 hung/d-state

From: Justin Piszcz
Date: Fri Nov 09 2007 - 04:14:49 EST




On Thu, 8 Nov 2007, Carlos Carvalho wrote:

Jeff Lessem (Jeff@xxxxxxxxxx) wrote on 6 November 2007 22:00:
>Dan Williams wrote:
> > The following patch, also attached, cleans up cases where the code looks
> > at sh->ops.pending when it should be looking at the consistent
> > stack-based snapshot of the operations flags.
>
>I tried this patch (against a stock 2.6.23), and it did not work for
>me. Not only did I/O to the effected RAID5 & XFS partition stop, but
>also I/O to all other disks. I was not able to capture any debugging
>information, but I should be able to do that tomorrow when I can hook
>a serial console to the machine.
>
>I'm not sure if my problem is identical to these others, as mine only
>seems to manifest with RAID5+XFS. The RAID rebuilds with no problem,
>and I've not had any problems with RAID5+ext3.

Us too! We're stuck trying to build a disk server with several disks
in a raid5 array, and the rsync from the old machine stops writing to
the new filesystem. It only happens under heavy IO. We can make it
lock without rsync, using 8 simultaneous dd's to the array. All IO
stops, including the resync after a newly created raid or after an
unclean reboot.

We could not trigger the problem with ext3 or reiser3; it only happens
with xfs.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html


Including XFS mailing list as well can you provide more information to them?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/