Re: [RFC][PATCH] md: avoid fullsync if a faulty member missed a dirty transition

From: Mike Snitzer
Date: Tue May 27 2008 - 10:33:47 EST


On Tue, May 27, 2008 at 2:56 AM, Neil Brown <neilb@xxxxxxx> wrote:
> On Tuesday May 20, snitzer@xxxxxxxxx wrote:
>>
>> Hi Neil,
>>
>> We're much closer. The events_cleared is symmetric on both the failed
>> and active member of the raid1. But there have been some instances
>> where the md thread hits a deadlock during my testing. What follows
>> is the backtrace and live crash info:
> ...
>>
>> So running with your latest patches seems to introduce a race in
>> bitmap_daemon_work's if (unlikely((*bmc & COUNTER_MAX) ==
>> COUNTER_MAX)) { } block.
>
> As you note, that block is in the wrong place.
> It is actually locking up in
> wait_event(bitmap->mddev->sb_wait,
> !test_bit(MD_CHANGE_CLEAN,
> &bitmap->mddev->flags));
>
> which the patch adds. However with my last update that wait_event
> isn't needed any more. I was using it to ensure mddev->events matched
> what was on disk. But we now read mddev->events much earlier and it
> will definitely be on disk by this time.
>
> So: this combined patch should do it.
>
> Thanks for all your testing.
>
> NeilBrown
>
>
> ---------------------------
> Improve setting of "events_cleared" for write-intent bitmaps.
>
> When an array is degraded, bits in the write-intent bitmap are not
> cleared, so that if the missing device is re-added, it can be synced
> by updating only those parts of the device that have changed since
> it was removed.
>
> To enable this, an 'events_cleared' value is stored. It is the event
> counter for the array the last time that any bits were cleared.
>
> Sometimes - if a device disappears from an array while it is 'clean' -
> the events_cleared value gets updated incorrectly (there are subtle
> ordering issues between updating events in the main metadata and the
> bitmap metadata) resulting in the missing device appearing to require
> a full resync when it is re-added.
>
> With this patch, we update events_cleared precisely when we are about
> to clear a bit in the bitmap. We record events_cleared when we clear
> the bit internally, and copy that to the superblock which is written
> out before the bit on storage. This makes it more "obviously correct".
>
> We also need to update events_cleared when the event_count is going
> backwards (as happens on a dirty->clean transition of a non-degraded
> array).
>
> Thanks to Mike Snitzer for identifying this problem and testing early
> "fixes".
>
>
> Cc: "Mike Snitzer" <snitzer@xxxxxxxxx>
> Signed-off-by: Neil Brown <neilb@xxxxxxx>

Neil,

Works great now. Thanks.

Tested-by: Mike Snitzer <snitzer@xxxxxxxxx>
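
For readers following the thread, the rule described in the quoted commit
message can be sketched as a small toy model. This is not the kernel code and
not part of the patch: every name below (toy_bitmap, toy_clear_bit,
toy_events_rolled_back, toy_needs_full_sync) is hypothetical, and the
comparison used to decide on a full resync is an assumption about how
events_cleared is consumed when a member is re-added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; hypothetical names, not the md/bitmap structures. */
struct toy_bitmap {
    uint64_t events;         /* current event counter of the array */
    uint64_t events_cleared; /* event counter when a bit was last cleared */
    bool     degraded;       /* true while a member is missing */
    unsigned dirty_bits;     /* number of bits still set in the bitmap */
};

/*
 * Clear one bit. While degraded, bits are kept so that a re-added
 * member can be resynced from the bitmap. Otherwise events_cleared is
 * recorded before the bit is cleared, mirroring the ordering in the
 * commit message (superblock value first, then the bit).
 */
static void toy_clear_bit(struct toy_bitmap *b)
{
    assert(b != NULL);
    if (b->degraded || b->dirty_bits == 0)
        return;
    b->events_cleared = b->events;
    b->dirty_bits--;
}

/*
 * The event counter can go backwards (dirty->clean transition of a
 * non-degraded array); events_cleared must follow it down.
 */
static void toy_events_rolled_back(struct toy_bitmap *b)
{
    assert(b != NULL);
    if (b->events < b->events_cleared)
        b->events_cleared = b->events;
}

/*
 * Assumed check: a member whose own event count is older than
 * events_cleared may have missed writes whose bits are already gone,
 * so the bitmap cannot be trusted for it.
 */
static bool toy_needs_full_sync(const struct toy_bitmap *b,
                                uint64_t member_events)
{
    return member_events < b->events_cleared;
}
```

In this model a member that left the array at the same event count at which
the last bit was cleared can still be resynced from the bitmap, which is the
case the patch is meant to get right.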