Re: [PATCH] bcache: Fix writeback_thread never writing back incomplete stripes.

From: Kent Overstreet
Date: Thu Sep 17 2015 - 12:41:06 EST


On Thu, Sep 17, 2015 at 11:30:17AM -0400, Denis Bychkov wrote:
> Well, it turns out my celebration was a bit premature.
>
> PLEASE, DO NOT APPLY THE PATCH POSTED BY KENT (not the one Vojtech
> posted) ON A PRODUCTION SYSTEM, IT CAUSES DATA CORRUPTION.
>
> The interesting thing is that it somehow damaged the partition that
> was not supposed to receive any writes (the file system was mounted
> read-only), so my guess is that the patch causes the blocks residing
> in the write-back cache to flush to the wrong blocks on the backing
> device.
> Everything was going great until I rebooted and saw this in the log:
>
> [ 19.639082] attempt to access beyond end of device
> [ 19.643984] md1p2: rw=1, want=75497520, limit=62914560
> [ 19.659033] attempt to access beyond end of device
> [ 19.663929] md1p2: rw=1, want=75497624, limit=62914560
> [ 19.669447] attempt to access beyond end of device
> [ 19.674338] md1p2: rw=1, want=75497752, limit=62914560
> [ 19.679195] attempt to access beyond end of device
> [ 19.679199] md1p2: rw=1, want=75498080, limit=62914560
> [ 19.689007] attempt to access beyond end of device
> [ 19.689011] md1p2: rw=1, want=75563376, limit=62914560
> [ 19.699055] attempt to access beyond end of device
> [ 19.699059] md1p2: rw=1, want=79691816, limit=62914560
> [ 19.719246] attempt to access beyond end of device
> [ 19.724144] md1p2: rw=1, want=79691928, limit=62914560
> ......
> (it's a small example, the list was much longer)
> And the next thing I found out was that the superblock on my 10 TB XFS RAID was gone. :)
> Oh well, it's a good thing I have backups.
> I knew what I was doing when trying the untested patches. I should
> have made the RAID md partition read-only, not the file system. I kind
> of expected that something could have gone wrong with the file system
> I was testing, just did not expect it would fire nukes at the innocent
> bystanders.

Aw, shit. That's just _bizarre_.

I have a theory - it appears that last_scanned isn't getting initialized before
it's used, so it's going to be all 0s the very first time... which it appears
could cause it to slurp up keys from the wrong device (and if that device was
bigger than the correct device, that could explain the accesses beyond the end
of the device).

Currently just a theory though, and I have no clue why it would only be exposed
with my patch.
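
To make the theory concrete, here is a rough userspace sketch of how a zeroed
scan start could pull in another device's keys. It is not the bcache code:
struct key, key_cmp() and scan_range() are hypothetical stand-ins, and it
assumes keys sort by device id first, then by offset, with the scan bounded
only by an end key on the wanted device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified key: (device id, sector offset).
 * This is NOT the real struct bkey, just enough to show the ordering. */
struct key {
	uint64_t dev;
	uint64_t offset;
};

/* Order keys by device first, then by offset within the device. */
static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->dev != b->dev)
		return a->dev < b->dev ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Walk a sorted key array and count every key in [start, end].
 * *foreign is set to how many of those keys belong to a device
 * other than want_dev.
 */
static size_t scan_range(const struct key *keys, size_t n,
			 struct key start, struct key end,
			 uint64_t want_dev, size_t *foreign)
{
	size_t picked = 0;

	assert(key_cmp(&start, &end) <= 0);
	*foreign = 0;

	for (size_t i = 0; i < n; i++) {
		if (key_cmp(&keys[i], &start) < 0)
			continue;	/* before the scan start */
		if (key_cmp(&keys[i], &end) > 0)
			break;		/* past the scan end */
		picked++;
		if (keys[i].dev != want_dev)
			(*foreign)++;
	}
	return picked;
}
```

Take made-up keys {0,100}, {0,900}, {1,50}, {1,200} and a scan for device 1
ending at {1, UINT64_MAX}. Starting from {1, 0} picks up 2 keys, none of them
foreign. Starting from an all-zero key picks up all 4, 2 of them belonging to
device 0. If device 0 is the bigger device, an offset like 900 could lie past
the end of device 1.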