Re: Wrong DIF guard tag on ext2 write

From: Chris Mason
Date: Tue Jun 01 2010 - 09:05:51 EST


On Tue, Jun 01, 2010 at 12:30:42PM +0200, Christof Schmitt wrote:
> On Mon, May 31, 2010 at 06:30:05PM +0300, Boaz Harrosh wrote:
> > On 05/31/2010 06:01 PM, James Bottomley wrote:
> > > On Mon, 2010-05-31 at 10:20 -0400, Martin K. Petersen wrote:
> > >>>>>>> "Christof" == Christof Schmitt <christof.schmitt@xxxxxxxxxx> writes:
> > >>
> > >> Christof> Since the guard tags are created in Linux, it seems that the
> > >> Christof> data attached to the write request changes between the
> > >> Christof> generation in bio_integrity_generate and the call to
> > >> Christof> sd_prep_fn.
> > >>
> > >> Yep, known bug. Page writeback locking is messed up for buffer_head
> > >> users. The extNfs folks volunteered to look into this a while back but
> > >> I don't think they have found the time yet.
> > >>
> > >>
> > >> Christof> Using ext3 or ext4 instead of ext2 does not show the problem.
> > >>
> > >> Last I looked there were still code paths in ext3 and ext4 that
> > >> permitted pages to be changed during flight. I guess you've just been
> > >> lucky.
> > >
> > > Pages have always been modifiable in flight. The OS guarantees they'll
> > > be rewritten, so a driver can drop them if it detects the problem.
> > > This is identical to the iscsi checksum issue (iscsi adds a checksum
> > > because it doesn't trust TCP/IP and if the checksum is generated in
> > > software, there's time between generation and page transmission for the
> > > alteration to occur). The solution in the iscsi case was not to
> > > complain if the page is still marked dirty.
> > >
> >
> > And also why RAID1 and RAID4/5/6 need the data bounced. I wish VFS
> > would prevent data writing given a device queue flag that requests
> > it. So all these devices and modes could just flag the VFS/filesystems
> > that: "please don't allow concurrent writes, otherwise I need to copy data"
> >
> > From what Chris Mason has said before, all the mechanics are there, and
> > it's what btrfs is doing, though I don't know how myself.
>
> I also tested with btrfs, and writes with invalid guard tags were
> encountered there as well (again in 2.6.34). The only difference is
> that no error was reported to userspace, although this might be a
> configuration issue.

This would be a btrfs bug. We have strict checks in place that are
supposed to prevent buffers changing while in flight. What was the
workload that triggered this problem?

>
> What is the best strategy to continue with the invalid guard tags on
> write requests? Should this be fixed in the filesystems?
>

Long term, I think the filesystems shouldn't be changing pages in
flight. Bouncing just hurts way too much.

-chris
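
For illustration, here is a minimal userspace sketch of the race discussed
in this thread. It is not kernel code: the names fake_write, fake_generate,
fake_generate_bounced and fake_verify are hypothetical and only model the
idea that a guard tag is computed at one point (bio_integrity_generate in
the thread) and checked later (around sd_prep_fn). The guard is assumed to
be the T10 DIF CRC (CRC-16, polynomial 0x8BB7, initial value 0) computed
over a 512-byte sector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* CRC-16/T10-DIF: polynomial 0x8BB7, initial value 0, no bit reflection. */
uint16_t crc_t10dif(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x8BB7);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical stand-in for a write request carrying one guard tag. */
struct fake_write {
    const uint8_t *data;  /* sector the device will actually read */
    uint16_t guard;       /* CRC computed when the request was built */
};

/* No bounce: the request points straight at the caller's page, so any
 * later change to that page invalidates the guard tag. */
void fake_generate(struct fake_write *w, const uint8_t *page)
{
    assert(w != NULL && page != NULL);
    w->data = page;
    w->guard = crc_t10dif(page, SECTOR_SIZE);
}

/* Bounce: copy the sector into a private buffer first, then compute the
 * guard over the copy. Later changes to the page no longer matter, at
 * the cost of one extra copy per sector. */
void fake_generate_bounced(struct fake_write *w, const uint8_t *page,
                           uint8_t *bounce)
{
    assert(w != NULL && page != NULL && bounce != NULL);
    memcpy(bounce, page, SECTOR_SIZE);
    w->data = bounce;
    w->guard = crc_t10dif(bounce, SECTOR_SIZE);
}

/* Models the later check: returns 1 if data and guard still agree. */
int fake_verify(const struct fake_write *w)
{
    return crc_t10dif(w->data, SECTOR_SIZE) == w->guard;
}
```

Calling fake_generate(), flipping a bit in the page, and then calling
fake_verify() reports a mismatch, which corresponds to the invalid guard
tag seen on the write. With fake_generate_bounced() the same modification
leaves the request valid, but every sector is copied once. That copy is
the bouncing cost mentioned above, and it is what keeping pages stable
while they are in flight would avoid.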