Re: XFS assertion from truncate. (3.10-rc2)

From: Dave Jones
Date: Thu May 23 2013 - 14:14:03 EST


On Thu, May 23, 2013 at 11:17:21AM -0400, Dave Jones wrote:
> On Thu, May 23, 2013 at 08:09:33AM +1000, Dave Chinner wrote:
> > On Wed, May 22, 2013 at 12:19:46PM -0400, Dave Jones wrote:
> > > On Wed, May 22, 2013 at 10:22:52AM -0400, Dave Jones wrote:
> > > > On Wed, May 22, 2013 at 03:51:47PM +1000, Dave Chinner wrote:
> > > >
> > > > > > Tomorrow I'll also try running some older kernels with the same
> > > > > > options to see if it's something new, or an older bug. This is a
> > > > > > new machine, so it may be something that's been around for a
> > > > > > while, and for whatever reason, my other machines don't hit
> > > > > > this.
> > > > >
> > > > > Another thing that just occurred to me - what compiler are you
> > > > > using? We had a report last week on #xfs that xfsdump was failing
> > > > > with bad checksums because of link time optimisation (LTO) in
> > > > > gcc-4.8.0. When they turned that off, everything worked fine. So if
> > > > > you are using 4.8.0, perhaps trying a different compiler might be a
> > > > > good idea, too.
> > > >
> > > > Yeah, this is 4.8.0. This box is running F19-beta.
> > > > I managed to shoehorn the gcc-4.7 from f18 on there though.
> > > > Bug reproduced instantly, so I think we can rule out compiler.
> > > >
> > > > I ran 3.9 with the same debug options. Seems stable.
> > > > I'll do a bisect.
> > >
> > > Good news. It wasn't until I started bisecting that I realised I was still
> > > carrying this patch from you to fix slab corruption I was seeing.
> > >
> > > It seems to be the culprit (or is masking another problem -- I had to apply
> > > it at each step of the bisect to get past the slab corruption bug).
> >
> > That doesn't make a whole lot of sense to me. The fix in the xfsdev
> > tree is a little different:
> >
> > http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs.git;a=commitdiff;h=52c24ad39ff02d7bd73c92eb0c926fb44984a41d
>
> I did an rc2 build with just that commit on top, and can't reproduce this at all now.
> (At least not with the reproducer that worked previously).

Ok, scratch all that. I can reproduce it again; it just takes longer.
(And typically, I didn't have your debug patch on top for that run... nnnngh)

Dave

