Re: 2.4.22pre6aa1

From: Andrea Arcangeli
Date: Sat Aug 16 2003 - 06:59:34 EST


On Sun, Aug 03, 2003 at 09:12:00PM +0400, Sergey S. Kostyliov wrote:
> Hello Andrea,
>
> On Friday 25 July 2003 23:02, Andrea Arcangeli wrote:
> > Hi Sergey,
> >
> > On Fri, Jul 25, 2003 at 03:10:59PM +0400, Sergey S. Kostyliov wrote:
> > > I doubt it depends on bigpages because they
> > > are not used in my setup. But I can live with that. Rule: do not run
> > > `swapoff -a` under load doesn't sound as impossible in my case (if this
> > > is the only way to trigger this problem).
> >
> > can you reproduce it with 2.4.21rc8aa1? If not, then likely it's a
> > 2.5/2.6 bug that went in 2.4 during the backport. I spoke with Hugh an
> > hour ago about this, he will soon look into this too.
>
> Sorry for late responce. I wasn't able to reproduce neither oops nor
> lockup with 2.4.21rc8aa1.

ok good. I'm betting it's the shm backport that destabilized something.
I had no time to look further into it during vacations ;), but the first
suspect thing I mentioned to Hugh during OLS was this:

static void shmem_removepage(struct page *page)
{
if (!PageLaunder(page))
shmem_free_blocks(page->mapping->host, 1);
}

It's not exactly obvious how the accounting should change in function of
the Launder bit. I mean, a writepage can happen even w/o the launder
bitflag set (if it's not invoked by the vm) and I don't see how a msync
or a vm pressure writepage trigger should be different in terms of
accounting of the blocks in an inode.

Overall I need a bit more of time on Monday to digest the whole backport
to be sure of what's going on and if the above is right after all.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/