Re: [PATCH 5/5] proc: export more page flags in /proc/kpageflags

From: Matt Mackall
Date: Tue Apr 28 2009 - 19:32:53 EST


On Tue, 2009-04-28 at 16:02 -0700, Andrew Morton wrote:
> On Tue, 28 Apr 2009 17:46:34 -0500
> Matt Mackall <mpm@xxxxxxxxxxx> wrote:
>
> > > > +/* a helper function _not_ intended for more general uses */
> > > > +static inline int page_cap_writeback_dirty(struct page *page)
> > > > +{
> > > > +	struct address_space *mapping;
> > > > +
> > > > +	if (!PageSlab(page))
> > > > +		mapping = page_mapping(page);
> > > > +	else
> > > > +		mapping = NULL;
> > > > +
> > > > +	return mapping && mapping_cap_writeback_dirty(mapping);
> > > > +}
> > >
> > > If the page isn't locked then page->mapping can be concurrently removed
> > > and freed. This actually happened to me in real-life testing several
> > > years ago.
> >
> > We certainly don't want to be taking locks per page to build the flags
> > data here. As we don't have any pretense of being atomic, it's ok if we
> > can find a way to do the test that's inaccurate when a race occurs, so
> > long as it doesn't dereference null.
> >
> > But if there's not an obvious way to do that, we should probably just
> > drop this flag bit for this iteration.
>
> trylock_page() could be used here, perhaps.
>
> Then again, why _not_ just do lock_page()? After all, few pages are
> ever locked. There will be latency if the caller stumbles across a
> page which is under read I/O, but so be it?

As I mentioned just a bit ago, it's really not an unreasonable use case
to want to do this on every page in the system, back to back. So per-page
overhead matters. And the odds of stalling on a locked page when
visiting 1M pages while under load are probably not negligible.

Our lock primitives are pretty low overhead in the fast path, but every
cycle counts. The new tests and branches this code already adds are a
bit worrisome, but on balance probably worth it.
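
For illustration only, here is roughly what the trylock_page() variant
suggested above might look like, written as a userspace model rather than
kernel code. The struct layouts and every model_* name are hypothetical
stand-ins (assumptions, not the real kernel definitions), and the lock is a
plain C11 atomic flag. The point it shows: a trylock never stalls on a page
under I/O, at the price of reporting 0 for any page that happens to be
locked, and it still costs an atomic operation per page visited.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types. */
struct address_space {
	bool cap_writeback_dirty;	/* models mapping_cap_writeback_dirty() */
};

struct page {
	atomic_bool locked;		/* models the PG_locked bit */
	bool slab;			/* models PageSlab() */
	struct address_space *mapping;	/* may be cleared while unlocked */
};

/* Try to take the page lock; returns true only if we acquired it. */
static bool model_trylock_page(struct page *page)
{
	return !atomic_exchange_explicit(&page->locked, true,
					 memory_order_acquire);
}

/* Release a lock taken by model_trylock_page(). */
static void model_unlock_page(struct page *page)
{
	assert(atomic_load(&page->locked));
	atomic_store_explicit(&page->locked, false, memory_order_release);
}

/*
 * Returns 1 if the page's mapping is dirty-writeback capable.
 * Returns 0 for slab pages, pages with no mapping, and pages whose
 * lock is held by someone else (inaccurate, but never blocks).
 */
static int model_page_cap_writeback_dirty(struct page *page)
{
	int ret = 0;

	if (page->slab)
		return 0;
	if (!model_trylock_page(page))
		return 0;	/* contended: give up rather than wait */
	if (page->mapping)	/* stable while we hold the lock */
		ret = page->mapping->cap_writeback_dirty;
	model_unlock_page(page);
	return ret;
}
```

In this model the mapping pointer is only read while the lock is held,
which is the property Andrew's comment is about; whether the extra atomic
per page is acceptable is exactly the overhead question above.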

--
http://selenic.com : development and support for Mercurial and Linux

