Re: kernel panic on null pointer on page->mem_cgroup

From: Michal Hocko
Date: Mon Aug 21 2017 - 09:23:58 EST


On Mon 21-08-17 09:02:18, Johannes Weiner wrote:
> On Thu, Aug 10, 2017 at 01:56:05PM +0200, Michal Hocko wrote:
> > On Wed 09-08-17 14:38:25, Johannes Weiner wrote:
> > > The issue is that writeback doesn't hold a page reference and the page
> > > might get freed after PG_writeback is cleared (and the mapping is
> > > unlocked) in test_clear_page_writeback(). The stat functions looking
> > > up the page's node or zone are safe, as those attributes are static
> > > across allocation and free cycles. But page->mem_cgroup is not, and it
> > > will get cleared if we race with truncation or migration.
> >
> > Is there anything that prevents us from holding a reference on a page
> > under writeback?
>
> Hm, I'm hesitant to add redundant life-time management to the page
> there just for memcg, which is not always configured in.
>
> Pinning the memcg instead is slightly more complex, but IMO has the
> complexity in a preferrable place.

If that is the single place that needs such a special handling and it is
very likely to stay that way then the additional complexity is probably
justified. I am just worried that this is really subtle and history
tells us that such a code usually kicks us back later.

> Would you agree?

Well, I was not objecting to the patch. It seems correct I am just
worried a robust fix would be preferable. And a clear object life time
sounds like a more robust thing to do.
--
Michal Hocko
SUSE Labs