Re: [PATCH 3/3] mm/page_owner: track page free call chain

From: Joonsoo Kim
Date: Mon Jul 04 2016 - 03:26:38 EST


On Mon, Jul 04, 2016 at 02:45:24PM +0900, Sergey Senozhatsky wrote:
> On (07/04/16 14:29), Joonsoo Kim wrote:
> > > > On Sun, Jul 03, 2016 at 01:16:56AM +0900, Sergey Senozhatsky wrote:
> > > > > Introduce PAGE_OWNER_TRACK_FREE config option to extend page owner with
> > > > > free_pages() tracking functionality. This adds to the dump_page_owner()
> > > > > output an additional backtrace, that tells us what path has freed the
> > > > > page.
> > > >
> > > > Hmm... Do you have other ideas to use this feature? Following example is
> > > > just to detect use-after-free and we have other good tools for it
> > > > (KASAN or DEBUG_PAGEALLOC) so I'm not sure whether it's useful or not.
> > >
> > > there is no kasan for ARM32, for example (apart from the fact that
> > > it's really hard to use kasan sometimes due to its cpu cycles and
> > > memory requirements).
> >
> > Hmm... for debugging purpose, KASAN provides many more things so IMO it's
> > better to implement/support KASAN in ARM32 rather than expand
> > PAGE_OWNER for free.
> >
>
> hm, the last time I checked kasan didn't catch that extra put_page() on

Indeed. It seems that kasan only catches double-free of slab objects.

> x86_64. AFAIK, kasan on ARM32 is a bit hard to do properly
> http://www.serverphorums.com/read.php?12,1206479,1281087#msg-1281087

Okay.

> I've played with kasan on arm32 (an internal custom version)... and
> extended page_owner turned out to be *incomparably* easier and faster
> to use (especially paired with stackdepot).

Okay.

>
> > > educate me, will DEBUG_PAGEALLOC tell us what path has triggered the
> > > extra put_page()? hm... does ARM32 provide ARCH_SUPPORTS_DEBUG_PAGEALLOC?
> >
> > Hmm... Now, I notice that PAGE_OWNER_TRACK_FREE will detect
> > double-free rather than use-after-free.
>
> well, yes. current hits bad_page(), page_owner helps to find out who
> stole and spoiled it from under current.
>
> CPU a CPU b
>
> alloc_page()
> put_page() << legitimate
> alloc_page()
> err:
> put_page() << legitimate, again.
> << but is actually buggy.
>
> put_page() << double free. but we need
> << to report put_page() from
> << CPU a.

Okay. I think that this patch makes finding the offending user easier,
but it looks like only a partial solution for detecting double-free.
See the following example.

CPU a CPU b

alloc_page()
put_page() << legitimate
alloc_page()
err:
put_page() << legitimate, again.
<< but is actually buggy.

alloc_page()

put_page() << legitimate, again.
put_page() << Will report the bug, but page_owner
<< will have the legitimate call stack.
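
To make the difference between the two sequences above concrete, here is a
toy userspace model. It is only a sketch: struct sim_page, sim_alloc(),
sim_put() and the enum site labels are made up for illustration and are not
kernel code. It assumes the free tracking keeps just the most recent free
call site per page and shows that site when a later free hits an
already-free page.

```c
#include <assert.h>

/* Hypothetical labels for the put_page() call sites in the diagrams. */
enum site {
	SITE_NONE,      /* nothing recorded, or the free was accepted */
	SITE_FIRST_PUT, /* first legitimate put_page() */
	SITE_BUGGY_PUT, /* error-path put_page(), the real offender */
	SITE_LATER_PUT, /* later legitimate put_page() after re-allocation */
	SITE_FINAL_PUT, /* the put_page() that finally trips the check */
};

/* Toy stand-in for one physical page; not the kernel's struct page. */
struct sim_page {
	int allocated;            /* 1 while somebody owns the page */
	enum site last_free_site; /* site of the most recent accepted free */
};

static void sim_alloc(struct sim_page *p)
{
	p->allocated = 1;
}

/*
 * Returns SITE_NONE if the free was accepted. If the page was already
 * free, returns the recorded free site, i.e. what the report would show.
 */
static enum site sim_put(struct sim_page *p, enum site site)
{
	if (!p->allocated)
		return p->last_free_site;
	p->allocated = 0;
	p->last_free_site = site;
	return SITE_NONE;
}

/* First sequence: the report points at the buggy put_page(). */
static enum site scenario_simple(void)
{
	struct sim_page page = { 0, SITE_NONE };
	enum site r;

	sim_alloc(&page);
	r = sim_put(&page, SITE_FIRST_PUT);
	assert(r == SITE_NONE);
	sim_alloc(&page);                      /* page handed to another user */
	r = sim_put(&page, SITE_BUGGY_PUT);    /* looks legitimate, is not */
	assert(r == SITE_NONE);
	return sim_put(&page, SITE_FINAL_PUT); /* double free detected here */
}

/* Second sequence: one more alloc/free cycle overwrites the record. */
static enum site scenario_realloc(void)
{
	struct sim_page page = { 0, SITE_NONE };
	enum site r;

	sim_alloc(&page);
	r = sim_put(&page, SITE_FIRST_PUT);
	assert(r == SITE_NONE);
	sim_alloc(&page);
	r = sim_put(&page, SITE_BUGGY_PUT);
	assert(r == SITE_NONE);
	sim_alloc(&page);                      /* page is re-allocated once more */
	r = sim_put(&page, SITE_LATER_PUT);    /* legitimate free, new record */
	assert(r == SITE_NONE);
	return sim_put(&page, SITE_FINAL_PUT); /* detected, but record is stale */
}
```

In this model scenario_simple() returns SITE_BUGGY_PUT, while
scenario_realloc() returns SITE_LATER_PUT, so the offending call site is no
longer what gets reported.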

In kasan, quarantine is used to provide some delay before the real free, and
it makes use-after-free detection more robust. Double-free detection can
also benefit from it. Anyway, I will not object further since it looks like
the simplest way to improve double-free detection for pages,
at least for now.
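
A rough sketch of the quarantine idea, in userspace C. Everything here
(struct q_page, struct quarantine, q_alloc(), q_put(), QUARANTINE_SLOTS) is
hypothetical and is not how kasan is implemented; applying a quarantine to
pages rather than slab objects is also just an assumption for illustration.
The point is only that a freed page which cannot be re-allocated right away
stays in a "not in use" state, so a second free can be caught at the
offending call.

```c
#include <assert.h>
#include <stddef.h>

#define QUARANTINE_SLOTS 2 /* arbitrary size for the sketch */

enum q_state { Q_FREE, Q_IN_USE, Q_QUARANTINED };

/* Toy stand-in for one page; not the kernel's struct page. */
struct q_page {
	enum q_state state;
};

/* Small FIFO of pages whose real free is being delayed. */
struct quarantine {
	struct q_page *slot[QUARANTINE_SLOTS];
	size_t head;  /* index of the oldest entry */
	size_t count; /* number of quarantined pages */
};

/* Returns 0 on success, -1 if the page is not available for allocation. */
static int q_alloc(struct q_page *p)
{
	if (p->state != Q_FREE)
		return -1;
	p->state = Q_IN_USE;
	return 0;
}

/* Returns 0 if the free was accepted, -1 if the page was not in use. */
static int q_put(struct quarantine *q, struct q_page *p)
{
	if (p->state != Q_IN_USE)
		return -1; /* double free: page is free or still quarantined */
	if (q->count == QUARANTINE_SLOTS) {
		/* Full: really free the oldest page to make room. */
		q->slot[q->head]->state = Q_FREE;
		q->head = (q->head + 1) % QUARANTINE_SLOTS;
		q->count--;
	}
	q->slot[(q->head + q->count) % QUARANTINE_SLOTS] = p;
	q->count++;
	p->state = Q_QUARANTINED;
	return 0;
}

/* Returns 1 if the buggy error-path put_page() is caught at its own call. */
static int buggy_put_caught(void)
{
	struct quarantine q = { { NULL }, 0, 0 };
	struct q_page page = { Q_FREE };
	int rc;

	rc = q_alloc(&page);
	assert(rc == 0);
	rc = q_put(&q, &page);         /* legitimate free, page is quarantined */
	assert(rc == 0);
	if (q_alloc(&page) == 0)       /* another user cannot get the page yet */
		return 0;
	return q_put(&q, &page) == -1; /* buggy put hits a non-allocated page */
}
```

The delay only lasts until QUARANTINE_SLOTS other pages have been freed, so
this narrows the window rather than closing it.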

Thanks.