Re: [PATCH 4/4] tracing, page-allocator: Add a postprocessing script for page-allocator-related ftrace events

From: Mel Gorman
Date: Thu Aug 06 2009 - 11:48:55 EST


On Wed, Aug 05, 2009 at 12:27:50PM +0200, Johannes Weiner wrote:
> On Wed, Aug 05, 2009 at 10:07:43AM +0100, Mel Gorman wrote:
>
> > I also decided to just deal with the page allocator and not the MM as a whole
> > figuring that reviewing all MM tracepoints at the same time would be too much
> > to chew on and decide "are these the right tracepoints?". My expectation is
> > that there would need to be at least one set per heading:
> >
> > o page allocator
> > subsys: kmem
> > prefix: mm_page*
> > example use: estimate zone lock contention
> >
> > o slab allocator (already done)
> > subsys: kmem
> > prefix: kmem_* (although this wasn't consistent, e.g. kmalloc vs kmem_kmalloc)
> > example use: measure allocation times for slab, slub, slqb
> >
> > o high-level reclaim, kswapd wakeups, direct reclaim, lumpy triggers
> > subsys: vmscan
> > prefix: mm_vmscan*
> > example use: estimate memory pressure
> >
> > o low-level reclaim, list rotations, pages scanned, types of pages moving etc.
> > subsys: vmscan
> > prefix: mm_vmscan*
> > (debugging VM tunables such as swappiness or why kswapd is so active)
> >
> > The following might also be useful for kernel developers but maybe less
> > useful in general so would be harder to justify.
> >
> > o fault activity, anon, file, swap ins/outs
> > o page cache activity
> > o readahead
> > o VM/FS, writeback, pdflush
> > o hugepage reservations, pool activity, faulting
> > o hotplug
>
> Maybe if more people would tell how they currently use tracepoints in
> the MM we can find some common ground on what can be useful to more
> than one person and why?
>

Not a bad plan at all. I've added a patch describing the kmem trace
points and some notes on how they might be used.
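
As a reference point for the sort of postprocessing the script in this
patch does, counting events per task from raw ftrace output can be
sketched as below. The sample lines and the mm_page_alloc field layout
are assumptions for illustration only, not the final event format:

```python
import re
from collections import Counter

# Hypothetical ftrace output; the mm_page* field layout here is an
# assumption for illustration, not the final event format.
SAMPLE_TRACE = """\
              cc1-3412  [000]  1024.001: mm_page_alloc: page=0xffff8800 order=0
              cc1-3412  [000]  1024.002: mm_page_alloc: page=0xffff8808 order=0
          kswapd0-42    [001]  1024.003: mm_page_free_direct: page=0xffff8800 order=0
"""

# Loose match on the standard ftrace line prefix:
# "<comm>-<pid>  [<cpu>]  <timestamp>: <event>:"
LINE_RE = re.compile(
    r'^\s*(?P<comm>\S+)-(?P<pid>\d+)\s+\[\d+\]\s+[\d.]+:\s+(?P<event>\w+):')

def count_events(trace_text):
    """Count trace events per (comm, event) pair."""
    counts = Counter()
    for line in trace_text.splitlines():
        m = LINE_RE.match(line)
        if m:
            counts[(m.group('comm'), m.group('event'))] += 1
    return counts

counts = count_events(SAMPLE_TRACE)
print(counts[('cc1', 'mm_page_alloc')])             # 2
print(counts[('kswapd0', 'mm_page_free_direct')])   # 1
```

The real script obviously does more (latency pairing, per-zone
breakdowns), but the parse-then-aggregate shape is the same.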

> FWIW, I recently started using tracepoints at the following places for
> looking at swap code behaviour:
>
> o swap slot alloc/free [type, offset]
> o swap slot read/write [type, offset]
> o swapcache add/delete [type, offset]
> o swap fault/evict [page->mapping, page->index, type, offset]
>
> This gives detail beyond vmstat's possibilities at the cost of 8 lines
> of trace_swap_foo() distributed over 5 files.
>
> I have not aggregated the output so far, just looked at the raw data
> and enjoyed reading how the swap slot allocator behaves in reality
> (you can probably integrate the traces into snapshots of the whole
> swap space layout),

Can seekwatcher also show the access pattern for swap? Whether it can or not,
you could use points like that to show what correlation, if any, there is
between location on swap and process ownership.
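
A crude sketch of that correlation, assuming hypothetical
trace_swap_fault samples that carry the faulting task alongside the
[type, offset] fields Johannes describes (the event name and sample
values are placeholders, not real trace data):

```python
from collections import defaultdict

# Placeholder trace_swap_fault samples: (comm, pid, swap type, offset).
# Carrying the faulting task next to [type, offset] is an assumption
# made for this illustration.
events = [
    ("firefox", 2211, 0, 100),
    ("firefox", 2211, 0, 101),
    ("firefox", 2211, 0, 102),
    ("mysqld",  1990, 0, 5000),
    ("mysqld",  1990, 0, 5003),
]

def offsets_by_task(samples):
    """Group swap offsets per task to expose per-process locality."""
    per_task = defaultdict(list)
    for comm, pid, swp_type, offset in samples:
        per_task[(comm, pid)].append(offset)
    return per_task

per_task = offsets_by_task(events)
for (comm, pid), offs in sorted(per_task.items()):
    span = max(offs) - min(offs)
    print(f"{comm}/{pid}: {len(offs)} faults over {span} slots")
```

A tight offset span per task would suggest swap locations do cluster by
process ownership; a wide one that they interleave.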

> what load behaviour triggers insane swap IO
> patterns, in what context is readahead reading the wrong pages etc.,
> stuff you wouldn't see when starting out with statistical
> aggregations.
>
> Now, these data are pretty specialized and probably only a few people
> will make use of them, but OTOH, the cost they impose on the traced
> code is so minuscule that it would be a much greater pain to 1) know
> about and find third-party patches and 2) apply, and possibly
> forward-port, third-party patches.

Somewhat agreed although without seeing the tracepoints and thinking
about how they might be used, I can't say much further.

I think the next round of patches might give a reasonable template on how
tracepoints can be proposed, reviewed and justified.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/