Re: [PATCH v2] sched: Enabled schedstat when schedstat tracepoints are enabled

From: Steven Rostedt
Date: Thu Apr 13 2017 - 10:08:21 EST


On Thu, 13 Apr 2017 11:01:24 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Thu, Apr 13, 2017 at 11:00:05AM +0200, Peter Zijlstra wrote:
> > On Wed, Apr 12, 2017 at 10:56:07PM -0400, Steven Rostedt wrote:
> > > From: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
> > >
> > > During my tests, I see this in my dmesg:
> > >
> > > "Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and
> > > stat_runtime require the kernel parameter schedstats=enabled or
> > > kernel.sched_schedstats=1"
> > >
> > > And found the commit:
> > >
> > > cb2517653fc ("sched/debug: Make schedstats a runtime tunable that is
> > > disabled by default")
> > >
> > > Which states:
> > >
> > > "For tracepoints, there is a simple warning as it's not safe to activate
> > > schedstats in the context when it's known the tracepoint may be wanted
> > > but is unavailable."
> > >
> > > I'm assuming that Mel did not know about the TRACE_EVENT_FN() and
> > > DEFINE_EVENT_FN() that allow for callbacks for tracepoints as they are
> > > enabled and disabled. I do not see any reason for not enabling
> > > schedstat when one of its tracepoints is enabled.
> > >
> > > The state of schedstat is saved when the first tracepoint is enabled,
> > > and that state is put back when the tracepoints are disabled.
> >
> > There is one additional complication with all this.
> >
> > Dynamically enabling the sched_stats like this doesn't guarantee correct
> > information. So you've now taken away the informational print and
> > silently generate bollocks numbers.
>
> Josh has a patch like:
>
> http://lkml.kernel.org/r/00e8805cc094657d5ccb20bb88e0d2ce1cbb92e1.1466184592.git.jpoimboe@xxxxxxxxxx
>
> That tries to keep the few stats required for the tracepoint active at
> all times. But I never managed to get performance overhead to 0.

Interesting. I may take a look at that patch too. What benchmark did
you use to measure performance?

Thanks!

-- Steve