Re: [PATCH] perf: Fix mux_interval hrtimer wreckage

From: Peter Zijlstra
Date: Tue May 27 2014 - 10:21:20 EST


On Tue, May 27, 2014 at 04:09:48PM +0200, Stephane Eranian wrote:
> On Tue, May 20, 2014 at 11:02 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > Subject: perf: Fix mux_interval hrtimer wreckage
> > From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Date: Tue May 20 10:09:32 CEST 2014
> >
> > Thomas stumbled over the hrtimer_forward_now() in
> > perf_event_mux_interval_ms_store() and noticed its broken-ness.
> >
> > You cannot just change the expiry time of an active timer; doing so
> > will destroy the red-black tree order and cause havoc.
> >
> > Change it to (re)start the timer instead; (re)starting a timer will
> > dequeue and enqueue it and therefore preserve rb-tree order.
> >
> > Since we cannot enqueue remotely, wrap the thing in
> > cpu_function_call(), this however mandates that we restrict ourselves
> > to online cpus. Also serialize the entire setting so we don't get
> > multiple concurrent threads trying to update to different values.
> >
> > Also fix a problem in perf_mux_hrtimer_restart(), checking against
> > hrtimer_active() can actually lose us the timer when timer->state ==
> > HRTIMER_STATE_CALLBACK and the callback has already decided NORESTART.
> >
> > Furthermore it doesn't make any sense to test
> > hrtimer_callback_running() when we already tested hrtimer_active(),
> > but with the above change, we explicitly must call it when
> > callback_running.
> >
> > Lastly, rename a few functions:
> >
> > s/perf_cpu_hrtimer_/perf_mux_hrtimer_/ -- because I could not find
> > the mux timer function
> >
> > s/\<hr\>/timer/ -- because that's the normal way of calling things.
> >
> > Fixes: 62b856397927 ("perf: Add sysfs entry to adjust multiplexing interval per PMU")
> > Cc: Stephane Eranian <eranian@xxxxxxxxxx>
> > Reported-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Link: http://lkml.kernel.org/n/tip-ife5kqgnt7mviatc9fakz8wk@xxxxxxxxxxxxxx
>
> So, I tested this patch on tip.git and it panics my kernels as soon as
> I multiplex events. For instance, running:
> $ perf stat -e cycles,cycles,cycles,cycles,cycles,cycles dd if=/dev/urandom of=/dev/null count=10000000
>

Yeah, I hadn't actually tested it, but I did find more hrtimer wreckage
in the meantime and I've not yet figured out how to fix it, so I put this
on hold for a little while.

I'll try and get the lot sorted soon though.
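
For illustration only, and not taken from the patch: a minimal userspace
sketch in C of why rewriting the expiry of a queued timer breaks ordering,
while a dequeue/enqueue restart preserves it. Every name here (toy_timer,
toy_queue, toy_restart, ...) is hypothetical, and a sorted linked list
stands in for the rb-tree; the real hrtimer code is considerably more
involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued timer; NOT the kernel's struct hrtimer. */
struct toy_timer {
	long expires;
	bool queued;
	struct toy_timer *next;
};

/* A list kept sorted by expiry, standing in for the rb-tree. */
struct toy_queue {
	struct toy_timer *head;
};

/* Insert at the position dictated by the timer's current expiry. */
static void toy_enqueue(struct toy_queue *q, struct toy_timer *t)
{
	struct toy_timer **pos = &q->head;

	assert(!t->queued);
	while (*pos && (*pos)->expires <= t->expires)
		pos = &(*pos)->next;
	t->next = *pos;
	*pos = t;
	t->queued = true;
}

/* Unlink the timer; a no-op if it is not in the queue. */
static void toy_dequeue(struct toy_queue *q, struct toy_timer *t)
{
	struct toy_timer **pos = &q->head;

	while (*pos && *pos != t)
		pos = &(*pos)->next;
	if (*pos) {
		*pos = t->next;
		t->next = NULL;
		t->queued = false;
	}
}

/* Broken pattern: the key changes while the timer stays where it was. */
static void toy_forward_in_place(struct toy_timer *t, long expires)
{
	t->expires = expires;
}

/* Restart pattern: dequeue, change the key, enqueue at the right spot. */
static void toy_restart(struct toy_queue *q, struct toy_timer *t, long expires)
{
	if (t->queued)
		toy_dequeue(q, t);
	t->expires = expires;
	toy_enqueue(q, t);
}

/* True if every timer expires no later than its successor. */
static bool toy_queue_sorted(const struct toy_queue *q)
{
	const struct toy_timer *t;

	for (t = q->head; t && t->next; t = t->next)
		if (t->expires > t->next->expires)
			return false;
	return true;
}
```

With timers queued at 10, 20 and 30, toy_forward_in_place() on the first
one to 25 leaves the list as 25, 20, 30, which is no longer sorted;
toy_restart() with the same value yields 20, 25, 30.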