Re: [PATCH v8] powercap/drivers/idle_injection: Add an idle injection framework

From: Viresh Kumar
Date: Tue Jun 19 2018 - 04:49:19 EST


On 19-06-18, 10:00, Daniel Lezcano wrote:
> On 19/06/2018 08:22, Viresh Kumar wrote:
> > On 19-06-18, 07:58, Daniel Lezcano wrote:
> >> +++ b/drivers/powercap/idle_injection.c
> >> @@ -0,0 +1,375 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * Copyright 2018 Linaro Limited
> >> + *
> >> + * Author: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> >> + *
> >> + * The idle injection framework proposes a way to force a cpu to enter
> >> + * an idle state during a specified amount of time for a specified
> >> + * period.
> >> + *
> >> + * It relies on the smpboot kthreads which handles, via its main loop,
> >> + * the common code for hotplugging and [un]parking.
> >> + *
> >> + * At init time, all the kthreads are created.
> >> + *
> >> + * A cpumask is specified as parameter for the idle injection
> >> + * registering function. The kthreads will be synchronized regarding
> >> + * this cpumask.
> >> + *
> >> + * The idle + run duration is specified via the helpers and then the
> >> + * idle injection can be started at this point.
> >> + *
> >> + * A kthread will call play_idle() with the specified idle duration
> >> + * from above.
> >> + *
> >> + * A timer is set after waking up all the tasks, to the next idle
> >> + * injection cycle.
> >> + *
> >> + * The task handling the timer interrupt will wakeup all the kthreads
> >> + * belonging to the cpumask.
> >> + *
> >> + * Stopping the idle injection is synchonuous, when the function
> >
> > synchronous
> >
> >> + * returns, there is the guarantee there is no more idle injection
> >> + * kthread in activity.
> >> + *
> >> + * It is up to the user of this framework to provide a lock at an
> >> + * upper level to prevent stupid things to happen, like starting while
> >> + * we are unregistering.
> >> + */
> >
> > >> +static void idle_injection_wakeup(struct idle_injection_device *ii_dev)
> > >> +{
> > >> +	struct idle_injection_thread *iit;
> > >> +	unsigned int cpu;
> > >> +
> > >> +	for_each_cpu_and(cpu, to_cpumask(ii_dev->cpumask), cpu_online_mask) {
> > >> +		iit = per_cpu_ptr(&idle_injection_thread, cpu);
> > >> +		iit->should_run = 1;
> > >> +		wake_up_process(iit->tsk);
> > >> +	}
> > >> +}
> >
> > Thread A                                Thread B
> >
> > CPU3 hotplug out
> > -> idle_injection_park()
> >    iit(of-CPU3)->should_run = 0;
> >
> >                                         idle_injection_wakeup()
> >                                           for_each_cpu_and(online)..
> >                                             CPU3-selected
> >    clear CPU3 from cpu-online mask.
> >
> >
> >                                         iit(of-CPU3)->should_run = 1;
> >                                         wake_up_process()
> >
> > With the above sequence of events, is it possible that the iit->should_run
> > variable is set to 1 while the CPU is offlined ? And so the crash we discussed
> > in the previous version may still exist ? Sorry I am not able to take my mind
> > away from thinking about these stupid races :(
>
> If I refer to Peter's previous comment about a similar race, I think it
> is possible.
>
> I guess setting the should_run flag to zero in the unpark() should also
> fix the issue.

Right. But since you are already taking the hotplug lock in stop-idle-injection,
you can iterate over all CPUs of a mask instead of the online ones. That would
be one callback less to run at every unpark (though there wouldn't be so many of
them I believe).
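
To make the difference concrete, here is a minimal userspace model of the
race above and of the two ways the stop path could clear the flag. It is
only a sketch and not kernel code: struct ii_model and every model_*()
helper are hypothetical names, plain bool arrays stand in for the cpumask,
cpu_online_mask and the per-CPU should_run field, and the wakeup loop is
split into a "select" and a "commit" step purely so that a hotplug can be
interleaved between them. The hotplug lock itself is not modelled; the
stop helpers are assumed to run after the interleaving has finished.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* arbitrary size for the model */

/* Hypothetical userspace stand-in for the state discussed in this thread. */
struct ii_model {
	bool in_mask[NR_CPUS];		/* CPUs of the idle injection device */
	bool online[NR_CPUS];		/* stand-in for cpu_online_mask */
	bool should_run[NR_CPUS];	/* stand-in for iit->should_run */
};

/* Thread A, step 1: the park callback clears the flag. */
static void model_park(struct ii_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->should_run[cpu] = false;
}

/* Thread A, step 2: the CPU is removed from the online mask. */
static void model_set_offline(struct ii_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->online[cpu] = false;
}

/* Thread B, step 1: pick the CPUs that are in the mask and online. */
static void model_wakeup_select(const struct ii_model *m,
				bool selected[NR_CPUS])
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		selected[cpu] = m->in_mask[cpu] && m->online[cpu];
}

/* Thread B, step 2: set should_run on the CPUs picked earlier. */
static void model_wakeup_commit(struct ii_model *m,
				const bool selected[NR_CPUS])
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (selected[cpu])
			m->should_run[cpu] = true;
}

/* Stop variant that only walks the online CPUs of the mask. */
static void model_stop_online_only(struct ii_model *m)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (m->in_mask[cpu] && m->online[cpu])
			m->should_run[cpu] = false;
}

/* Stop variant that walks every CPU of the mask, online or not. */
static void model_stop_all_in_mask(struct ii_model *m)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (m->in_mask[cpu])
			m->should_run[cpu] = false;
}

/* True if an offline CPU still has should_run set. */
static bool model_stale_flag(const struct ii_model *m)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (!m->online[cpu] && m->should_run[cpu])
			return true;
	return false;
}
```

Replaying the sequence from the diagram in this model (park CPU3, select,
take CPU3 offline, commit) leaves CPU3 offline with should_run set. The
online-only stop does not clear it, while the stop that walks the whole
mask does, which is why no extra reset would be needed at unpark time.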

--
viresh