Re: [PATCH V5 1/2] base/drivers/arch_topology: Replace mutex with READ_ONCE / WRITE_ONCE

From: Juri Lelli
Date: Thu Nov 29 2018 - 02:04:47 EST


On 28/11/18 18:54, Daniel Lezcano wrote:
> On 28/11/2018 12:44, Juri Lelli wrote:
> > Hi Daniel,
> >
> > On 27/11/18 14:24, Daniel Lezcano wrote:
> >> The mutex protects a per_cpu variable access. The potential race can
> >> happen only when the cpufreq governor module is loaded and at the same
> >> time the cpu capacity is changed in the sysfs.
> >>
> >> There is no real interest of using a mutex to protect a variable
> >> assignation when there is no situation where a task can take the lock
> >> and block.
> >>
> >> Replace the mutex by READ_ONCE / WRITE_ONCE.
> >>
> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> >> Cc: Sudeep Holla <sudeep.holla@xxxxxxx>
> >> Reviewed-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> >> ---
> >> drivers/base/arch_topology.c | 7 +------
> >> include/linux/arch_topology.h | 2 +-
> >> 2 files changed, 2 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> >> index edfcf8d..fd5325b 100644
> >> --- a/drivers/base/arch_topology.c
> >> +++ b/drivers/base/arch_topology.c
> >> @@ -31,12 +31,11 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
> >> per_cpu(freq_scale, i) = scale;
> >> }
> >>
> >> -static DEFINE_MUTEX(cpu_scale_mutex);
> >> DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
> >>
> >> void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
> >> {
> >> - per_cpu(cpu_scale, cpu) = capacity;
> >> + WRITE_ONCE(per_cpu(cpu_scale, cpu), capacity);
> >> }
> >>
> >> static ssize_t cpu_capacity_show(struct device *dev,
> >> @@ -71,10 +70,8 @@ static ssize_t cpu_capacity_store(struct device *dev,
> >> if (new_capacity > SCHED_CAPACITY_SCALE)
> >> return -EINVAL;
> >>
> >> - mutex_lock(&cpu_scale_mutex);
> >> for_each_cpu(i, &cpu_topology[this_cpu].core_sibling)
> >> topology_set_cpu_scale(i, new_capacity);
> >> - mutex_unlock(&cpu_scale_mutex);
> >
> > IIRC this was meant to ensure atomic updates of all siblings with the new
> > capacity value. I actually now wonder if readers should not grab the
> > mutex as well (cpu_capacity_show()). Can't we get into a situation where
> > a reader might see siblings with intermediate values (while the loop
> > above is performing an update)?
>
> With or without this patch, it is the case:
>
> task1 task2
> | |
> read("/sys/.../cpu1/cpu_capacity) |
> | write("/sys/.../cpu1/cpu_capacity")
> read("/sys/.../cpu2/cpu_capacity) |
>
>
> There is no guarantee userspace can have a consistent view of the
> capacity. As soon as it reads a capacity, it can be changed in its back.

True, but w/o the mutex task1 could read different cpu_capacity values
for a cluster (it actually can also with current implementation, we
should grab the mutex in the read path as well if we want to avoid
this). Just pointing this out, not sure it is a problem, though, as even
the notion of strictly equal capacities clusters is going to disappear
AFAIK.