Re: [PATCH -next] power/qos: fix a data race in pm_qos_*_value

From: Qian Cai
Date: Mon Feb 24 2020 - 14:02:46 EST


On Mon, 2020-02-24 at 10:54 +0100, Rafael J. Wysocki wrote:
> On Mon, Feb 24, 2020 at 2:01 AM Qian Cai <cai@xxxxxx> wrote:
> >
> >
> >
> > > On Feb 23, 2020, at 7:12 PM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > >
> > > It may be a bug under certain conditions, but you don't mention what
> > > conditions they are. Reporting it as a general bug is not accurate at
> > > the very least.
> >
> > Could we rule out load tearing, store tearing and reload of global_req in cpuidle_governor_latency() for all compilers and architectures which could introduce logic bugs?
> >
> > int global_req = cpu_latency_qos_limit();
> >
> > if (device_req > global_req)
> > device_req = global_req;
> >
> > If under register pressure, the compiler might get ride of the tmp variable, i.e.,
> >
> > If (device_req > cpu_latency_qos_limit())
> > â-> race with the writer.
> > device_req = cpu_latency_qos_limit();
>
> Yes, there is a race here with or without the WRITE_ONCE()/READ_ONCE()
> annotations (note that these annotations don't prevent CPUs from
> reordering things, so device_req may be set before global_req
> regardless).
>
> However, worst-case it may cause an old value to be used and that can
> happen anyway if the entire cpuidle_governor_latency_req() runs
> between the curr_value update and pm_qos_set_value() in
> pm_qos_update_target(), for example.
>
> IOW, there is no guarantee that the new value will be used immediately
> after updating a QoS request anyway.
>
> I agree with adding the annotations (I was considering posting a patch
> doing that myself), but just as a matter of making the intention
> clear.

OK, how about this updated texts?

[PATCH -next] power/qos: annotate a data race in pm_qos_*_value

cpu_latency_constraints.target_value could be accessed concurrently via,

cpu_latency_qos_apply
 pm_qos_update_target
 pm_qos_set_value

cpuidle_governor_latency_req
cpu_latency_qos_limit
pm_qos_read_value

The read is outside pm_qos_lock critical section which results in a data race.
However, the worst case is that an old value to be used and that can happen
anyway, so annotate this data race using a pair of READ|WRITE_ONCE().