Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates

From: Steven Rostedt
Date: Fri Aug 16 2019 - 12:25:47 EST


On Fri, 16 Aug 2019 10:26:43 -0400
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:

> Reading the sched_cmdline_ref and sched_tgid_ref initial state within
> tracing_start_sched_switch without holding the sched_register_mutex is
> racy against concurrent updates, which can lead to tracepoint probes
> being registered more than once (and thus trigger warnings within
> tracepoint.c).
>
> Also, write and read to/from those variables should be done with
> WRITE_ONCE() and READ_ONCE(), given that those are read within tracing
> probes without holding the sched_register_mutex.
>

I understand the READ_ONCE(), but is the WRITE_ONCE() truly necessary?
The writes are done while holding the mutex. It's not that critical of
a path, and WRITE_ONCE() makes the code look ugly.

-- Steve
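
For readers following the question above, here is a minimal userspace
sketch of the pattern under discussion: counters updated under a mutex
but read by probes that take no lock. The READ_ONCE()/WRITE_ONCE()
definitions below are simplified stand-ins written for this illustration
(the kernel's real macros are more elaborate), probe_flags(), ref_get()
and ref_put() are hypothetical helper names, and pthreads stands in for
the kernel mutex. The mutex serializes writers against each other but
does not order them against the lockless reader, which is the usual
argument for marking the store as well as the load; whether that is
worth the noise in this path is the judgment call being raised.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Simplified userspace stand-ins for the kernel macros (assumption:
 * scalar types only, GCC/Clang __typeof__ extension available).
 * The volatile access forces a single load/store that the compiler
 * may not tear, fuse, or repeat.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define RECORD_CMDLINE 1
#define RECORD_TGID    2

static int sched_cmdline_ref;
static int sched_tgid_ref;
static pthread_mutex_t sched_register_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Lockless reader, modelled on the flag computation in the probes. */
static int probe_flags(void)
{
	return (RECORD_TGID * !!READ_ONCE(sched_tgid_ref)) +
	       (RECORD_CMDLINE * !!READ_ONCE(sched_cmdline_ref));
}

/* Writer: the mutex excludes other writers, not the reader above. */
static void ref_get(int *ref)
{
	pthread_mutex_lock(&sched_register_mutex);
	/* Plain read is fine here: all writers hold the mutex. */
	WRITE_ONCE(*ref, *ref + 1);
	pthread_mutex_unlock(&sched_register_mutex);
}

static void ref_put(int *ref)
{
	pthread_mutex_lock(&sched_register_mutex);
	assert(*ref > 0);	/* unbalanced put would be a caller bug */
	WRITE_ONCE(*ref, *ref - 1);
	pthread_mutex_unlock(&sched_register_mutex);
}
```

Note that only the store needs the marking on the writer side; the read
of the old value inside the critical section cannot race with another
writer.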



> [ Compile-tested only. I suspect it might fix the following syzbot
> report:
>
> syzbot+774fddf07b7ab29a1e55@xxxxxxxxxxxxxxxxxxxxxxxxx ]
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> CC: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> CC: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
> CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CC: Paul E. McKenney <paulmck@xxxxxxxxxxxxx>
> ---
> kernel/trace/trace_sched_switch.c | 32 ++++++++++++++++++++++----------
> 1 file changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/trace/trace_sched_switch.c b/kernel/trace/trace_sched_switch.c
> index e288168661e1..902e8bf59aeb 100644
> --- a/kernel/trace/trace_sched_switch.c
> +++ b/kernel/trace/trace_sched_switch.c
> @@ -26,8 +26,8 @@ probe_sched_switch(void *ignore, bool preempt,
>  {
>  	int flags;
>  
> -	flags = (RECORD_TGID * !!sched_tgid_ref) +
> -		(RECORD_CMDLINE * !!sched_cmdline_ref);
> +	flags = (RECORD_TGID * !!READ_ONCE(sched_tgid_ref)) +
> +		(RECORD_CMDLINE * !!READ_ONCE(sched_cmdline_ref));
>  
>  	if (!flags)
>  		return;
> @@ -39,8 +39,8 @@ probe_sched_wakeup(void *ignore, struct task_struct *wakee)
>  {
>  	int flags;
>  
> -	flags = (RECORD_TGID * !!sched_tgid_ref) +
> -		(RECORD_CMDLINE * !!sched_cmdline_ref);
> +	flags = (RECORD_TGID * !!READ_ONCE(sched_tgid_ref)) +
> +		(RECORD_CMDLINE * !!READ_ONCE(sched_cmdline_ref));
>  
>  	if (!flags)
>  		return;
> @@ -89,21 +89,28 @@ static void tracing_sched_unregister(void)
>
>  static void tracing_start_sched_switch(int ops)
>  {
> -	bool sched_register = (!sched_cmdline_ref && !sched_tgid_ref);
> +	bool sched_register;
> +
>  	mutex_lock(&sched_register_mutex);
> +	sched_register = (!sched_cmdline_ref && !sched_tgid_ref);
>  
>  	switch (ops) {
>  	case RECORD_CMDLINE:
> -		sched_cmdline_ref++;
> +		WRITE_ONCE(sched_cmdline_ref, sched_cmdline_ref + 1);
>  		break;
>  
>  	case RECORD_TGID:
> -		sched_tgid_ref++;
> +		WRITE_ONCE(sched_tgid_ref, sched_tgid_ref + 1);
>  		break;
> +
> +	default:
> +		WARN_ONCE(1, "Unsupported tracing op: %d", ops);
> +		goto end;
>  	}
>  
> -	if (sched_register && (sched_cmdline_ref || sched_tgid_ref))
> +	if (sched_register)
>  		tracing_sched_register();
> +end:
>  	mutex_unlock(&sched_register_mutex);
>  }
>
> @@ -113,16 +120,21 @@ static void tracing_stop_sched_switch(int ops)
>
>  	switch (ops) {
>  	case RECORD_CMDLINE:
> -		sched_cmdline_ref--;
> +		WRITE_ONCE(sched_cmdline_ref, sched_cmdline_ref - 1);
>  		break;
>  
>  	case RECORD_TGID:
> -		sched_tgid_ref--;
> +		WRITE_ONCE(sched_tgid_ref, sched_tgid_ref - 1);
>  		break;
> +
> +	default:
> +		WARN_ONCE(1, "Unsupported tracing op: %d", ops);
> +		goto end;
>  	}
>  
>  	if (!sched_cmdline_ref && !sched_tgid_ref)
>  		tracing_sched_unregister();
> +end:
>  	mutex_unlock(&sched_register_mutex);
>  }
>
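
The registration logic in the quoted patch can be sketched in userspace
to show why the "was everything zero?" check has to be sampled inside
the critical section. This is an illustration under stated assumptions,
not the kernel code: pthreads replaces the kernel mutex,
probes_registered is a hypothetical stand-in for the tracepoint
registration state, and the assert() calls play the role of the
double-registration warnings in tracepoint.c. The function names are
shortened to make clear they are not the kernel's.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define RECORD_CMDLINE 1
#define RECORD_TGID    2

static int sched_cmdline_ref;
static int sched_tgid_ref;
static int probes_registered;	/* hypothetical: 1 while probes are attached */
static pthread_mutex_t sched_register_mutex = PTHREAD_MUTEX_INITIALIZER;

static void sched_register(void)
{
	assert(probes_registered == 0);	/* registering twice is the bug */
	probes_registered = 1;
}

static void sched_unregister(void)
{
	assert(probes_registered == 1);
	probes_registered = 0;
}

static void start_sched_switch(int ops)
{
	bool first;

	pthread_mutex_lock(&sched_register_mutex);
	/* Sample the initial state only after taking the mutex. */
	first = (!sched_cmdline_ref && !sched_tgid_ref);

	switch (ops) {
	case RECORD_CMDLINE:
		sched_cmdline_ref++;
		break;
	case RECORD_TGID:
		sched_tgid_ref++;
		break;
	default:
		goto end;	/* unknown op: change nothing */
	}

	if (first)
		sched_register();
end:
	pthread_mutex_unlock(&sched_register_mutex);
}

static void stop_sched_switch(int ops)
{
	pthread_mutex_lock(&sched_register_mutex);

	switch (ops) {
	case RECORD_CMDLINE:
		sched_cmdline_ref--;
		break;
	case RECORD_TGID:
		sched_tgid_ref--;
		break;
	default:
		goto end;
	}

	if (!sched_cmdline_ref && !sched_tgid_ref)
		sched_unregister();
end:
	pthread_mutex_unlock(&sched_register_mutex);
}
```

If the initial-state check were done before pthread_mutex_lock(), two
concurrent callers could both observe zero counts and both call
sched_register(), which is the double registration the patch
description refers to.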