Re: [PATCH] ftrace: Add missing check for existing hwlat thread

From: Steven Rostedt
Date: Wed Aug 01 2018 - 15:40:11 EST


On Wed, 1 Aug 2018 12:45:54 +0200
Erica Bugden <erica.bugden@xxxxxxxxxxxxx> wrote:

> The hwlat tracer uses a kernel thread to measure latencies. The function
> that creates this kernel thread, start_kthread(), can be called when the
> tracer is initialized and when the tracer is explicitly enabled.
> start_kthread() does not check if there is an existing hwlat kernel
> thread and will create a new one each time it is called.
>
> This causes the reference to the previous thread to be lost. Without the
> thread reference, the old kernel thread becomes unstoppable and
> continues to use CPU time even after the hwlat tracer has been disabled.
> This problem can be observed when a system is booted with tracing
> enabled and the hwlat tracer is configured like this:
>
> echo hwlat > current_tracer; echo 1 > tracing_on
>
> Add the missing check for an existing kernel thread in start_kthread()
> to prevent this problem. This function and the rest of the hwlat kernel
> thread setup and teardown are already serialized because they are called
> through the tracer core code with trace_types_lock held.
>
> Signed-off-by: Erica Bugden <erica.bugden@xxxxxxxxxxxxx>
> ---
> kernel/trace/trace_hwlat.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
> index d7c8e4e..2d9d36d 100644
> --- a/kernel/trace/trace_hwlat.c
> +++ b/kernel/trace/trace_hwlat.c
> @@ -354,6 +354,9 @@ static int start_kthread(struct trace_array *tr)
>  	struct task_struct *kthread;
>  	int next_cpu;
>
> +	if (hwlat_kthread)
> +		return 0;
> +

This looks like it is treating the symptom and not the disease.

>  	/* Just pick the first CPU on first iteration */
>  	current_mask = &save_cpumask;
>  	get_online_cpus();

Can you try this patch?
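
To make the symptom-versus-disease distinction concrete, here is a minimal
user-space sketch of the idea, not kernel code. Every name in it (toy_tracer,
toy_start, toy_stop, toy_write_unpatched, toy_write_patched, toy_demo) is
hypothetical, and the "off" branch of the write handler is an assumption
since it is not visible in the hunk. It models a write to tracing_on that
either always runs the tracer's start callback or skips the call when the
requested state already matches the current one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tracer state; not the kernel's types. */
struct toy_tracer {
	bool tracing_on;   /* models what tracer_tracing_is_on(tr) reports */
	int start_calls;   /* times the ->start() callback was invoked */
	int stop_calls;    /* times the ->stop() callback was invoked */
};

static void toy_start(struct toy_tracer *t) { t->start_calls++; }
static void toy_stop(struct toy_tracer *t)  { t->stop_calls++; }

/* Models the handler before the change: callbacks run on every write. */
static void toy_write_unpatched(struct toy_tracer *t, unsigned long val)
{
	if (val) {
		t->tracing_on = true;
		toy_start(t);
	} else {
		/* Assumed shape of the "off" branch. */
		t->tracing_on = false;
		toy_stop(t);
	}
}

/* Models the handler after the change: a write that does not change
 * the on/off state is a no-op, so ->start() cannot run twice in a row. */
static void toy_write_patched(struct toy_tracer *t, unsigned long val)
{
	if (!!val == t->tracing_on)
		return;
	toy_write_unpatched(t, val);
}

/* Usage: begin with tracing already on, as after booting with tracing
 * enabled, then write 1, 0, 1. Returns the number of start calls. */
static int toy_demo(void)
{
	struct toy_tracer t = { .tracing_on = true };

	toy_write_patched(&t, 1);   /* same state: no second start */
	assert(t.start_calls == 0);
	toy_write_patched(&t, 0);   /* real transition: stop runs once */
	toy_write_patched(&t, 1);   /* real transition: start runs once */
	assert(t.stop_calls == 1);
	return t.start_calls;
}
```

With the unpatched variant, the first write of 1 in that sequence would
invoke the start callback while tracing is already on, which is the path
that leads to a second hwlat thread being created.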

-- Steve

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 823687997b01..15862044db05 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7628,7 +7628,9 @@ rb_simple_write(struct file *filp, const char __user *ubuf,

 	if (buffer) {
 		mutex_lock(&trace_types_lock);
-		if (val) {
+		if (!!val == tracer_tracing_is_on(tr)) {
+			val = 0; /* do nothing */
+		} else if (val) {
 			tracer_tracing_on(tr);
 			if (tr->current_trace->start)
 				tr->current_trace->start(tr);