Re: [PATCH] perf/core: Install cgroup event via IPI

From: Peter Zijlstra
Date: Mon Jan 20 2020 - 03:50:45 EST


On Thu, Jan 16, 2020 at 09:25:55AM -0800, Song Liu wrote:
> cgroup events in OFF state cannot be installed without IPI, otherwise, it
> may trigger the following calltrace with CONFIG_DEBUG_LIST:
>
> [ 31.776974] ------------[ cut here ]------------
> [ 31.777570] list_add double add: new=ffff888ff7cf0db0, prev=ffff888ff7ce82f0, next=ffff888ff7cf0db0.
> [ 31.778737] WARNING: CPU: 3 PID: 1186 at lib/list_debug.c:31 __list_add_valid+0x67/0x70
> [ 31.779745] Modules linked in:
> [ 31.780138] CPU: 3 PID: 1186 Comm: perf Tainted: G W 5.5.0-rc6+ #3962
> [ 31.781125] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> [ 31.782199] RIP: 0010:__list_add_valid+0x67/0x70

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index a1f8bde19b56..36e8fe27e2a1 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2682,14 +2682,18 @@ perf_install_in_context(struct perf_event_context *ctx,
> smp_store_release(&event->ctx, ctx);
>
> /*
> - * perf_event_attr::disabled events will not run and can be initialized
> - * without IPI. Except when this is the first event for the context, in
> - * that case we need the magic of the IPI to set ctx->is_active.
> + * perf_event_attr::disabled events will not run and can be
> + * initialized without IPI. Except:
> + * 1. when this is the first event for the context, in that case
> + * we need the magic of the IPI to set ctx->is_active;
> + * 2. cgroup event in OFF state, because it is installed in the
> + * cpuctx.
> *
> * The IOC_ENABLE that is sure to follow the creation of a disabled
> * event will issue the IPI and reprogram the hardware.
> */
> - if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF && ctx->nr_events) {
> + if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF &&
> + !is_cgroup_event(event) && ctx->nr_events) {
> raw_spin_lock_irq(&ctx->lock);
> if (ctx->task == TASK_TOMBSTONE) {
> raw_spin_unlock_irq(&ctx->lock);

I don't think this is right. Because cgroup events are always per-cpu
events, ctx == &cpuctx->ctx, so the locking should work out just fine.

What does appear to be the problem is that:

  add_event_to_ctx()
    list_update_cgroup_event()
      cpuctx = __get_cpu_context(ctx)

uses this_cpu_ptr() and we're now calling it from the 'wrong' CPU.
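
To illustrate the difference, here is a minimal userspace sketch (not kernel
code). The struct layouts, the cpu_contexts[] array, current_cpu and the
lookup_by_*() helpers are all hypothetical stand-ins; only the container_of()
idea and the ctx-embedded-in-cpuctx relationship come from the discussion
above. A lookup keyed on the calling CPU can return the wrong cpuctx when the
caller runs elsewhere, while a lookup derived from @ctx cannot:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct perf_event_context {
	int nr_events;
};

struct perf_cpu_context {
	int cpu;			/* illustration only */
	struct perf_event_context ctx;	/* embedded context */
};

static struct perf_cpu_context cpu_contexts[NR_CPUS];
static int current_cpu;	/* models the CPU the caller is running on */

/* Models a this_cpu_ptr() style lookup: result depends on the calling CPU. */
static struct perf_cpu_context *lookup_by_current_cpu(void)
{
	return &cpu_contexts[current_cpu];
}

/* Models the container_of() lookup: result depends only on @ctx. */
static struct perf_cpu_context *
lookup_by_container(struct perf_event_context *ctx)
{
	return container_of(ctx, struct perf_cpu_context, ctx);
}

/* Install, from CPU 1, an event whose context belongs to CPU 3. */
static void demo(void)
{
	struct perf_event_context *ctx = &cpu_contexts[3].ctx;

	current_cpu = 1;
	assert(lookup_by_container(ctx) == &cpu_contexts[3]);
	assert(lookup_by_current_cpu() != &cpu_contexts[3]);
}
```

In this model the two lookups agree only when current_cpu happens to match
the CPU that owns @ctx, which is exactly the case the IPI used to guarantee.
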

But I'm thinking the below should also work just fine, hmm?

---

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2173c23c25b4..2c6134604811 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -951,9 +951,9 @@ list_update_cgroup_event(struct perf_event *event,

/*
* Because cgroup events are always per-cpu events,
- * this will always be called from the right CPU.
+ * @ctx == &cpuctx->ctx.
*/
- cpuctx = __get_cpu_context(ctx);
+ cpuctx = container_of(ctx, struct perf_cpu_context, ctx);

/*
* Since setting cpuctx->cgrp is conditional on the current @cgrp