Re: [PATCH 1/1] perf/core: fix dangling cgroup pointer in cpuctx
From: David Wang
Date: Tue Jun 03 2025 - 10:04:21 EST
At 2025-06-03 21:41:30, "Yeoreum Yun" <yeoreum.yun@xxxxxxx> wrote:
>Hi David,
>
>> >
>> > > But to fix it, isn't the following change less aggressive?
>> > > event_sched_out(event, ctx);
>> > > - perf_event_set_state(event, min(event->state, state));
>> > > if (flags & DETACH_GROUP)
>> > > perf_group_detach(event);
>> > > if (flags & DETACH_CHILD)
>> > > perf_child_detach(event);
>> > > list_del_event(event, ctx);
>> > > + perf_event_set_state(event, min(event->state, state));
>> >
>> > If perf_child_detach() is called first and perf_event_set_state() is called afterwards,
>> > the parent has already been removed in perf_child_detach(),
>> > so it would fail to account the total_enable_time, which includes
>> > the child event's enable_time too.
>>
>> Thanks for clarifying this,
>> So the whole point of commit a3c3c6667 is to make perf_event_set_state() happen before perf_child_detach(), right?
>> I feel I got lost somewhere when I rushed to this suggestion. But I still don't understand why my patch v1 breaks commit
>> a3c3c6667, really confused.
>
>I explained this in:
> https://lore.kernel.org/all/5d17f1d7.666d.197348b78d1.Coremail.00107082@xxxxxxx/
>
>>> If there is a child event bound to a specific cpu, say cpu 0:
>>> 1. cpu 0 -> active
>>> 2. scheduled to cpu1 -> inactive
>>> 3. close the cpu event from parent -> inactive close
>>>
>>> This can fail to count total_enable_time.
>
>
>Consider an event which is attached to a taskctx with a specific cpu.
>Your original patch handles only the "DETACH_EXIT" case.
>What I mean here is the case where the event is "closed".
>In this case, based on your patch, it doesn't call perf_event_set_state()
>before list_del_event(), but perf_event_set_state() is called after list_del_event().
Do you mean that in this case the event is not passed to perf_event_exit_event()?
As I understand it, whenever an event reaches perf_event_exit_event(), the DETACH_EXIT flag is always set.
perf_event_exit_event()
---> perf_remove_from_context(event, detach_flags | DETACH_EXIT); <---
---> __perf_remove_from_context
----> perf_event_set_state (DETACH_EXIT is always set in this call path)
----> list_del_event
So I am still confused: even with a cpu switch, the DETACH_EXIT flag is still set.
Could you explain it with a callchain?
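To pin down what I am asking, here is a small userspace toy in C of the ordering as I read it from this thread. It is only a sketch, not kernel code: the toy_* names, the trace strings, and the flag bit values are made up for illustration, and the ordering without DETACH_EXIT is taken from your description of my patch rather than from the source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit values for this toy; the kernel's definitions may differ. */
#define DETACH_GROUP 0x01UL
#define DETACH_CHILD 0x02UL
#define DETACH_EXIT  0x04UL

static char trace[128];

/* Append one step name to the trace, separated by spaces. */
static void record(const char *step)
{
	size_t used = strlen(trace);

	/* Room for an optional separator, the step name, and the NUL. */
	assert(used + strlen(step) + 2 <= sizeof(trace));
	if (used)
		strcat(trace, " ");
	strcat(trace, step);
}

/*
 * Toy stand-in for __perf_remove_from_context(). It models only the
 * ordering discussed in this thread for patch v1: with DETACH_EXIT the
 * state is set before list_del_event(), otherwise after it.
 */
static const char *toy_remove_from_context(unsigned long flags)
{
	trace[0] = '\0';
	if (flags & DETACH_EXIT)
		record("set_state");
	if (flags & DETACH_GROUP)
		record("group_detach");
	if (flags & DETACH_CHILD)
		record("child_detach");
	record("list_del");
	if (!(flags & DETACH_EXIT))
		record("set_state");
	return trace;
}

/* Toy stand-in for perf_event_exit_event(): always ORs in DETACH_EXIT. */
static const char *toy_exit_event(unsigned long detach_flags)
{
	return toy_remove_from_context(detach_flags | DETACH_EXIT);
}
```

In this toy, toy_exit_event(DETACH_CHILD) records set_state before child_detach and list_del, while a call that lacks DETACH_EXIT, such as toy_remove_from_context(DETACH_GROUP), records set_state only after list_del. If the close path you describe is the second kind, a callchain showing where it enters without DETACH_EXIT is what I am missing.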
Thanks
David
>
>Thanks
>
>--
>Sincerely,
>Yeoreum Yun