Re: [PATCH] perf/core: Fix the same task check in perf_event_set_output

From: Liang, Kan
Date: Wed Mar 22 2023 - 15:17:19 EST




On 2023-03-22 10:15 a.m., Adrian Hunter wrote:
> On 22/03/23 15:42, Peter Zijlstra wrote:
>> On Wed, Mar 22, 2023 at 12:59:28PM +0200, Adrian Hunter wrote:
>>> On 11/07/22 21:07, kan.liang@xxxxxxxxxxxxxxx wrote:
>>>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>>
>>>> With the --per-thread option, perf record errors out when sampling with
>>>> a hardware event and a software event as below.
>>>>
>>>> $ perf record -e cycles,dummy --per-thread ls
>>>> failed to mmap with 22 (Invalid argument)
>>>>
>>>> The same task is sampled with the two events. The IOC_OUTPUT is invoked
>>>> to share the mmap memory of the task between the events. In the
>>>> perf_event_set_output(), the event->ctx is used to check whether the
>>>> two events are attached to the same task. However, a hardware event and
>>>> a software event are from different task context. The check always
>>>> fails.
>>>>
>>>> The task struct is stored in the event->hw.target for each per-thread
>>>> event. It can be used to determine whether two events are attached to
>>>> the same task.
>>>>
>>>> The patch can also fix another issue reported months ago.
>>>> https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@xxxxxxxxx/
>>>> The event->ctx is not ready when the perf_event_set_output() is invoked
>>>> in the perf_event_open(), while the event->hw.target has been assigned
>>>> at the moment.
>>>>
>>>> The problem should be a long time issue since commit c3f00c70276d
>>>> ("perf: Separate find_get_context() from event initialization"). The
>>>> event->hw.target doesn't exist at that time. Here, the patch which
>>>> introduces the event->hw.target is used by the Fixes tag.
>>>>
>>>> The problem should still exists between the broken patch and the
>>>> event->hw.target patch. This patch does not intend to fix that case.
>>>>
>>>> Fixes: 50f16a8bf9d7 ("perf: Remove type specific target pointers")
>>>> Reviewed-by: Zhengjun Xing <zhengjun.xing@xxxxxxxxxxxxxxx>
>>>> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>
>>> Did this slip through the cracks, or is there more complexity
>>> to this case than just sharing the rb?
>>
>> Both; I very much missed it, but looking at it now, I'm not at all sure
>> it is correct prior to the whole context rewrite we did recently.
>>
>> So after the rewrite every cpu/task only has a single
>> perf_event_context, and your change below is actually an equivalence.
>>

Right, they are equivalent after the rewrite. I cannot reproduce the
"failed to mmap with 22 (Invalid argument)" issue with 6.2 and later
anymore.

But the issue reported by Nadav* can still be reproduced with the latest
6.3-rc.*
https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@xxxxxxxxx/
Because the event->ctx is still not set when the perf_event_set_output()
is invoked
in the perf_event_open(). The rewrite doesn't change with it.


>> But prior to that a task could have multiple contexts. Now they got
>> co-scheduled most of the times and it will probably work, but I'm not
>> entirely sure.
>>
>> So how about we change the Fixes tag to something like:
>>
>> Fixes: c3f00c70276d ("perf: Separate find_get_context() from event initialization") # >= v6.2
>>

Not sure if we want the # >= v6.2.

For v6.2 and later, the patch can fixes Nadav's issue.

For v6.1 and earlier, the patch fixes both Nadav's issue and the issue
described in the description.

I think I can send a V2 patch to make the history clear in the description.

Thanks,
Kan

>> And anybody that wants to back-port this further gets to either do the
>> full audit and/or keep the pieces.
>>
>> Hmm?
>
> Seems reasonable to me. Kan?
>
>>
>>>> ---
>>>> kernel/events/core.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>>>> index b4d62210c3e5..22df79d3f19d 100644
>>>> --- a/kernel/events/core.c
>>>> +++ b/kernel/events/core.c
>>>> @@ -12080,7 +12080,7 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
>>>> /*
>>>> * If its not a per-cpu rb, it must be the same task.
>>>> */
>>>> - if (output_event->cpu == -1 && output_event->ctx != event->ctx)
>>>> + if (output_event->cpu == -1 && output_event->hw.target != event->hw.target)
>>>> goto out;
>>>>
>>>> /*
>>>
>