Re: [PATCH v3] perf: Avoid undefined behavior from stopping/starting inactive events

From: Liang, Kan
Date: Tue Aug 12 2025 - 19:52:05 EST




On 2025-08-12 11:10 a.m., Yunseong Kim wrote:
> Calling pmu->start()/stop() on perf events in PERF_EVENT_STATE_OFF can
> leave event->hw.idx at -1. When PMU drivers later attempt to use this
> negative index as a shift exponent in bitwise operations, it leads to UBSAN
> shift-out-of-bounds reports.
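>
> A minimal illustration of the shift pattern in question (not actual
> driver code; pmu_counter_mask() is a made-up name):
>
>     static inline u32 pmu_counter_mask(int idx)
>     {
>             /* With idx == -1 (event never scheduled onto the PMU) the
>              * shift exponent is negative: undefined behaviour, which
>              * UBSAN reports as shift-out-of-bounds.
>              */
>             return 1u << idx;
>     }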
>
> The issue is a logical flaw in how event groups handle throttling when some
> members are intentionally disabled. Based on the analysis and the
> reproducer provided by Mark Rutland, the issue reproduces on both arm64
> and x86-64.
>
> The scenario unfolds as follows:
>
> 1. A group leader event is configured with a very aggressive sampling
> period (e.g., sample_period = 1). This causes frequent interrupts and
> triggers the throttling mechanism.
> 2. A child event in the same group is created in a disabled state
> (.disabled = 1). This event remains in PERF_EVENT_STATE_OFF.
> Since it hasn't been scheduled onto the PMU, its event->hw.idx remains
> initialized at -1. (A minimal sketch of this setup follows the list.)
> 3. When throttling occurs, perf_event_throttle_group() and later
> perf_event_unthrottle_group() iterate through all siblings, including
> the disabled child event.
> 4. perf_event_throttle()/unthrottle() are called on this inactive child
> event, which then call event->pmu->start()/stop().
> 5. The PMU driver receives the event with hw.idx == -1 and attempts to
> use it as a shift exponent. e.g., in macros like PMCNTENSET(idx),
> leading to the UBSAN report.
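>
> To make steps 1 and 2 concrete, a minimal userspace sketch of the
> triggering setup (hypothetical event types and loop count, error
> handling omitted; the actual reproducer was generated by syzkaller):
>
>     #include <linux/perf_event.h>
>     #include <string.h>
>     #include <sys/syscall.h>
>     #include <unistd.h>
>
>     static int perf_open(struct perf_event_attr *attr, int group_fd)
>     {
>             /* pid = 0 (this task), cpu = -1 (any CPU), flags = 0 */
>             return syscall(__NR_perf_event_open, attr, 0, -1, group_fd, 0);
>     }
>
>     int main(void)
>     {
>             struct perf_event_attr leader, child;
>             int lfd, cfd;
>
>             /* Step 1: group leader with an aggressive period, so it
>              * overflows on every event and is throttled quickly. */
>             memset(&leader, 0, sizeof(leader));
>             leader.size = sizeof(leader);
>             leader.type = PERF_TYPE_HARDWARE;
>             leader.config = PERF_COUNT_HW_CPU_CYCLES;
>             leader.sample_period = 1;
>
>             /* Step 2: sibling created disabled; it stays in
>              * PERF_EVENT_STATE_OFF and its hw.idx stays at -1. */
>             memset(&child, 0, sizeof(child));
>             child.size = sizeof(child);
>             child.type = PERF_TYPE_HARDWARE;
>             child.config = PERF_COUNT_HW_INSTRUCTIONS;
>             child.disabled = 1;
>
>             lfd = perf_open(&leader, -1);
>             cfd = perf_open(&child, lfd);   /* join the leader's group */
>
>             /* Burn cycles so the leader overflows and is throttled. */
>             for (volatile unsigned long i = 0; i < 100000000UL; i++)
>                     ;
>
>             close(cfd);
>             close(lfd);
>             return 0;
>     }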
>
> The throttling mechanism attempts to start/stop events that are not
> actively scheduled on the hardware.
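>
> For reference, the group throttle path from steps 3 and 4, paraphrased
> and simplified (not the verbatim kernel source), shows why the disabled
> sibling is reached at all:
>
>     static void perf_event_throttle_group(struct perf_event *event)
>     {
>             struct perf_event *sibling, *leader = event->group_leader;
>
>             /* Every group member is visited with no check of
>              * sibling->state, so an OFF sibling reaches
>              * perf_event_throttle() and thus pmu->stop(). */
>             perf_event_throttle(leader);
>             for_each_sibling_event(sibling, leader)
>                     perf_event_throttle(sibling);
>     }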
>
> Move the state check into perf_event_throttle()/perf_event_unthrottle() so
> that inactive events are skipped entirely. This ensures only active events
> with a valid hw.idx are processed, preventing undefined behavior and
> silencing UBSAN warnings. The check confirms the event is in
> PERF_EVENT_STATE_ACTIVE before proceeding with any PMU operation.
>
> The problem can be reproduced with the syzkaller reproducer:
> Link: https://lore.kernel.org/lkml/714b7ba2-693e-42e4-bce4-feef2a5e7613@xxxxxxxxxxx/
>
> Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> Signed-off-by: Yunseong Kim <ysk@xxxxxxxxxxx>

Thanks for the fix.

Reviewed-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>

Thanks,
Kan

> ---
> kernel/events/core.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8060c2857bb2..872122e074e5 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2665,6 +2665,9 @@ static void perf_log_itrace_start(struct perf_event *event);
>
> static void perf_event_unthrottle(struct perf_event *event, bool start)
> {
> + if (event->state != PERF_EVENT_STATE_ACTIVE)
> + return;
> +
> event->hw.interrupts = 0;
> if (start)
> event->pmu->start(event, 0);
> @@ -2674,6 +2677,9 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>
> static void perf_event_throttle(struct perf_event *event)
> {
> + if (event->state != PERF_EVENT_STATE_ACTIVE)
> + return;
> +
> event->hw.interrupts = MAX_INTERRUPTS;
> event->pmu->stop(event, 0);
> if (event == event->group_leader)