Re: [RFC 1/2] perf core: Add PERF_COUNT_SW_CGROUP_SWITCHES event

From: Peter Zijlstra
Date: Thu Dec 03 2020 - 02:46:26 EST


On Thu, Dec 03, 2020 at 11:10:30AM +0900, Namhyung Kim wrote:
> On Thu, Dec 3, 2020 at 1:19 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 9a38f579bc76..5eb284819ee5 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -1174,25 +1174,19 @@ DECLARE_PER_CPU(struct pt_regs, __perf_regs[4]);
> >  * which is guaranteed by us not actually scheduling inside other swevents
> >  * because those disable preemption.
> >  */
> > -static __always_inline void
> > -perf_sw_event_sched(u32 event_id, u64 nr, u64 addr)
> > +static __always_inline void __perf_sw_event_sched(u32 event_id, u64 nr, u64 addr)
>
> It'd be nice to avoid the __ prefix if possible.

Not having __ would seem to suggest it's a function of generic utility.
Still, *shrug* ;-)

> > [...]
> > -static inline int perf_sw_migrate_enabled(void)
> > +static __always_inline bool __perf_sw_enabled(int swevt)
> > {
> > -	if (static_key_false(&perf_swevent_enabled[PERF_COUNT_SW_CPU_MIGRATIONS]))
> > -		return true;
> > -	return false;
> > +	return static_key_false(&perf_swevent_enabled[swevt]);
> > }
> >
> > static inline void perf_event_task_migrate(struct task_struct *task)
> > @@ -1207,11 +1201,9 @@ static inline void perf_event_task_sched_in(struct task_struct *prev,
> > 	if (static_branch_unlikely(&perf_sched_events))
> > 		__perf_event_task_sched_in(prev, task);
> >
> > -	if (perf_sw_migrate_enabled() && task->sched_migrated) {
> > -		struct pt_regs *regs = this_cpu_ptr(&__perf_regs[0]);
> > -
> > -		perf_fetch_caller_regs(regs);
> > -		___perf_sw_event(PERF_COUNT_SW_CPU_MIGRATIONS, 1, regs, 0);
> > +	if (__perf_sw_enabled(PERF_COUNT_SW_CPU_MIGRATIONS) &&
> > +	    task->sched_migrated) {
>
> It seems task->sched_migrated is set only if the event is enabled,
> so can we just check the value here?

Why suffer the unconditional load and test? Your L1 too big?
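
Spelled out (sketch only, not the patch itself): the static branch
compiles to a patched NOP when the event is disabled, so the ordering
matters:

	/* static branch first: disabled case is a NOP, no memory access */
	if (__perf_sw_enabled(PERF_COUNT_SW_CPU_MIGRATIONS) &&
	    task->sched_migrated) {
		...
	}

	/* flag first: every sched-in pays the load+test, enabled or not */
	if (task->sched_migrated &&
	    __perf_sw_enabled(PERF_COUNT_SW_CPU_MIGRATIONS)) {
		...
	}

With the static branch first, the common (disabled) case never loads
the flag at all.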

> > +		__perf_sw_event_sched(PERF_COUNT_SW_CPU_MIGRATIONS, 1, 0);
> > 		task->sched_migrated = 0;
> > 	}
> > }
> > @@ -1219,7 +1211,13 @@ static inline void perf_event_task_sched_in(struct task_struct *prev,
> > static inline void perf_event_task_sched_out(struct task_struct *prev,
> > 					     struct task_struct *next)
> > {
> > -	perf_sw_event_sched(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 0);
> > +	if (__perf_sw_enabled(PERF_COUNT_SW_CONTEXT_SWITCHES))
> > +		__perf_sw_event_sched(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 0);
> > +
> > +	if (__perf_sw_enabled(PERF_COUNT_SW_CGROUP_SWITCHES) &&
> > +	    (task_css_check(prev, perf_event_cgrp_id, 1)->cgroup !=
> > +	     task_css_check(next, perf_event_cgrp_id, 1)->cgroup))
> > +		__perf_sw_event_sched(PERF_COUNT_SW_CGROUP_SWITCHES, 1, 0);
>
> I'm not clear about the RCU protection here. Is it OK to access
> the task's css_set directly?

We're here with preemption and IRQs disabled; good luck getting RCU to
consider that not a critical section and spirit things away from under
us.
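
For reference, task_css_check() expands to roughly this (from
include/linux/cgroup.h, modulo kernel version):

	#define task_css_set_check(task, __c)				\
		rcu_dereference_check((task)->cgroups,			\
			lockdep_is_held(&cgroup_mutex) ||		\
			lockdep_is_held(&css_set_lock) ||		\
			((task)->flags & PF_EXITING) || (__c))

	#define task_css_check(task, subsys_id, __c)			\
		task_css_set_check((task), (__c))->subsys[(subsys_id)]

so the literal 1 only short-circuits the lockdep condition; the actual
guarantee is that preemption being disabled makes this an RCU(-sched)
read-side critical section, so the css_set cannot be freed from under
us.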