Re: [PATCH v5 1/7] trace: Add trace_ipi_send_cpumask()

From: Valentin Schneider
Date: Wed Mar 22 2023 - 07:26:35 EST


On 22/03/23 11:30, Peter Zijlstra wrote:
> On Wed, Mar 22, 2023 at 10:39:55AM +0100, Peter Zijlstra wrote:
>> On Tue, Mar 07, 2023 at 02:35:52PM +0000, Valentin Schneider wrote:
>> > +TRACE_EVENT(ipi_send_cpumask,
>> > +
>> > + TP_PROTO(const struct cpumask *cpumask, unsigned long callsite, void *callback),
>> > +
>> > + TP_ARGS(cpumask, callsite, callback),
>> > +
>> > + TP_STRUCT__entry(
>> > + __cpumask(cpumask)
>> > + __field(void *, callsite)
>> > + __field(void *, callback)
>> > + ),
>> > +
>> > + TP_fast_assign(
>> > + __assign_cpumask(cpumask, cpumask_bits(cpumask));
>> > + __entry->callsite = (void *)callsite;
>> > + __entry->callback = callback;
>> > + ),
>> > +
>> > + TP_printk("cpumask=%s callsite=%pS callback=%pS",
>> > + __get_cpumask(cpumask), __entry->callsite, __entry->callback)
>> > +);
>>
>> Would it make sense to add a variant like: ipi_send_cpu() that records a
>> single cpu instead of a cpumask. A lot of sites seem to do:
>> cpumask_of(cpu) for that first argument, and it seems to me it is quite
>> daft to have to memcpy a full multi-word cpumask in those cases.
>>
>> Remember, nr_possible_cpus > 64 is quite common these days.
>
> Something a little bit like so...
>

I was wondering whether we could stick with a single trace event, but let
ftrace be aware of weight=1 vs weight>1 cpumasks.

For weight>1, it would memcpy() as usual; for weight=1, it could write a
pointer to a cpu_bit_bitmap[] equivalent embedded in the trace itself.

Unfortunately, ftrace bitmasks are represented as a u32 made of two 16-bit
values: [offset in event record, size], so there isn't a straightforward
way to point to a "reusable" cpumask. AFAICT the only alternative would be
to do that via a different trace event, but then we should just go with a
plain old uint - i.e. do what you're doing here, so:

Tested-and-reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx>

(with the tiny typo fix below)
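
For reference, a rough userspace sketch of the [offset, size] encoding
described above. It assumes the offset sits in the low 16 bits and the size
in the high 16 bits, which is my reading of the dynamic-array macros; the
helper names are made up for illustration and are not kernel API:

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel API): pack and unpack a dynamic-field
 * descriptor as a u32 made of two 16-bit values.
 * Assumption: offset in the low half, size in the high half.
 */
static uint32_t data_loc_pack(uint16_t offset, uint16_t size)
{
	return (uint32_t)offset | ((uint32_t)size << 16);
}

/* Offset of the payload, relative to the start of the event record. */
static uint16_t data_loc_offset(uint32_t loc)
{
	return (uint16_t)(loc & 0xffff);
}

/* Size of the payload in bytes. */
static uint16_t data_loc_size(uint32_t loc)
{
	return (uint16_t)(loc >> 16);
}
```

Since the offset is relative to the record the descriptor lives in, there
is no way to express "the payload is that shared mask over there", hence
the conclusion above.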

> @@ -35,6 +35,28 @@ TRACE_EVENT(ipi_raise,
> TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
> );
>
> +TRACE_EVENT(ipi_send_cpu,
> +
> + TP_PROTO(const unsigned int cpu, unsigned long callsite, void *callback),
> +
> + TP_ARGS(cpu, callsite, callback),
> +
> + TP_STRUCT__entry(
> + __field(unsigned int, cpu)
> + __field(void *, callsite)
> + __field(void *, callback)
> + ),
> +
> + TP_fast_assign(
> + __entry->cpu = cpu;
> + __entry->callsite = (void *)callsite;
> + __entry->callback = callback;
> + ),
> +
> + TP_printk("cpu=%s callsite=%pS callback=%pS",
                    ^
                    s/s/u/

> + __entry->cpu, __entry->callsite, __entry->callback)
> +);
> +
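
As a back-of-the-envelope illustration of why the plain unsigned int is the
cheaper payload, here is a small userspace sketch of how many bytes a
cpumask copy would take for a given CPU count. cpumask_bytes() is a
hypothetical helper that mimics the rounding done by BITS_TO_LONGS(); it
assumes the event copies the whole mask rounded up to whole unsigned longs:

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not kernel API): bytes needed to store a bitmap of
 * nr_cpus bits, rounded up to a whole number of unsigned longs.
 */
static size_t cpumask_bytes(unsigned int nr_cpus)
{
	const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs = (nr_cpus + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}
```

Under that assumption, a 256-CPU mask costs 32 bytes per event, against
sizeof(unsigned int) for the cpu field of ipi_send_cpu.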