Re: [PATCH v3 12/17] sched: Adapt sched tracepoints for RV task model

From: Peter Zijlstra
Date: Wed Jul 16 2025 - 11:32:20 EST


On Wed, Jul 16, 2025 at 04:38:36PM +0200, Gabriele Monaco wrote:

> So as you said, we can still reconstruct what happened from the trace, but the
> model suddenly needs more states and more events.

So given a sequence like:

trace_sched_enter_tp();
{ trace_irq_disable();
**irq_entry();**
**irq_exit();**
trace_irq_enable(); } * Ni
trace_irq_disable();
{ trace_sched_switch(); } * Nj
trace_irq_enable();
{ trace_irq_disable();
**irq_entry();**
**irq_exit();**
trace_irq_enable(); } * Nk
trace_sched_exit_tp();

It becomes somewhat hard to figure out which exact IRQ disabled section
the switch did not happen in (Nj == 0).
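
To illustrate the point, here is a rough user-space sketch (not kernel code)
that walks one sched_enter..sched_exit window and looks for the IRQ-disabled
section containing the switch. The event ids and the switch_section() helper
are hypothetical names made up for this example; the trace only carries
irq_disable/irq_enable, so every section looks the same to the walker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event ids for this sketch; these are not kernel definitions. */
enum ev {
	EV_SCHED_ENTER,
	EV_IRQ_DISABLE,
	EV_IRQ_ENABLE,
	EV_SCHED_SWITCH,
	EV_SCHED_EXIT,
};

/*
 * Walk one sched_enter..sched_exit window. Return the 0-based index of the
 * first IRQ-disabled section that contained a sched_switch, or -1 if no
 * section did (the Nj == 0 case). *nr_sections receives the total number of
 * IRQ-disabled sections seen in the window.
 */
static int switch_section(const enum ev *trace, size_t len, int *nr_sections)
{
	int section = -1, found = -1, in_irqoff = 0;
	size_t i;

	assert(trace && nr_sections);

	for (i = 0; i < len; i++) {
		switch (trace[i]) {
		case EV_IRQ_DISABLE:
			in_irqoff = 1;
			section++;
			break;
		case EV_IRQ_ENABLE:
			in_irqoff = 0;
			break;
		case EV_SCHED_SWITCH:
			if (in_irqoff && found < 0)
				found = section;
			break;
		default:
			break;
		}
	}

	*nr_sections = section + 1;
	return found;
}
```

With Nj == 0 the walker sees Ni + 1 + Nk sections and returns -1; nothing in
the events says which of them was the scheduler's own IRQ-disabled section
and which were interrupts.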

> If we could directly tell whether interrupts were disabled manually or from an
> actual interrupt, that wouldn't be necessary, for instance (as in the original
> model by Daniel).

Hmm.. we do indeed appear to trace the IRQ state before adding
HARDIRQ_OFFSET to preempt_count(). Yes, that complicates things a
little.

So... it *might* be possible to lift lockdep_hardirq_enter() to before
we start tracing. But then you're stuck running with lockdep
enabled -- I'm thinking that's not ideal, given those other patches you
sent.

I'm going to go on holidays soon, but I've made a note to see if we can
lift setting HARDIRQ_OFFSET before we start tracing. IIRC the current
order is because setting HARDIRQ_OFFSET is using preempt_count_add()
which can be instrumented itself.

But we could use __preempt_count_add() instead; then we lose the
tracing from setting HARDIRQ_OFFSET, but I don't think that is a
problem. We already get the latency from the IRQ tracepoints after all.
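
A small user-space model of the ordering question, as a sketch only. All
model_* names, the pc variable and the two irq_enter_*() helpers are invented
for illustration, and the HARDIRQ_SHIFT value of 16 is an assumption about
the usual preempt_count layout. It just shows what a tracepoint handler could
observe depending on whether the offset is added before or after tracing.

```c
#include <assert.h>

/* Assumed layout for this sketch: hardirq count sits above bit 16. */
#define HARDIRQ_SHIFT	16
#define HARDIRQ_OFFSET	(1U << HARDIRQ_SHIFT)
#define HARDIRQ_MASK	(0xfU << HARDIRQ_SHIFT)

static unsigned int pc;		/* stand-in for the per-CPU preempt count */
static unsigned int nr_traced;	/* how often the instrumented add fired */

/* Raw add, modelling the uninstrumented variant: no tracing side effects. */
static void model_raw_add(unsigned int val)
{
	pc += val;
}

/* Instrumented add, modelling a variant that can itself be traced. */
static void model_traced_add(unsigned int val)
{
	nr_traced++;
	model_raw_add(val);
}

/* What a tracepoint handler could check at the time it runs. */
static int model_in_hardirq(void)
{
	return !!(pc & HARDIRQ_MASK);
}

/* Trace first, then set the offset: the handler does not see hardirq. */
static int irq_enter_trace_first(void)
{
	int seen = model_in_hardirq();	/* the IRQ-state tracepoint would fire here */

	model_traced_add(HARDIRQ_OFFSET);
	return seen;
}

/* Set the offset with the raw add, then trace: the handler sees hardirq. */
static int irq_enter_offset_first(void)
{
	model_raw_add(HARDIRQ_OFFSET);
	return model_in_hardirq();	/* the IRQ-state tracepoint would fire here */
}

static void model_irq_exit(void)
{
	assert(pc & HARDIRQ_MASK);
	pc -= HARDIRQ_OFFSET;
}
```

In this model irq_enter_trace_first() reports 0 and bumps nr_traced, while
irq_enter_offset_first() reports 1 and leaves nr_traced alone, which is the
trade-off described above: the handler can tell it is in an interrupt, at the
cost of not tracing the offset update itself.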

> I get your point why we don't really need the additional tracepoint, but some
> arguments giving more context come almost for free.

Right. So please always try and justify adding tracepoints.