Re: [PATCH v10 06/14] unwind_user/deferred: Add deferred unwinding interface
From: Peter Zijlstra
Date: Thu Jun 19 2025 - 03:50:42 EST
On Wed, Jun 18, 2025 at 03:09:15PM -0400, Steven Rostedt wrote:
> On Wed, 18 Jun 2025 20:46:20 +0200
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > > +struct unwind_work;
> > > +
> > > +typedef void (*unwind_callback_t)(struct unwind_work *work, struct unwind_stacktrace *trace, u64 timestamp);
> > > +
> > > +struct unwind_work {
> > > + struct list_head list;
> >
> > Does this really need to be a list? Single linked list like
> > callback_head not good enough?
>
> Doesn't a list head make it easier to remove without having to iterate the
> list?
Yeah, but why would you ever want to remove it? You asked for an unwind,
you get an unwind, no?
> > > static __always_inline void unwind_exit_to_user_mode(void)
> > > {
> > > if (unlikely(current->unwind_info.cache))
> > > current->unwind_info.cache->nr_entries = 0;
> > > + current->unwind_info.timestamp = 0;
> >
> > Surely clearing that timestamp is only relevant when there is a cache
> > around? Better to not add this unconditional write to the exit path.
>
> That's actually not quite true. If the allocation fails, we still want to
> clear the timestamp. But later patches add more data to check and it does
> exit out if there's been no requests:
Well, you could put in an error value on alloc fail or somesuch. Then
its non-zero.
> But for better reviewing, I could add a comment in this patch that states
> that this will eventually exit out early when it does more work.
You're making this really hard to review, why not do it right from the
get-go?
> > > +/* Guards adding to and reading the list of callbacks */
> > > +static DEFINE_MUTEX(callback_mutex);
> > > +static LIST_HEAD(callbacks);
> >
> > Global state.. smells like failure.
>
> Yes, the unwind infrastructure is global, as it is the way tasks know what
> tracer's callbacks to call.
Well, that's apparently how you've set it up. I don't immediately see
this has to be like this.
And there's no comments no nothing.
I don't see why you can't have something like:
struct unwind_work {
struct callback_head task_work;
void *data;
void (*func)(struct unwind_work *work, void *data);
};
void unwind_task_work_func(struct callback_head *task_work)
{
struct unwind_work *uw = container_of(task_work, struct unwind_work, task_work);
// do actual unwind
uw->func(uw, uw->data);
}
or something along those lines. No global state involved.
> > > + guard(mutex)(&callback_mutex);
> > > + list_for_each_entry(work, &callbacks, list) {
> > > + work->func(work, &trace, timestamp);
> > > + }
> >
> > So now you're globally serializing all return-to-user instances. How is
> > that not a problem?
>
> It was the original way we did things. The next patch changes this to SRCU.
> But it requires a bit more care. For breaking up the series, I preferred
> not to add that logic and make it a separate patch.
>
> For better reviewing, I'll add a comment here that says:
>
> /* TODO switch this global lock to SRCU */
Oh ffs :-(
So splitting up patches is for ease of review, but now you're making
splits that make review harder, how does that make sense?