Re: [PATCH] timers: Fix up get_target_base() to use old base properly

From: Thomas Gleixner
Date: Mon Jun 17 2019 - 08:12:45 EST


Peter,

On Mon, 17 Jun 2019, Peter Xu wrote:
> On Mon, Jun 17, 2019 at 08:09:20AM +0200, Thomas Gleixner wrote:
> > You might argue that in case of an explicit pinned timer, the above logic
> > is wrong when the timer is modified as it might move to a different
> > CPU. But from day one when the pinned logic was introduced, pinned just
> > prevents it from being queued on a remote CPU. If you need a timer to stay
> > on a particular CPU even if modified from a remote CPU, then the only way
> > right now is to dequeue and requeue it with add_timer_on().
>
> Indeed. If add_timer_on() should always be used when with pinned
> timers, IMHO it would be good to comment probably above TIMER_PINNED
> about the fact so people will never misuse the interfaces (it seems to
> be mis-used somehow but I cannot be 100% sure, please see below).

Yeah, some documentation would be good.

> > If we really want to change that, then we need to audit all usage sites of
> > pinned timers and figure out whether this would break anything.
> >
> > The proper change would be in that case:
> >
> > return pinned ? base : get_timer_this_cpu_base(tflags);
>
> Purely for curiousity - why would we like to use current cpu base even
> if it's unpinned? My humble opinion is that if we use base directly
> at least we can avoid potential migration of the timer. But I can be
> missing some real reason here...

In most cases it's desired to move the timer over. Assume you have a
network interrupt moving from one cpu to the other and then the tcp timers
would stay on the old cpu forever. So you'd pay the remote access price
every time you touch it and if it fires the callback is pretty much
guaranteed to be cache cold.

> Though, I see two outliers:
>
> ======================
>
> *** drivers/cpufreq/powernv-cpufreq.c:
> powernv_cpufreq_cpu_init[867] TIMER_PINNED | TIMER_DEFERRABLE);
>
> *** net/ipv4/inet_timewait_sock.c:
> inet_twsk_alloc[189] timer_setup(&tw->tw_timer, tw_timer_handler, TIMER_PINNED);

That's fine. It just wants to make sure that the timer is not queued on a
remote CPU if NOHZ is active. That gives them a serialization guarantee of
the network softirq vs. the timer softirq so they can spare some locking
stuff.

Thanks,

tglx