Re: [PATCH 2/2] timers: retpoline mitigation for time funcs

From: Thomas Gleixner
Date: Mon Apr 11 2022 - 05:21:54 EST


Pete,

On Sun, Apr 10 2022 at 14:14, Thomas Gleixner wrote:
> On Fri, Feb 18 2022 at 14:18, Pete Swain wrote:
>> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
>> index 003ccf338d20..ac15412e87c4 100644
>> --- a/kernel/time/clockevents.c
>> +++ b/kernel/time/clockevents.c
>> @@ -245,7 +245,8 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
>>
>> dev->retries++;
>> clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
>> - if (dev->set_next_event((unsigned long) clc, dev) == 0)
>> + if (INDIRECT_CALL_1(dev->set_next_event, lapic_next_deadline,
>> + (unsigned long) clc, dev) == 0)
>
> No. We are not sprinkling x86'isms into generic code.
>
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -190,7 +190,7 @@ static inline u64 tk_clock_read(const struct tk_read_base *tkr)
>> {
>> struct clocksource *clock = READ_ONCE(tkr->clock);
>>
>> - return clock->read(clock);
>> + return INDIRECT_CALL_1(clock->read, read_tsc, clock);
>
> Again. No X86 muck here.

Just for clarification. I have absolutely no interest in opening this
can of worms. The indirect call is in general more expensive on some
architectures, so when we grant this for x86, then the next architecture
comes around the corner and wants the same treatment. Guess how well
that works and how maintainable this is.

And no, you can't just go there and have a

#define arch_read_favourite_clocksource read_foo

because we have multiplatform kernels where the clocksource is selected
at runtime which means every platform wants to be the one defining this.

You can do that in your own frankenkernels, but for mainline this is not
going to fly. You have to come up with something smarter than just
taking your profiling results and slapping INDIRECT_CALL_*() all over
the place. INDIRECT_CALL is a hack as it leaves the conditional around
even if retpoline gets patched out.

The kernel has mechanisms to do better than that.

Let's look at tk_clock_read(). tkr->clock changes usually once maybe
twice during boot. Until shutdown it might change when e.g. TSC becomes
unstable and there are a few other cases like suspend/resume. But none
of these events are hotpath.

While we have several instances of tk_read_base, all of them have the
same clock pointer, except for the rare case of suspend/resume. It's
redundant storage for various reasons.

So with some thought this can be implemented with static_call() which is
currently supported by x86 and powerpc32, but there is no reason why
it can't be supported by other architectures. INDIRECT_CALL is a x86'ism
with a dependency on RETPOLINE, which will never gain traction outside
of x86.

Thanks,

tglx