Re: [PATCH 1/3] timekeeping: NMI safe converter from a given time to monotonic

From: Liang, Kan
Date: Tue Jan 24 2023 - 10:09:24 EST




On 2023-01-24 2:01 a.m., John Stultz wrote:
> On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@xxxxxxxxxxxxxxx> wrote:
>> +int notrace get_mono_fast_from_given_time(int (*get_time_fn)
>> +					  (struct system_counterval_t *sys_counterval,
>> +					   void *ctx),
>> +					  void *ctx,
>> +					  u64 *mono_ns)
>> +{
>> +	struct system_counterval_t system_counterval;
>> +	struct tk_fast *tkf = &tk_fast_mono;
>> +	u64 cycles, now, interval_start;
>> +	struct tk_read_base *tkr;
>> +	unsigned int seq;
>> +	int ret;
>> +
>> +	do {
>> +		seq = raw_read_seqcount_latch(&tkf->seq);
>> +		tkr = tkf->base + (seq & 0x01);
>> +
>> +		ret = get_time_fn(&system_counterval, ctx);
>> +		if (ret)
>> +			return ret;
>> +
>> +		/*
>> +		 * Verify that the clocksource associated with the given
>> +		 * timestamp is the same as the currently installed
>> +		 * timekeeper clocksource
>> +		 */
>> +		if (tkr->clock != system_counterval.cs)
>> +			return -EOPNOTSUPP;
>> +		cycles = system_counterval.cycles;
>> +
>> +		/*
>> +		 * Check whether the given timestamp is on the current
>> +		 * timekeeping interval.
>> +		 */
>> +		now = tk_clock_read(tkr);
>> +		interval_start = tkr->cycle_last;
>> +		if (!cycle_between(interval_start, cycles, now))
>> +			return -EOPNOTSUPP;
>
> So. I've not fully thought this out, but it seems like it would be
> quite likely that you'd run into the case where the cycle_last value
> is updated and your earlier TSC timestamp isn't valid for the current
> interval. The get_device_system_crosststamp() logic has a big chunk of
> complex code to try to handle this case by interpolating the cycle
> value back in time. How well does just failing in this case work out?
>

In that case, perf falls back to the time captured in the NMI handler, via
ktime_get_mono_fast_ns().

The TSC in PEBS is captured by HW when the sample is generated, so there
is a small delta between it and the time captured later in the NMI
handler. I think that delta is acceptable as a backup solution for most
analysis cases. Also, I don't think this case (the cycle_last value is
updated during the monitoring) should occur very often. So I dropped the
history support to simplify the function.
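
To make the fallback concrete, here is a minimal user-space sketch of the
idea, not the actual perf or timekeeping code. cycle_between() is modeled
on the kernel helper of the same name; struct tk_interval,
hw_cycles_to_mono() and pick_sample_time() are hypothetical names used
only for illustration, and the mult/shift conversion is a simplified
assumption that ignores overflow and the seqcount latch retry loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one timekeeping interval (illustration only). */
struct tk_interval {
	uint64_t cycle_last;	/* counter value at the start of the interval */
	uint64_t now;		/* counter value read at conversion time */
	uint32_t mult;		/* cycles -> ns multiplier */
	uint32_t shift;		/* cycles -> ns shift */
	uint64_t base_ns;	/* monotonic time at cycle_last */
};

/* True if 'test' lies strictly between 'before' and 'after', allowing
 * for a counter that wrapped between 'before' and 'after'. */
static bool cycle_between(uint64_t before, uint64_t test, uint64_t after)
{
	if (test > before && test < after)
		return true;
	if (test < before && before > after)
		return true;
	return false;
}

/* Convert a HW-captured counter value to monotonic ns. Fails (returns
 * false) when the value is not inside the current interval, i.e. no
 * interpolation back into history is attempted. */
static bool hw_cycles_to_mono(const struct tk_interval *iv, uint64_t cycles,
			      uint64_t *mono_ns)
{
	assert(iv != NULL && mono_ns != NULL);

	if (!cycle_between(iv->cycle_last, cycles, iv->now))
		return false;

	*mono_ns = iv->base_ns +
		   (((cycles - iv->cycle_last) * iv->mult) >> iv->shift);
	return true;
}

/* Prefer the HW timestamp; otherwise fall back to the time taken in the
 * NMI handler (ktime_get_mono_fast_ns() in the real code). */
static uint64_t pick_sample_time(const struct tk_interval *iv,
				 uint64_t hw_cycles, uint64_t nmi_time_ns)
{
	uint64_t mono_ns;

	if (hw_cycles_to_mono(iv, hw_cycles, &mono_ns))
		return mono_ns;
	return nmi_time_ns;
}
```

With cycle_last = 1000 and now = 2000, a HW timestamp of 1500 cycles is
converted directly, while a timestamp of 900 cycles (taken before the
interval started) silently uses the NMI handler time instead.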

Thanks,
Kan