Re: [PATCH 0/3] Convert TSC to monotonic clock for PEBS

From: Liang, Kan
Date: Tue Jan 24 2023 - 10:09:21 EST




On 2023-01-24 1:13 a.m., John Stultz wrote:
> On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@xxxxxxxxxxxxxxx> wrote:
>>
>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>
>> A Processor Event Based Sampling (PEBS) record includes a field that
>> provides the time stamp counter value when the counter overflowed
>> and the PEBS record was generated. The accurate time stamp can be used
>> to reconcile user samples. However, the current PEBS code can only
>> convert the time stamp to sched_clock, which is not available from user
>> space. A solution to convert a given TSC to user visible monotonic
>> clock is required.
>>
>> The perf_event subsystem only converts the TSC in an NMI handler. The
>> converter function must be fast and NMI safe.
>>
>> We considered the two existing functions below, but neither of them
>> fulfills the above requirements.
>> - The ktime_get_mono_fast_ns() is NMI safe, but it can only return the
>> current monotonic clock rather than the monotonic time for a given TSC.
>> - The get_device_system_crosststamp() can calculate the system time from
>> a given device time, but it is neither fast nor NMI safe.
>
> So, apologies if this is a silly question (my brain quickly evicts the
> details on get_device_system_crosststamp every time I look at it), but
> rather than introducing a new interface, what would it take to rework
> the existing get_device_system_crosststamp() logic to be usable for
> both use cases?
>

I once tried to rework the existing get_device_system_crosststamp(), but
I eventually gave up, because:
- The existing function is already very complex. Adding a new case would
make it even more complex and harder to maintain.
- Perf doesn't need all of the logic in the existing function. For example,
the history is not required. (I think it is not a problem for perf if we
cannot get values for some corner cases. The worst case for perf is to
fall back to the time captured in the NMI handler. It's not very
accurate, but it should be acceptable.) Performance is the top
priority. We want a function with much simpler logic.
- If I understand correctly, we have already introduced several dedicated
functions for fast NMI access, e.g., ktime_get_mono_fast_ns(). I think
we can follow the same idea.
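
To illustrate the "simpler logic plus fallback" idea above, here is a
rough user-space sketch. It is not the code from the patch set. The
struct tsc_snapshot, tsc_to_mono_ns() and pebs_sample_time_ns() names are
hypothetical, and the sketch assumes the snapshot is already consistent
(the NMI-safe synchronization with the timekeeping update is left out).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the conversion parameters (illustration only). */
struct tsc_snapshot {
	uint64_t base_tsc;	/* TSC value when the snapshot was taken */
	uint64_t base_ns;	/* monotonic time (ns) at base_tsc */
	uint32_t mult;		/* cycles-to-ns multiplier */
	uint32_t shift;		/* cycles-to-ns shift */
};

/*
 * Convert a given TSC value to monotonic ns using only the latest
 * snapshot. No history is kept, so corner cases simply fail and the
 * caller is expected to fall back.
 */
static bool tsc_to_mono_ns(const struct tsc_snapshot *s, uint64_t tsc,
			   uint64_t *ns)
{
	uint64_t delta;

	/* The TSC predates the snapshot: would need history, so give up. */
	if (tsc < s->base_tsc)
		return false;

	delta = tsc - s->base_tsc;

	/* The multiplication would overflow 64 bits: give up as well. */
	if (s->mult != 0 && delta > UINT64_MAX / s->mult)
		return false;

	*ns = s->base_ns + ((delta * s->mult) >> s->shift);
	return true;
}

/*
 * Pick the time for a PEBS sample: the converted TSC when possible,
 * otherwise the (less accurate) time captured in the NMI handler.
 */
static uint64_t pebs_sample_time_ns(const struct tsc_snapshot *s,
				    uint64_t pebs_tsc, uint64_t nmi_now_ns)
{
	uint64_t ns;

	if (tsc_to_mono_ns(s, pebs_tsc, &ns))
		return ns;

	return nmi_now_ns;
}
```

For example, with base_tsc = 1000, base_ns = 5000, mult = 1 and
shift = 1 (two cycles per ns), a PEBS TSC of 1200 converts to 5100 ns,
while a PEBS TSC of 900 predates the snapshot and falls back to the NMI
handler time.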


Thanks,
Kan