Re: [PATCH v2 3/3] X86: Add a thread cpu time implementation to vDSO

From: Andy Lutomirski
Date: Fri Dec 19 2014 - 12:54:01 EST


On Fri, Dec 19, 2014 at 9:42 AM, Chris Mason <clm@xxxxxx> wrote:
>
>
> On Fri, Dec 19, 2014 at 11:48 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> wrote:
>>
>> On Fri, Dec 19, 2014 at 3:23 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> wrote:
>>>
>>> On Thu, Dec 18, 2014 at 04:22:59PM -0800, Andy Lutomirski wrote:
>>>>
>>>> Bad news: this patch is incorrect, I think. Take a look at
>>>> update_rq_clock -- it does fancy things involving irq time and
>>>> paravirt steal time. So this patch could result in extremely
>>>> non-monotonic results.
>>>
>>>
>>> Yeah, I'm not sure how (and if) we could make all that work :/
>>
>>
>> I obviously can't comment on what Facebook needs, but if I were
>> rigging something up to profile my own code*, I'd want a count of
>> elapsed time, including user, system, and probably interrupt as well.
>> I would probably not want to count time during which I'm not
>> scheduled, and I would also probably not want to count steal time.
>> The latter makes any implementation kind of nasty.
>>
>> The API presumably doesn't need to be any particular clock id for
>> clock_gettime, and it may not even need to be clock_gettime at all.
>>
>> Is perf self-monitoring good enough for this? If not, can we make it
>> good enough?
>>
>> * I do this today using CLOCK_MONOTONIC
>
>
> The clock_gettime calls are used for a wide variety of things, but usually
> they are trying to instrument how much CPU the application is using. So for
> example with the HHVM interpreter they have a ratio of the number of hhvm
> instructions they were able to execute in N seconds of cputime. This gets
> used to optimize the HHVM implementation and can be used as a push blocking
> counter (code can't go in if it makes it slower).
>
> Wall time isn't a great representation of this because it includes factors
> that might be outside a given HHVM patch, but it sounds like we're saying
> almost the same thing.
>
> I'm not familiar with the perf self monitoring?

You can call perf_event_open and mmap the result. Then you can read
the docs^Wheader file.
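
For the archives, here is a rough sketch of what that looks like. It is
untested against any particular kernel and is only an illustration, not
code from the patch under discussion. The names struct self_counter,
self_counter_open, self_counter_read and self_counter_close are made up
for this example. It assumes reasonably recent uapi headers (ones that
define cap_user_rdpmc), and it sets exclude_kernel so that it has a chance
of working for unprivileged users, which means kernel time is not counted.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper type for this sketch; not a kernel or libc API. */
struct self_counter {
    int fd;
    volatile struct perf_event_mmap_page *pc;
    size_t len;
};

/* Open a counter on the calling thread and map its metadata page. */
static int self_counter_open(struct self_counter *c, uint32_t type,
                             uint64_t config)
{
    struct perf_event_attr attr;
    void *p;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.config = config;
    attr.exclude_kernel = 1;   /* assumption: unprivileged use */
    attr.exclude_hv = 1;

    /* glibc has no wrapper; pid = 0 and cpu = -1 mean "this thread, any CPU". */
    c->fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (c->fd < 0)
        return -1;

    /* One page is enough: only the metadata page, no ring buffer. */
    c->len = (size_t)sysconf(_SC_PAGESIZE);
    p = mmap(NULL, c->len, PROT_READ, MAP_SHARED, c->fd, 0);
    if (p == MAP_FAILED) {
        close(c->fd);
        c->fd = -1;
        return -1;
    }
    c->pc = p;
    return 0;
}

#if defined(__x86_64__) || defined(__i386__)
static inline uint64_t rdpmc(uint32_t counter)
{
    uint32_t lo, hi;

    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(counter));
    return ((uint64_t)hi << 32) | lo;
}
#endif

/* Read the current count; returns 0 on success, -1 on failure. */
static int self_counter_read(const struct self_counter *c, uint64_t *value)
{
#if defined(__x86_64__) || defined(__i386__)
    volatile struct perf_event_mmap_page *pc = c->pc;
    uint32_t seq, idx;
    uint64_t count;
    int fast;

    /* Seqlock-style retry loop, following the comments in perf_event.h. */
    do {
        seq = pc->lock;
        __sync_synchronize();
        idx = pc->index;
        count = (uint64_t)pc->offset;
        fast = pc->cap_user_rdpmc && idx != 0;
        if (fast) {
            unsigned int shift = 64u - pc->pmc_width;
            uint64_t raw = rdpmc(idx - 1);

            /* Sign-extend the pmc_width-bit value (arithmetic shift). */
            count += (uint64_t)((int64_t)(raw << shift) >> shift);
        }
        __sync_synchronize();
    } while (pc->lock != seq);

    if (fast) {
        *value = count;
        return 0;
    }
#endif
    /* Slow path: a plain read(2) returns one u64 with read_format == 0. */
    return read(c->fd, value, sizeof(*value)) ==
           (ssize_t)sizeof(*value) ? 0 : -1;
}

static void self_counter_close(struct self_counter *c)
{
    munmap((void *)c->pc, c->len);
    close(c->fd);
}
```

You would call self_counter_open(&c, PERF_TYPE_HARDWARE,
PERF_COUNT_HW_INSTRUCTIONS) once per thread and then self_counter_read()
around the region of interest. As far as I know, software events such as
PERF_COUNT_SW_TASK_CLOCK have no rdpmc index, so in this sketch they would
take the read(2) path instead of the syscall-free one.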

On the good side, it's an explicit mmap, so all the nasty preemption
issues are entirely moot. And you can count cache misses and such if
you want to be fancy.

On the bad side, the docs are a bit weak, and the added context switch
overhead might be higher.

--Andy

>
> -chris
--
Andy Lutomirski
AMA Capital Management, LLC