Re: [PATCH] atmel_tc clocksource/clockevent code

From: David Brownell
Date: Wed Mar 05 2008 - 08:06:25 EST


On Wednesday 05 March 2008, Remy Bohmer wrote:
> Hello David,
>
> > Could you elaborate on where that 50-100 usec gets spent?
>
> Attached is a screendump from my ETM debugger. It shows the
> complete flow of kernel function calls that happens on a timer
> interrupt. In this example the complete sequence takes about 154 us.

Thanks -- this is quite informative. (Presumably it'd look similar
with NO_HZ too: hardly any of that overhead is hardware-specific.)

An ETM trace is really nice for this kind of analysis; it's a shame
such technology isn't more widely available! (It's built into most
ARM cores and all that ... but the hardware and software tools
needed to get at the data are much harder to come by.)


> Notice that the ETM is non-intrusive, and that the times in this
> trace are real and accurate. (You can even see the effects of the
> CPU caches; sometimes the same code just runs faster.)

Yeah, the intrusive schemes (like automatic probe insertion) perturb
timings at this level.


> > Does the same issue happen with the $SUBJECT patch (if you tweak the
> > clocksource ratings to use its clockevents on rm9200)?
>
> Not tested yet, but I will generate a trace for it and post it later.

Based on how little of that time was spent in the rm9200 clockevent
code -- I'll be generous and call it 10 usec -- I can't imagine it
making much of a real difference.


> There is more to it than just the genIRQ mechanism. The softirqs are
> kicked, the scheduler is triggered and so on. It is a waterfall of
> events that happen, just by having a timer interrupt.

Right.


> > Should the min_delta_ns be increased in at91rm9200_time.c then?
>
> Maybe it should be configurable for these kinds of CPUs?

It shouldn't require tweaking individual clockevent devices, and
IMO it shouldn't be specific to e.g. lower-powered CPUs ... but a
global min_delta_ns would be easy to implement, and might help.
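Something like this, say -- an untested sketch; the variable name is
invented, and how it gets set (sysctl, boot parameter, ...) is left
open. clockevents_program_event() already clamps the programmed
delta against the per-device limits, so a global floor would slot
in right next to them:

	/* invented: global floor for clockevent programming,
	 * 0 == disabled
	 */
	static int64_t global_min_delta_ns;

	/* in clockevents_program_event(), beside the existing
	 * per-device clamps:
	 */
	delta = min_t(int64_t, delta, (int64_t) dev->max_delta_ns);
	delta = max_t(int64_t, delta, (int64_t) dev->min_delta_ns);
	delta = max_t(int64_t, delta, global_min_delta_ns);	/* new */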

That'd resemble what the init_timer_deferrable() mechanism
achieves, except that the scale for bunching timers would be
fine-grained rather than coarse.
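For comparison, the deferrable flavor looks like this from the
driver side (minimal sketch, names are mine). Note that it only
avoids waking an idle CPU and is still jiffies-granular; it doesn't
bound how closely one-shot events may be programmed:

	#include <linux/jiffies.h>
	#include <linux/timer.h>

	static struct timer_list my_timer;

	static void my_timer_fn(unsigned long data)
	{
		/* work that can tolerate being deferred until
		 * the CPU wakes up for some other reason
		 */
	}

	static void my_timer_setup(void)
	{
		init_timer_deferrable(&my_timer);
		my_timer.function = my_timer_fn;
		my_timer.data = 0;
		my_timer.expires = jiffies + HZ;	/* ~1 second out */
		add_timer(&my_timer);
	}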


> Notice that I also fell into this pitfall while using HRT, when all
> I wanted was an application with an accurate 1 ms timer... Other
> processes/daemons in the system also use timers, which eventually
> resulted in intervals in the sub-millisecond range; due to the
> overhead that brings to the system, the CPU load just went
> sky-high while actually doing nothing special.

In your case, maybe a global min_delta_ns of 1000 * 1000 (one
millisecond) would help ... combined with NO_HZ, you'd get the
accuracy you need with reduced scheduling overhead. Sound about
right?
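For the application side I'm picturing the usual absolute-deadline
loop, something like this sketch (plain POSIX, nothing AT91-specific;
link with -lrt):

	#include <time.h>

	/* ~1 ms periodic loop; the absolute CLOCK_MONOTONIC
	 * deadline keeps the period from drifting by however
	 * long the work takes each pass.
	 */
	static void loop_1ms(void)
	{
		struct timespec next;

		clock_gettime(CLOCK_MONOTONIC, &next);
		for (;;) {
			next.tv_nsec += 1000000;	/* + 1 ms */
			if (next.tv_nsec >= 1000000000) {
				next.tv_nsec -= 1000000000;
				next.tv_sec++;
			}
			clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
					&next, NULL);
			/* ... the 1 ms work goes here ... */
		}
	}

With a 1 ms global floor, several such timers in different processes
would get bunched into the same clockevent instead of each forcing
its own sub-millisecond reprogram.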

- Dave


> So, hires timestamps -> really, really welcome.
> hires timers -> there should be a (configurable) minimal resolution
> that fits the hardware, so as not to overload the CPU.
>
> > Right now, as you probably recall, it's at the lowest value
> > needed for correctness: a smidgeon over two ticks (~ 72 nsec).
>
> I remember...
>
> Kind Regards,
>
> Remy
>

