Re: [PATCH RT 3/3] fix get_monotonic_cycles for latency tracer

From: john stultz
Date: Fri Aug 24 2007 - 14:31:21 EST


On Fri, 2007-08-24 at 13:57 -0400, Steven Rostedt wrote:
> The latency tracer on SMP was given crazy results. It was found that the
> get_monotonic_cycles that it uses was not returning a monotonic counter.
> The cause of this was that clock->cycles_raw and clock->cycles_last can
> be updated on another CPU and make the cycles_now variable out-of-date.
> So the delta that was calculated from cycles_now - cycles_last was
> incorrect.
>
> This patch adds a loop to make sure that the cycles_raw and cycles_last
> are consistent through out the calculation (otherwise it performs the
> loop again).
>
> With this patch the latency_tracer can produce normal results again.

Ah! good catch. I totally missed that get_monotonic_cycles was being
called outside of the xtime lock.


> Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
>
> Index: linux-2.6-rt/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6-rt.orig/kernel/time/timekeeping.c 2007-08-24 11:41:04.000000000 -0400
> +++ linux-2.6-rt/kernel/time/timekeeping.c 2007-08-24 11:47:01.000000000 -0400
> @@ -75,15 +75,30 @@ s64 __get_nsec_offset(void)
>
> cycle_t notrace get_monotonic_cycles(void)
> {
> - cycle_t cycle_now, cycle_delta;
> + cycle_t cycle_now, cycle_delta, cycle_raw, cycle_last;
>
> - /* read clocksource: */
> - cycle_now = clocksource_read(clock);
> + do {
> + /*
> + * cycle_raw and cycle_last can change on
> + * another CPU and we need the delta calculation
> + * of cycle_now and cycle_last happen atomic, as well
> + * as the adding to cycle_raw. We don't need to grab
> + * any locks, we just keep trying until get all the
> + * calculations together in one state.
> + */
> + cycle_raw = clock->cycle_raw;
> + cycle_last = clock->cycle_last;
> +
> + /* read clocksource: */
> + cycle_now = clocksource_read(clock);
> +
> + /* calculate the delta since the last update_wall_time: */
> + cycle_delta = (cycle_now - cycle_last) & clock->mask;
>
> - /* calculate the delta since the last update_wall_time: */
> - cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> + } while (cycle_raw != clock->cycle_raw ||
> + cycle_last != clock->cycle_last);

So if I'm understanding this right, not taking a lock isn't an
optimization (as the seq read lock code is almost the same), but a
requirement (as this might be called while xtime_lock is held),
correct?

Might want to clarify that a bit in the comment.


> - return clock->cycle_raw + cycle_delta;
> + return cycle_raw + cycle_delta;
> }
>
> unsigned long notrace cycles_to_usecs(cycle_t cycles)

Otherwise:
Acked-by: John Stultz <johnstul@xxxxxxxxxx>

thanks
-john


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/