Re: [RFC][PATCH 1/2] time: Add ktime_get_mono_raw_fast_ns()

From: Peter Zijlstra
Date: Tue Mar 17 2015 - 07:25:14 EST


On Fri, Feb 20, 2015 at 11:49:49AM -0800, John Stultz wrote:
> On Fri, Feb 20, 2015 at 6:29 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > @@ -489,7 +512,12 @@ static void timekeeping_update(struct ti
> > memcpy(&shadow_timekeeper, &tk_core.timekeeper,
> > sizeof(tk_core.timekeeper));
> >
> > - update_fast_timekeeper(&tk->tkr);
> > + update_fast_timekeeper(&tk_fast_mono, &tk->tkr);
> > +
> > + tkr = tk->tkr;
> > + tkr.mult = tk->tkr.clock->mult;
> > + tkr.shift = tk->tkr.clock->shift;
> > + update_fast_timekeeper(&tk_fast_mono_raw, &tkr);
>
> So this is sort of sneaky and subtle here, which will surely cause
> problems later on. You're copying the original mult/shift pair into a
> copy of the tkr, so you get similar results from timekeeping_get_ns()
> as you'd want from timekeeping_get_ns_raw(). This results in multiple
> ways of getting the raw clock.
>
> I think it would be better to either add a new tkr structure for the
> raw clock in the timekeeper, so you can directly copy it over, or
> extend the tkr structure so it can contain the raw values as well.
>
> Also, there's no real reason to have fast/non-fast versions of the raw
> clock. Since it isn't affected by frequency changes, it can't have the
> inconsistency issues the monotonic clock can see (which are documented
> in the comment near ktime_get_mono_fast_ns()). So we can probably
> condense these and avoid duplicative code.

So I was looking at this again; and I think we do indeed need a second
way to read mono_raw.

So the immediate problem is the tk_core.seq loop around
timekeeping_get_ns_raw(); if the NMI happens during the write side of
that we're stuck.

Now the reason we need that seqcount is because of tk->tkr.cycle_last,
which, afaict, is the only bit that mono_raw needs that changes. With
the possible exception of the suspend/resume paths; we also need to deal
with NMIs requesting time during those.

And we need this silly cycle_last business because of short clocks,
which is something the generic code needs to deal with. And because we
have a bit of an error margin on that we can indeed use a 'stale'
cycle_last.

As to the above suggestions:

- we cannot add another tkr structure to struct timekeeper because that
would duplicate a number of fields, including the above mentioned
cycle_last, and having two of those is bound to cause confusion; and,

- we cannot extend tkr because that would make it overflow the
cacheline.

So as it is, I do not see any other way than the already proposed patch;
we even need the update from timekeeping_update() because we need an
up-to-date cycle_last.

So all I can really do is write a better changelog.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/