Re: Timekeeping issue on aggressive suspend/resume

From: john stultz
Date: Mon Jun 14 2010 - 15:26:20 EST


On Mon, 2010-06-14 at 00:46 -0700, Suresh Rajashekara wrote:
> On Thu, Jun 10, 2010 at 12:52 PM, john stultz <johnstul@xxxxxxxxxx> wrote:
> > I think Thomas was suggesting that you consider creating a option for
> > where CLOCK_MONOTONIC included total_sleep_time.
> >
> > In that case the *hack* (and this is a hack, we'll need some more
> > thoughtful discussion before anything like it could make it upstream)
> > would be in timekeeping_resume() to comment out the lines that update
> > wall_to_monotonic and total_sleep_time.
> >
> > It would be interesting to hear if that hack works for you, and we can
> > try to come up with a better way to think about how to accommodate both
> > views of how to account time over suspend.
>
> Thanks.
>
> I tried this fix. It seemed to help, however the accuracy of sleep
> time for the process was not quite right. A process thread which was
> supposed to wake up every (X) seconds, seemed to wake up every (X -
> delta X) seconds.

Ah, the sleep time is probably too coarse (seconds). We probably need to
increase the granularity of the value returned by read_persistent_clock()
and see if that helps (although most persistent clocks aren't very fine
grained).
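
To make the granularity point concrete, here is a rough userspace sketch
(not kernel code). It assumes a persistent clock that only reports whole
seconds; the helper names and the timestamps are made up for illustration.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000LL

/* Hypothetical helper: what a seconds-only persistent clock would report
 * for a given "true" time in nanoseconds (fraction is simply dropped). */
static int64_t coarse_read_sec(int64_t true_ns)
{
    return true_ns / SKETCH_NSEC_PER_SEC;
}

/* Sleep time as computed from two coarse readings, one taken at suspend
 * and one at resume. */
static int64_t measured_sleep_ns(int64_t suspend_ns, int64_t resume_ns)
{
    return (coarse_read_sec(resume_ns) - coarse_read_sec(suspend_ns))
           * SKETCH_NSEC_PER_SEC;
}

/* Error of the coarse measurement relative to the real sleep length.
 * Positive means we credited too much sleep, negative too little. */
static int64_t sleep_error_ns(int64_t suspend_ns, int64_t resume_ns)
{
    return measured_sleep_ns(suspend_ns, resume_ns)
           - (resume_ns - suspend_ns);
}
```

With these assumptions, a 0.5 second suspend from t=10.9s to t=11.4s is
measured as a full second, while the same length suspend from t=10.1s to
t=10.6s is measured as zero. So each cycle can be off by up to a second in
either direction, which would be enough to disturb periodic wakeups.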

> Also another side effect of this change was that the system time was
> no longer in sync with the wall time.

? This doesn't make much sense to me, as you shouldn't be manipulating
xtime differently.

Just to be clear, you mean the value from "date" doesn't match your
watch after resume?

> These problems were more pronounced when the suspend/wakeup cycle time
> was brought down to 0.5 seconds from 4 seconds. The periodicity of
> most of the process threads were disturbed.
>
> I decided to NOT suspend/resume the timekeeping subsystem in the
> kernel and try. It seemed to work. Every application seems to work
> fine.
>
> Now my question is; Is it safe to disable suspend/resume of the
> timekeeping subsystem? Will it have an effect (on
> functionality/performance) which may not surface in my short
> experiments?

Well, the difficulty here is what folks actually mean by suspend. On
some hardware it means everything is powered off, and so on resume we
have to re-init hardware values.

It seems in your case that the hardware isn't completely powered off,
since the clocksource you're using seemed to continue counting while the
system was suspended.

So in this case you might be ok. Your suspend seems closer to a deep
idle state on x86. So suspending timekeeping might not be necessary.

However, you're right that there may be lurking issues:

1) The suspend time would have to be limited to the clocksource's
max_idle_ns value, since after that many cycles have passed, we might
overflow the accumulation function, or the clocksource may have wrapped.

2) If the hardware does reset the clocksource at some point during the
suspend, you'll have odd time issues.

3) You could run into some difficulty keeping close sync with an NTP
server, as the long delays between accumulation will probably cause an
oscillating over-shoot and over-correction.
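
For point 1, a simplified sketch of where such a limit comes from. This is
not the kernel's implementation: the struct and function names below are
hypothetical, and it only assumes the usual (cycles * mult) >> shift
conversion from cycles to nanoseconds.

```c
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for a clocksource description.
 * It is NOT the kernel's struct clocksource. */
struct cs_sketch {
    uint64_t mask;   /* largest counter value before the counter wraps */
    uint32_t mult;   /* cycles -> ns multiplier (must be nonzero) */
    uint32_t shift;  /* cycles -> ns right shift */
};

/* Convert a cycle delta to nanoseconds: (cycles * mult) >> shift. */
static uint64_t cs_cyc2ns(const struct cs_sketch *cs, uint64_t cycles)
{
    return (cycles * cs->mult) >> cs->shift;
}

/* Longest interval we can go without accumulating:
 *  - cycles * mult must still fit in 64 bits, and
 *  - the counter must not have wrapped past its mask. */
static uint64_t cs_max_idle_ns(const struct cs_sketch *cs)
{
    uint64_t max_cycles = UINT64_MAX / cs->mult;

    if (max_cycles > cs->mask)
        max_cycles = cs->mask;

    return cs_cyc2ns(cs, max_cycles);
}
```

As an example under these assumptions, a 32-bit counter ticking at 1 MHz
(mult = 1000, shift = 0) is limited by the wrap and gives roughly 4295
seconds. A suspend longer than that without accumulation would lose time.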

I suspect these different definitions of "suspend" across all of the
different hardware types out there are going to be a growing problem in
the near term, especially as deep idle states start to power off more
hardware and become closer to suspend in behavior.

thanks
-john
