Re: [RFC PATCH 0/5] Add support for S3 non-stop TSC support.

From: John Stultz
Date: Tue Jan 22 2013 - 15:22:18 EST


On 01/22/2013 11:57 AM, Jason Gunthorpe wrote:
On Mon, Jan 21, 2013 at 10:46:31AM -0800, John Stultz wrote:
What I'd propose is that we break the read_persistent_clock()
functionality up. So we need two interfaces:
1) An interface to access a time value we used to initialize the
system's CLOCK_REALTIME time.
2) An interface to measure the length of suspend.
Interface #1 could be possibly just replaced with the RTCTOSYS
functionality. Although the downside there is that for some time at
bootup between the timekeeping_init() function running (prior to
interrupts being enabled) and the RTC driver being available (after
interrupts are enabled), where we'd have an incorrect system clock.
So we may want to preserve something like the existing
read_persistent_clock() interface, but as Jason suggested, we could
push that access into the RTC driver itself.
How big of an issue is this? Could the RTCTOSYS function be moved to
the moment the RTC driver is registered rather than using a
late_initcall?

It may not be huge. Most early boot items are going to be CLOCK_MONOTONIC based, which would be unaffected. So that's a potential solution, but I'm hesitant to claim there'd be no side effects.


Interface #2 could then be either RTC based, or countinuous counter
based. Since we still want to do this measurement with interrupts
off, we still would need that interrupt-free RTC method like
read_persistent_clock() where supported (falling back to the RTC
driver's suspend/resume handler to try to fix things up as best it
can if that's not available).
Could the counter version of this be bundled into the clocksource
framework? It already has generic APIs for handling cycle counters and
things. Isn't there a TSC driver for clocksource already? Is all that
is missing is a way to tell if the counter survived suspend?

So without *major* rework, I'd rather not do this. Again, the clocksource code has quite a few assumptions built in that are optimized for timekeeping (where we avoid overflows by expecting relatively frequent updates), and very different approaches are needed for something like suspend (where valid suspend times could be potentially months to years).

That said, given ARM's use of clocksources for sched_clock, there may be some reasonable claim to splitting the clocksource mult/shift selection management away from the clocksource itself (instead having it managed by the timekeeping core). That would allow other subsystems to use the clocksource accessor function, managing their own mult/shift selection trade-offs independently.

But such an effort would be of substantial size, given how invasive it would be. So I'm not sure this is likely to happen in the near term without a dedicated effort.

So I think the smaller effort of splitting up the read_persistent_clock() and making it more reasonably handle counters like this S3 non-stop TSC and the 32k counter is a more reasonable mid-step (especially since there will still be the same logical division from a timekeeping perspective between runtime clocksources and suspend-measuring clocksources if we were to do the major overhaul eventually).


clocksource already has suspend/resume callbacks stuff, so the counter
driver could sense if the sleep was too deep and mark itself as
invalid.
But at that point you've lost time. If this was all centrally controlled, you have to know before hand what the bounds would be. With the TSC, we know it won't wrap around our starting measurement for at least 10 years. That's a reasonable range for suspend. We don't want to resume and just get a "oh, bad call, you picked the wrong clocksource for such a long suspend", and really without the clocksource checking with the RTC I don't think it can even know if its been too long itself (since maybe the counter wrapped, but maybe not).

This would help solve the problem on ARM with muxing persistent clock
on multi-platform.

A RTC device flag 'readable with interrupts off' still seems like a
good idea for the RTC case..

Yea, I think this point you're probably right, as I'm warming to the idea of pushing the existing read_persistent_clock() into the rtc device code. Just need to make sure not initializing CLOCK_REALTIME at timekeeping_init isn't going to have negative effects.

thanks
-john


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/