Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]

From: Ville Syrjälä
Date: Thu Oct 27 2016 - 15:21:44 EST


On Thu, Oct 27, 2016 at 08:48:57PM +0200, Thomas Gleixner wrote:
> On Thu, 27 Oct 2016, Ville Syrjälä wrote:
> > On Tue, Aug 09, 2016 at 08:20:57PM +0300, Ville Syrjälä wrote:
> > > On Thu, Jul 14, 2016 at 04:29:42PM +0800, Feng Tang wrote:
> > > > if you only want it to work, you can try an old patch
> > > > https://bugzilla.kernel.org/attachment.cgi?id=76071 from a similar bug
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=41932
> > > >
> > > > Alistair Buxton confirmed it works for 3.18 at least
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=107151#c16
> > >
> > > That patch is a bit too ripe by now. Would need a fresh squeezed one.
> >
> > Since no one else bothered to refresh the patch I did it myself:
> >
> > diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> > index f6aae7977824..d73d094a8972 100644
> > --- a/kernel/time/tick-broadcast.c
> > +++ b/kernel/time/tick-broadcast.c
> > @@ -657,8 +657,16 @@ static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
> >  	 * - There are pending events on sleeping CPUs which were not
> >  	 *   in the event mask
> >  	 */
> > -	if (next_event.tv64 != KTIME_MAX)
> > +	if (next_event.tv64 != KTIME_MAX) {
> > +		s64 delta = next_event.tv64 - now.tv64;
> > +
> > +		if (delta >= 10000000000) {
> > +			printk(KERN_CRIT "%s(): The delta is big: %lld\n", __func__, delta);
> > +			next_event.tv64 = now.tv64 + 3000000000;
> > +		}
> > +
> >  		tick_broadcast_set_event(dev, next_cpu, next_event);
> > +	}
> >
> >  	raw_spin_unlock(&tick_broadcast_lock);
> >
> > Unfortunately it doesn't do anything for me.
>
> And I'm not surprised, because the original patch forced a 5 second event
> in the broadcast device on resume, aside from limiting the reprogramming.
>
> What that old patch did, was:
>
> 1) Make sure that the broadcast device is actually armed at resume.
>
> That might cause the HPET to resume properly.
>
> 2) Force a max. 3 second rearm when the targeted expiry time is >= 10
> seconds away
>
> That might make sure that lower C-States are never entered.

Doh. I lost the other hunk somewhere. Let's try that again... And indeed
with the other hunk in tow the machine would appear to resume properly.

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index f6aae7977824..e2173aeeb00c 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -507,8 +507,12 @@ void tick_resume_broadcast(void)
 				tick_broadcast_start_periodic(bc);
 			break;
 		case TICKDEV_MODE_ONESHOT:
-			if (!cpumask_empty(tick_broadcast_mask))
+			if (!cpumask_empty(tick_broadcast_mask)) {
 				tick_resume_broadcast_oneshot(bc);
+				clockevents_program_event(bc,
+					ktime_add_ns(ktime_get(), 5 * NSEC_PER_SEC),
+					1);
+			}
 			break;
 		}
 	}
@@ -657,8 +661,16 @@ static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
 	 * - There are pending events on sleeping CPUs which were not
 	 *   in the event mask
 	 */
-	if (next_event.tv64 != KTIME_MAX)
+	if (next_event.tv64 != KTIME_MAX) {
+		s64 delta = next_event.tv64 - now.tv64;
+
+		if (delta >= 10000000000) {
+			printk(KERN_CRIT "%s(): The delta is big: %lld\n", __func__, delta);
+			next_event.tv64 = now.tv64 + 3000000000;
+		}
+
 		tick_broadcast_set_event(dev, next_cpu, next_event);
+	}

 	raw_spin_unlock(&tick_broadcast_lock);

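For anyone skimming the second hunk, the clamp it adds boils down to
roughly the following. This is only a userspace sketch, not kernel code:
clamp_next_event() and the two threshold macro names are made up for
illustration, and plain int64_t nanosecond values stand in for ktime_t.
Only the 10 s and 3 s values are taken from the patch. The KTIME_MAX
("nothing to program") case is left out.

```c
#include <stdint.h>

#define NSEC_PER_SEC    1000000000LL

/* Hypothetical names; the values mirror the literals used in the patch. */
#define BIG_DELTA_NS    (10 * NSEC_PER_SEC)  /* expiry this far out is "big" */
#define FORCED_REARM_NS (3 * NSEC_PER_SEC)   /* rearm this far out instead */

/*
 * Return the expiry time that would actually be programmed:
 * if the requested event is 10 s or more in the future, pull it
 * in to now + 3 s, otherwise leave it untouched.
 */
static int64_t clamp_next_event(int64_t now_ns, int64_t next_event_ns)
{
	int64_t delta = next_event_ns - now_ns;

	if (delta >= BIG_DELTA_NS)
		return now_ns + FORCED_REARM_NS;

	return next_event_ns;
}
```

So with now = 7 s and a requested expiry at 60 s, the sketch would
program 10 s instead, while anything less than 10 s away passes through
unchanged.
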
>
> > The fortunate thing is that acpi-idle has magically been fixed in the
> > meantime, so I can at least go back to using that one and have working
> > S3.
>
> What's the lowest C-State with acpi-idle and what's the lowest one with
> intel_idle?

acpi_idle:
/sys/devices/system/cpu/cpu0/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x30
/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:100
/sys/devices/system/cpu/cpu0/cpuidle/state3/name:C3
/sys/devices/system/cpu/cpu0/cpuidle/state3/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:200
/sys/devices/system/cpu/cpu0/cpuidle/state3/time:5677316
/sys/devices/system/cpu/cpu0/cpuidle/state3/usage:5920

intel_idle:
/sys/devices/system/cpu/cpu0/cpuidle/state3/desc:MWAIT 0x30
/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:100
/sys/devices/system/cpu/cpu0/cpuidle/state3/name:C4-ATM
/sys/devices/system/cpu/cpu0/cpuidle/state3/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:400
/sys/devices/system/cpu/cpu0/cpuidle/state3/time:7146705
/sys/devices/system/cpu/cpu0/cpuidle/state3/usage:6826

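In case someone wants to collect the same attributes programmatically
rather than with grep, a minimal sketch could look like this. The helper
names cpuidle_attr_path() and cpuidle_read_attr() are hypothetical; the
path layout is the one shown in the listings above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the sysfs path of one cpuidle attribute, e.g.
 * cpu 0, state 3, "name". Returns 0 on success, -1 if buf is too small.
 */
static int cpuidle_attr_path(char *buf, size_t len, int cpu, int state,
			     const char *attr)
{
	int n = snprintf(buf, len,
			 "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/%s",
			 cpu, state, attr);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Read the first line of the attribute into out, without the trailing
 * newline. Returns 0 on success, -1 if the file is missing or unreadable.
 */
static int cpuidle_read_attr(int cpu, int state, const char *attr,
			     char *out, size_t len)
{
	char path[128];
	FILE *f;

	if (cpuidle_attr_path(path, sizeof(path), cpu, state, attr) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

Calling cpuidle_read_attr(0, 3, "name", buf, sizeof(buf)) would read the
file that shows C3 or C4-ATM above, depending on the idle driver in use.
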
--
Ville Syrjälä
Intel OTC