Re: [RFC][PATCH 2/2] time: alarmtimer: Use TASK_FREEZABLE to cleanup freezer handling

From: John Stultz
Date: Mon Feb 27 2023 - 23:06:27 EST


On Mon, Feb 27, 2023 at 4:03 PM John Stultz <jstultz@xxxxxxxxxx> wrote:
> > On Mon, Feb 20 2023 at 19:11, Michael Nazzareno Trimarchi wrote:
> > +static int alarmtimer_pm_notifier_fn(struct notifier_block *bl, unsigned long state,
> > + void *unused)
> > +{
> > + switch (state) {
> > + case PM_HIBERNATION_PREPARE:
> > + case PM_POST_HIBERNATION:
> > + atomic_set(&alarmtimer_wakeup, 0);
> > + break;
> > + }
> > + return NOTIFY_DONE;
>
> But here, we're setting the alarmtimer_wakeup count to zero if we get
> PM_HIBERNATION_PREPARE or PM_POST_HIBERNATION notifications?
> And Michael noted we need to add PM_SUSPEND_PREPARE and
> PM_POST_SUSPEND there for this to seemingly work.
>
> This zeroing out the counter here feels a little sloppy, as it seems
> nothing prevents the notifier from racing with the other added logic.
>
> If the issue is that when we're expiring timers in alarmtimer_fire(),
> a suspend event may come in midway after the alarmtimer_dequeue(),
> while the timer is running but before the alarmtimer_enqueue(),
> causing recurring timers to not be re-armed, it seems we probably
> should do the accounting fully in alarmtimer_fire(), doing an
> atomic_dec(&alarmtimer_wakeup) at the end of that function. That way
> we avoid suspending while running an alarmtimer, so the recurring
> timers will always be back on the timer list when we do suspend.

Ah. Scratch that. I was testing with my rework of the patch and still
seeing trouble w/ Michael's reproducer.

My apologies, as I was mistaken that the race is in alarmtimer_fired
between the dequeue and the enqueue. It's between the
alarmtimer_fired() and the posixtimer_rearm() logic, where a suspend
inbetween could prevent the timer from being rearmed.

This is messier as we need to allow the timer to fire and re-arm
itself and block suspend happening inbetween, and since the re-arming
logic is happening at the posixtimers abstraction level above the
alarmtimers, its difficult to totally prevent that.

So Thomas's notifier method of zeroing at the begining of suspend and
tracking any wakeups after that point makes more sense now. It still
feels a bit messy, but I'm not sure there's something better.

My only thought is this feels a little bit like its mirroring what the
pm_wakeup_event() logic is supposed to do. Should we be adding a
pm_wakeup_event() to alarmtimer_fired() to try to prevent suspend from
occuring for 500ms or so after an alarmtimer has fired so there is
enough time for it to be re-armed if needed?

thanks
-john