Re: [PATCH] - race-free suspend. Was: Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

From: Rafael J. Wysocki
Date: Wed Jun 02 2010 - 16:40:18 EST


On Wednesday 02 June 2010, Neil Brown wrote:
> On Tue, 1 Jun 2010 12:50:01 +0200 (CEST)
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> > On Tue, 1 Jun 2010, Neil Brown wrote:
> > >
> > > I think you have acknowledged that there is a race with suspend - thanks.
> > > Next step was "can it be closed".
> > > You seem to suggest that it can, but you describe it as a "work around"
> > > rather than a "bug fix"...
> > >
> > > Do you agree that the race is a "bug", and therefore it is appropriate to
> > > "fix" it assuming an acceptable fix can be found (which I think it can)?
> >
> > If we can fix it, yes we definitely should do and not work around it.
> >
> > Thanks,
> >
> > tglx
>
> OK.
> Here is my suggestion.
>
> While I think this patch would actually work, and hope the ugly aspects are
> reasonably balanced by the simplicity, I present it primarily as a base for
> improvement.
> The important part is to present how drivers and user-space can co-operate
> to avoid losing wake-events. The details of what happens in the kernel are
> certainly up for discussion (as is everything else really of course).
>
> The user-space suspend daemon avoids losing wake-events by using
> fcntl(F_OWNER) to ensure it gets a signal whenever any important wake-event
> is ready to be read by user-space. This may involve:
> - the one daemon processing all wake events
> - Both the suspend daemon and the main event handling daemon opening any
> given device that delivers wake events (this should work with input
> events ... unless grabbing is needed)
> - The event handling daemon giving the suspend-daemon's pid as F_OWNER, and
> using poll/select to get the events itself.
>
> When 'mem' is written to /sys/power/state, suspend_prepare waits in an
> interruptible wait until any wake-event that might have been initiated before
> the suspend was request, has had a chance to be queued for user-space and
> trigger kill_fasync.
> Currently this wait is a configurable time after the last wake-event was
> initiated. This is hackish, but simple and probably adequate.
> If more precise timing is needed and achievable, that can be added later. It
> would be an entirely internal change and would not affect the API further.
> Some of the code developed for suspend-blockers might be a starting point for
> this.
>
> Drivers should call pm_suspend_delay() whenever a wake-event occurs. This
> simply records the time so that the suspend process knows if there is in fact
> any need to wait at all.
>
> The delay to wait after the last pm_suspend_delay() is limited to 10 seconds
> and can be set by kernel parameter suspend_block_delay=number-of-milliseconds
> It defaults to 2 jiffies (which is possibly too short).
>
> - Would this fix the "bug"??
> - and address the issues that suspend-blockers was created to address?
> - or are the requirements on user-space too onerous?

In theory wakeup events can also happen after wait_for_blockers() has returned
0 and I guess we should rollback the suspend in such cases.

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/