Re: Attempted summary of suspend-blockers LKML thread

From: kevin granade
Date: Thu Aug 05 2010 - 16:51:44 EST


On Thu, Aug 5, 2010 at 3:31 PM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Aug 05, 2010 at 01:13:31PM -0500, kevin granade wrote:
>> On Thu, Aug 5, 2010 at 10:46 AM,  <david@xxxxxxx> wrote:
>> > On Thu, 5 Aug 2010, Paul E. McKenney wrote:
>> >
>> >> On Wed, Aug 04, 2010 at 10:18:40PM -0700, david@xxxxxxx wrote:
>> >>>
>> >>> On Wed, 4 Aug 2010, Paul E. McKenney wrote:
>> >>>>
>> >>>> On Wed, Aug 04, 2010 at 05:25:53PM -0700, david@xxxxxxx wrote:
>> >>>>>
>> >>>>> On Wed, 4 Aug 2010, Paul E. McKenney wrote:
>> >>
>> >> [ . . . ]
>> >>
>> >>>>>> The music player is an interesting example.  It would be idle most
>> >>>>>> of the time, given that audio output doesn't consume very much CPU.
>> >>>>>> So you would not want to suspend the system just because there were
>> >>>>>> no runnable processes.  In contrast, allowing the music player to
>> >>>>>> hold a wake lock lets the system know that it would not be appropriate
>> >>>>>> to suspend.
>> >>>>>>
>> >>>>>> Or am I misunderstanding what you are proposing?
>> >>>>>
>> >>>>> the system would need to be idle for 'long enough' (configurable)
>> >>>>> before deciding to suspend, so as long as 'long enough' is longer
>> >>>>> than the music player is idle this would not be a problem.
>> >>>>
>> >>>> From a user standpoint, having the music player tell the system when
>> >>>> it is OK to suspend (e.g., when the user has paused playback) seems
>> >>>> a lot nicer than having configurable timeouts that need tweaking.
>> >>>
>> >>> every system that I have seen has a configurable "sleep if it's idle
>> >>> for this long" knob. On the iphone (work issue, I didn't want it)
>> >>> that I am currently using it can be configured from 1 min to 5 min.
>> >>>
>> >>> this is the sort of timeout I am talking about.
>> >>>
>> >>> with something in the multi-minute range for the 'do a full suspend'
>> >>> doing a wakeup every few 10s of seconds is perfectly safe.
>> >>
>> >> Ah, I was assuming -much- shorter "do full suspend" timeouts.
>> >>
>> >> My (possibly incorrect) assumption is based on the complaint that led
>> >> to my implementing RCU_FAST_NO_HZ.  A (non-Android) embedded person was
>> >> quite annoyed (to put it mildly) at the earlier version of RCU because
>> >> it prevented the system from entering the power-saving dyntick-idle mode,
>> >> not for minutes, or even for seconds, but for a handful of -milliseconds-.
>> >> This was my first hint that "energy efficiency" means something completely
>> >> different in embedded systems than it does in the servers that I am
>> >> used to.
>> >>
>> >> But I must defer to the Android guys on this -- who knows, perhaps
>> >> multi-minute delays to enter full-suspend mode are OK for them.
>> >
>> > if the system was looking at all applications I would agree that the timeout
>> > should be much shorter.
>> >
>> > I have a couple devices that are able to have the display usable, even if
>> > the CPU is asleep (the OLPC and the Kindle, two different display
>> > technologies). With these devices I would like to see the suspend happen so
>> > fast that it can suspend between keystrokes.
>> >
>> > however, in the case of Android I think the timeouts have to end up being
>> > _much_ longer. Otherwise you have the problem of loading an untrusted book
>> > reader app on the device and the device suspends while you are reading the
>> > page.
>> >
>> > currently Android works around this by having a wakelock held whenever the
>> > display is on. This seems backwards to me: the display should be on because
>> > the system is not suspended, rather than the system being prevented from
>> > suspending because the display is on.
>> >
>> > Rather than having the display be on causing a wakelock to be held (with the
>> > code that controls the display having a timeout for how long it leaves
>> > the display on), I would invert this and have the timeout be based on system
>> > activity, and when it decides the system is not active, turn off the display
>> > (along with other things as it suspends)
>>
>> IIRC, this was a major point of their (Android's) power management
>> policy.  User input of any kind would reset the "display active"
>> timeout, which is the primary thing keeping random untrusted
>> user-facing programs from being suspended while in use.  They seemed
>> to consider this to be a special case in their policy, but from the
>> kernel's point of view it is just another suspend blocker being held.
>>
>> I'm not sure this is the best use case to look at though, because
>> since it is user-facing, the timeout durations are on a different
>> scale than the ones they are really worried about.  I think another
>> category of use case that they are worried about is:
>>
>> (in suspend) -> wakeup due to network -> process network activity -> suspend
>>
>> or an example that has been mentioned previously:
>>
>> (in suspend) -> wakeup due to alarm for audio processing -> process
>> batch of audio -> suspend
>>
>> In both of these cases, the display may never power on (phone might
>> beep to indicate txt message or email, audio just keeps playing), so
>> the magnitude of the "timeout" for suspending again should be very
>> small.  Specifically, they don't want there to be a timeout at all, so
>> as little time as possible is spent out of suspend in addition to
>> the time required to handle the event that caused wakeup.
>
> It would be good to get some sort of range for the "timeout".  In the
> audio-output case, my understanding is that the spacing between bursts of
> audio-processing activity is measured in some hundreds of milliseconds,
> in which case one would want the delays until suspend to be on the
> millisecond scale.  But does Android really suspend between bursts of
> audio processing while playing music?  Very cool if so!  ;-)

Oops, yeah, that's actually a really bad example; that's probably
something that would be handled by low-power states. I think the
incoming text message example is a good one though. There seemed to
be a focus on user-interaction time scales, and I wanted to point out
that there are also very short time scales to consider.
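
To put rough numbers on the text-message case, here is a toy Python
sketch. It is not kernel code and not Android's implementation: the
SuspendBlockers class, both function names, and the millisecond figures
are made up purely for illustration. It compares how long the system
stays awake after a wakeup under an idle-timeout policy versus a
blocker that is held only while the event is handled.

```python
from dataclasses import dataclass, field


@dataclass
class SuspendBlockers:
    """Toy user-space stand-in for suspend blockers (not the kernel API)."""
    held: set = field(default_factory=set)

    def acquire(self, name: str) -> None:
        self.held.add(name)

    def release(self, name: str) -> None:
        self.held.discard(name)

    def may_suspend(self) -> bool:
        # Suspend is allowed only when nobody holds a blocker.
        return not self.held


def awake_ms_with_timeout(work_ms: int, idle_timeout_ms: int) -> int:
    """Timeout policy: do the work, then wait out the idle timer."""
    return work_ms + idle_timeout_ms


def awake_ms_with_blocker(work_ms: int) -> int:
    """Blocker policy: hold a blocker only while handling the event."""
    blockers = SuspendBlockers()
    blockers.acquire("incoming-sms")   # taken when the wakeup event arrives
    awake = work_ms                    # time spent processing the message
    blockers.release("incoming-sms")   # handler is finished
    # Nothing is held any more, so suspend can happen immediately.
    assert blockers.may_suspend()
    return awake


# Hypothetical figures: 50 ms to handle the message, 1 minute idle timeout.
WORK_MS = 50
IDLE_TIMEOUT_MS = 60_000

print(awake_ms_with_timeout(WORK_MS, IDLE_TIMEOUT_MS))
print(awake_ms_with_blocker(WORK_MS))
```

With a user-interaction sized timeout, the awake time is dominated by
the timeout rather than by the work, which is the overhead I was trying
to point at.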

*back to lurking*
Kevin

>
>                                                        Thanx, Paul
>
>> >>>>>>> if the backlight being on holds the wakelock, it would seem that
>> >>>>>>> almost every other use of the wakelock could (and probably should)
>> >>>>>>> be replaced by something that tickles the display to stay on longer.
>> >>>>>>
>> >>>>>> The problem with this approach is that the display consumes quite a
>> >>>>>> bit of power, so you don't want to leave it on unnecessarily.  So if
>> >>>>>> the system is doing something (for example, playing music) that does
>> >>>>>> not require the display, you really want the display to be off.
>> >>>>>
>> >>>>> what percentage (and types) of apps are really useful with the
>> >>>>> display off? I think that there are relatively few apps that you
>> >>>>> really want to keep running if the display is off.
>> >>>>
>> >>>> The length of time those apps are running is the governing factor
>> >>>> for battery life, and not the number of such apps, right?
>> >>>
>> >>> correct, but the number of such apps indicates the scope of the problem.
>> >>
>> >> The number of such apps certainly indicates the amount of effort required
>> >> to modify them, if required.  Is that what you are getting at?
>> >
>> > yes.
>> >
>> >>>> From another e-mail tonight it sounds like almost everything
>> >>>> already talks
>> >>>
>> >>> to a userspace daemon, so if "(the power management service in the
>> >>> system_server, possibly the media_server and the radio interface
>> >>> glue)" (plus possibly some kernel activity) are the only things
>> >>> looked at when considering if it's safe to sleep or not, all of
>> >>> these can (or already do) do 'something' every few seconds, making
>> >>> this problem sound significantly smaller than it sounded like
>> >>> before.
>> >>>
>> >>> Android could even keep its user-space API between the system power
>> >>> daemon and the rest of userspace the same if they want to.
>> >>>
>> >>> over time, additional apps could be considered 'trusted' (or flagged
>> >>> that way by the user) and not have to interact with the power daemon
>> >>> to keep things alive.
>> >>
>> >> Hmmm...  Isn't it the "trusted" (AKA PM-driving) apps that interact with
>> >> the power daemon via suspend blockers, rather than the other way around?
>> >
>> > I was looking at it from a kernel point of view, "trusted" (AKA PM-driving)
>> > apps are ones that have permission to grab the wakelock. Any app/daemon that
>> > is so trusted can communicate with anything else in userspace as part of
>> > making its decision on when to take the wakelock, but those other
>> > applications would not qualify as "trusted" in my eyes.
>> >
>> >>> as for instrumentation, the key tool to use to see why a system isn't
>> >>> going to sleep would be powertop, just like on other linux systems.
>> >>
>> >> Powertop is indeed an extremely valuable tool, but I am not certain
>> >> that it really provides the information that the Android guys need.
>> >> If I understand Arve's and Brian's posts, here is the scenario that they
>> >> are trying to detect:
>> >>
>> >> o       Some PM-driving application has a bug in which it fails to
>> >>        release a wakelock, thus blocking suspend indefinitely.
>> >>
>> >> o       This PM-driving application, otherwise being a good citizen,
>> >>        blocks.
>> >>
>> >> o       There are numerous power-oblivious apps running, consuming
>> >>        significant CPU.
>> >>
>> >> What the Android developers need to know is that the trusted application
>> >> is wrongly holding a wakelock.  Won't powertop instead tell them about
>> >> all the power-oblivious apps?
>> >
>> > in my proposal (without a wakelock), powertop would tell you what
>> > applications are running and setting timers. If we can modify the
>> > kernel/suspend decision code to only look at processes in one cgroup when
>> > deciding if the system should go to sleep, a similar modification to
>> > powertop should let you only show stats on the "trusted" applications.
>> >
>> > If you have a userspace power management daemon that accepts requests from
>> > untrusted programs and does something to keep the system from sleeping
>> > (either taking a wakelock or setting a 'short' timer), it needs to keep the
>> > records of this itself because otherwise all the kernel will see (with
>> > either powertop or wakelock reporting) is that the power management daemon
>> > is what kept the system from sleeping.
>> >
>> > David Lang
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> > the body of a message to majordomo@xxxxxxxxxxxxxxx
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> > Please read the FAQ at  http://www.tux.org/lkml/
>> >
>