Re: [PATCH V4 7/9] cpuidle/powernv: Add "Fast-Sleep" CPU idle state

From: Preeti U Murthy
Date: Mon Dec 02 2013 - 10:10:43 EST


Hi Thomas,

On 11/29/2013 08:09 PM, Thomas Gleixner wrote:
> On Fri, 29 Nov 2013, Preeti U Murthy wrote:
>> +static enum hrtimer_restart handle_broadcast(struct hrtimer *hrtimer)
>> +{
>> + struct clock_event_device *bc_evt = &bc_timer;
>> + ktime_t interval, next_bc_tick, now;
>> +
>> + now = ktime_get();
>> +
>> + if (!restart_broadcast(bc_evt))
>> + return HRTIMER_NORESTART;
>> +
>> + interval = ktime_sub(bc_evt->next_event, now);
>> + next_bc_tick = get_next_bc_tick();
>
> So you're seriously using a hrtimer to poll in HZ frequency for
> updates of bc->next_event?
>
> To be honest, this design sucks.
>
> First of all, why is this a PPC specific feature? There are probably
> other architectures which could make use of this. So this should be
> implemented in the core code to begin with.
>
> And a lot of the things you need for this are already available in the
> core in one form or the other.
>
> For a start you can stick the broadcast hrtimer to the cpu which does
> the timekeeping. The handover in the hotplug case is handled there as
> well as is the handover for the NOHZ case.
>
> This needs to be extended for this hrtimer broadcast thingy to work,
> but it shouldn't be that hard to do so.
>
> Now for the polling. That's a complete trainwreck.
>
> This can be solved via the broadcast IPI as well. When a CPU which
> goes down into deep idle sets the broadcast to expire earlier than the
> active value it can denote that and send the timer broadcast IPI over
> to the CPU which has the honour of dealing with this.
>
> This supports HIGHRES and NO_HZ if done right, without polling at
> all. So you can even let the last CPU which handles the broadcast
> hrtimer go for a long sleep, just not in the deepest idle state.

Thank you for the review. The above points are all valid. I will rework
the design to:

1. Eliminate the concept of a separate broadcast CPU and fold its
functionality into the timekeeping CPU.

2. Avoid polling: a CPU entering deep idle whose next wakeup is earlier
than the currently programmed broadcast expiry sends an IPI to the
timekeeping CPU, which then reprograms the broadcast hrtimer.

3. Make this feature generic and not arch-specific.
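
To make points 1 and 2 concrete, here is a minimal user-space model of
the bookkeeping involved. It is only a sketch under stated assumptions,
not kernel code and not the eventual patch: struct bc_model,
bc_enter_deep_idle() and bc_timer_fired() are hypothetical names, time
is a plain 64-bit counter, and the "IPI" is modelled as a direct update
plus a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS  4
#define NO_EVENT UINT64_MAX

/* Hypothetical model; none of these names are kernel APIs. */
struct bc_model {
	int tk_cpu;               /* timekeeping CPU, owns the broadcast timer */
	uint64_t bc_expiry;       /* currently programmed broadcast expiry */
	uint64_t wakeup[NR_CPUS]; /* per-CPU wakeup, NO_EVENT if not in deep idle */
	unsigned int ipis_sent;   /* number of reprogramming IPIs modelled */
};

static void bc_init(struct bc_model *m, int tk_cpu)
{
	m->tk_cpu = tk_cpu;
	m->bc_expiry = NO_EVENT;
	m->ipis_sent = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		m->wakeup[cpu] = NO_EVENT;
}

/*
 * Called by a CPU about to enter deep idle. Returns true if the
 * timekeeping CPU had to be told (via IPI) to reprogram the timer.
 */
static bool bc_enter_deep_idle(struct bc_model *m, int cpu, uint64_t next_event)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(cpu != m->tk_cpu); /* the timer owner must stay out of deep idle */

	m->wakeup[cpu] = next_event;
	if (next_event >= m->bc_expiry)
		return false; /* existing programming already covers this CPU */

	/* Earlier than the active value: stands in for the IPI + reprogram. */
	m->bc_expiry = next_event;
	m->ipis_sent++;
	return true;
}

/*
 * Runs on the timekeeping CPU when the broadcast timer fires at 'now'.
 * Returns a bitmask of CPUs to wake and reprograms the timer to the
 * earliest remaining wakeup (NO_EVENT if none is pending).
 */
static unsigned int bc_timer_fired(struct bc_model *m, uint64_t now)
{
	unsigned int woken = 0;
	uint64_t next = NO_EVENT;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (m->wakeup[cpu] <= now) {
			woken |= 1u << cpu;
			m->wakeup[cpu] = NO_EVENT;
		} else if (m->wakeup[cpu] < next) {
			next = m->wakeup[cpu];
		}
	}
	m->bc_expiry = next;
	return woken;
}
```

In this model, if CPU 1 goes idle until t=100, CPU 2 until t=200 and
CPU 3 until t=50, only CPU 1 and CPU 3 trigger a reprogram; CPU 2 is
already covered by the earlier expiry, so nothing has to poll.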

Regards
Preeti U Murthy
>
> Thanks,
>
> tglx