Re: CPU Hotplug add/remove optimizations

From: Rohit Vaswani
Date: Fri Aug 06 2010 - 16:06:12 EST


On 8/3/2010 1:07 AM, Andi Kleen wrote:
Rohit Vaswani <rvaswani@xxxxxxxxxxxxxx> writes:

Hi,

We are trying to use cpu hotplug to turn off a cpu when it is not in
use to improve power management.
It might not be a big issue on smaller systems, but CPU hotunplug
involves stop_machine() and that is a very costly thing
to do as systems become larger.
I think that for users, the CPU hotplug add time is currently what matters more, so that they do not see that latency in the UI when the core comes up. We could accept the CPU hotunplug latency for the time being, because removing the core eventually saves power.
I am trying to optimize the cpu
hotplug add and cpu hotplug remove timings. Currently cpu hotplug add
takes around 250 ms and cpu hotplug remove takes 190 ms. For the
current purposes we want to assume that we are removing and adding the
same core. Since we are not actually replacing the core, it seems that
a lot of initialization work could be avoided by saving and restoring
state instead of calibrating the entire core again.
One such thing we have been looking at is that once a core is powered
up during cpu hotplug add, it runs the calibrate_delay routine to
calculate the value of loops_per_jiffy. In such a case could we bypass
the calibrate_delay function and just save and restore the value of
loops_per_jiffy?
Does this approach seem wrong to anyone?
It's wrong on a system that supports socket hotplug. The CPU you're
powering up again might not be the same.
Could we have a separate code path for bringing up the same core that we just hot-unplugged?
One way could be for the user to specify that the same core is being brought up, so that the calibrate_delay function can be skipped. If a new core is being added, the code path would calibrate the core again.
Currently the calibrate_delay function takes up almost the entire 250 ms of cpu hotplug add time. Thus, if we can skip that function call when we know that we are bringing up the same core, the cpu hotplug add could be almost instantaneous.
Is there a better way to accomplish this?
Are there any other issues that I may be missing in order to get this working?
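
To make the proposal concrete, here is a minimal userspace sketch of the "same core" fast path, written in C. It is not kernel code: MAX_CPUS, saved_lpj, slow_calibrate() and lpj_for_cpu_up() are hypothetical names, and slow_calibrate() only stands in for calibrate_delay() by returning a dummy value. The same_core flag models the hint that the user would have to supply.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CPUS 8               /* hypothetical limit for this sketch */

/* Per-CPU cache of the last calibrated value; 0 means "never calibrated". */
static unsigned long saved_lpj[MAX_CPUS];

/* Counts slow-path runs so the effect of the cache is observable. */
static unsigned int calibrations_run;

/* Stand-in for the expensive calibrate_delay() measurement. */
static unsigned long slow_calibrate(unsigned int cpu)
{
    calibrations_run++;
    return 1000000UL + cpu;      /* dummy value, not a real measurement */
}

/*
 * Return loops_per_jiffy for a CPU that is coming online.
 * same_core is the hypothetical caller-supplied hint that the core being
 * brought up is the one that was unplugged earlier.
 */
unsigned long lpj_for_cpu_up(unsigned int cpu, bool same_core)
{
    assert(cpu < MAX_CPUS);

    if (same_core && saved_lpj[cpu] != 0)
        return saved_lpj[cpu];   /* fast path: skip calibration */

    /* New or unknown core: calibrate and remember the result. */
    saved_lpj[cpu] = slow_calibrate(cpu);
    return saved_lpj[cpu];
}
```

Without the hint, or on the first bring-up of a CPU, the sketch always falls back to the slow path, so a replaced core would still be calibrated from scratch.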
In theory you could have some low level interface that distinguishes
these two cases, but right now that's not there.

Can we safely assume that the core will start at the same clock speed
at which the value was stored and then restored?
That neither.

-Andi
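
Since the core may not come back at the clock speed at which the value was stored, one possible workaround is to record the frequency together with loops_per_jiffy and rescale on restore. The C sketch below assumes that the delay loop count is proportional to the core clock and that the frequency at bring-up is known; neither assumption is confirmed in this thread. rescale_lpj() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rescale a saved loops_per_jiffy value from the frequency at which it was
 * measured (old_khz) to the frequency the core resumes at (new_khz).
 * Assumes the delay loop count is proportional to the core clock.
 */
unsigned long rescale_lpj(unsigned long lpj, unsigned int old_khz,
                          unsigned int new_khz)
{
    assert(old_khz != 0);
    /* 64-bit intermediate so lpj * new_khz does not overflow when
     * unsigned long is only 32 bits wide. */
    return (unsigned long)(((uint64_t)lpj * new_khz) / old_khz);
}
```

For example, a value of 4000000 stored at 800000 kHz would be restored as 2000000 if the core comes back at 400000 kHz.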
Thanks,
Rohit Vaswani

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.