Re: [PATCH 5/5] cpuidle-haltpoll: fix up the branch check

From: Zhenzhong Duan
Date: Tue Nov 05 2019 - 01:53:04 EST



On 2019/11/4 23:01, Marcelo Tosatti wrote:
On Mon, Nov 04, 2019 at 11:10:25AM +0800, Zhenzhong Duan wrote:
On 2019/11/2 5:26, Marcelo Tosatti wrote:
On Sat, Oct 26, 2019 at 11:23:59AM +0800, Zhenzhong Duan wrote:
Ensure poll time is longer than block_ns, so there is a margin to
avoid the vCPU getting into a blocked state unnecessarily.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@xxxxxxxxxx>
---
drivers/cpuidle/governors/haltpoll.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c
index 4b00d7a..59eadaf 100644
--- a/drivers/cpuidle/governors/haltpoll.c
+++ b/drivers/cpuidle/governors/haltpoll.c
@@ -81,9 +81,9 @@ static void adjust_poll_limit(struct cpuidle_device *dev, unsigned int block_us)
u64 block_ns = block_us*NSEC_PER_USEC;
/* Grow cpu_halt_poll_us if
- * cpu_halt_poll_us < block_ns < guest_halt_poll_us
+ * cpu_halt_poll_us <= block_ns < guest_halt_poll_us
*/
- if (block_ns > dev->poll_limit_ns && block_ns <= guest_halt_poll_ns) {
+ if (block_ns >= dev->poll_limit_ns && block_ns < guest_halt_poll_ns) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If block_ns == guest_halt_poll_ns, you won't allow dev->poll_limit_ns to
grow. Why is that?
Maybe I'm being too strict here. My understanding is: if block_ns = guest_halt_poll_ns,
dev->poll_limit_ns will grow to guest_halt_poll_ns,
OK.

then block_ns = dev->poll_limit_ns,
block_ns = dev->poll_limit_ns = guest_halt_poll_ns. OK.

there is no margin to ensure the poll time is enough to cover an equal block time.
In that case, wouldn't shrinking be the better choice?
Ok, so you are considering _on the next_ halt instance, if block_ns =
guest_halt_poll_ns again?
Yes, though I realize it rarely happens at nanosecond granularity.

Then without the suggested modification: we don't shrink, poll for
guest_halt_poll_ns again.

With your modification: we shrink, because block_ns ==
guest_halt_poll_ns.

IMO what really clarifies things here is either the real sleep pattern
or a synthetic sleep pattern similar to the real thing.
Agree

Do you have a scenario where the current algorithm is maintaining
a low dev->poll_limit_ns and performance is hurt?

If you could come up with examples, such as the client/server pair at
https://lore.kernel.org/lkml/20190514135022.GD4392@xxxxxxxx/T/

or just a sequence of delays:
block_ns, block_ns, block_ns-1,...

It would be easier to visualize this.
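The suggested delay sequence can be visualized with a quick user-space simulation. The sketch below models only the grow/shrink logic of adjust_poll_limit() in Python; the parameter values are assumptions matching the module defaults (guest_halt_poll_ns = 200000, grow_start = 50000, grow = 2, shrink = 2), and the shrink condition is kept as in the unpatched code, since the hunk above only shows the grow condition.

```python
# Sketch: simulate haltpoll's adjust_poll_limit() grow/shrink logic on a
# synthetic delay sequence, comparing the pre-patch and patched grow checks.
# Parameter values below are assumed module defaults, not taken from the patch.

GUEST_HALT_POLL_NS = 200_000   # assumed guest_halt_poll_ns default
GROW_START = 50_000            # assumed guest_halt_poll_grow_start default
GROW = 2                       # assumed guest_halt_poll_grow default
SHRINK = 2                     # assumed guest_halt_poll_shrink default

def adjust(poll_limit_ns, block_ns, patched):
    """Return the new poll_limit_ns after one halt that blocked for block_ns."""
    if patched:
        # Patched condition: block_ns >= poll_limit_ns && block_ns < guest_halt_poll_ns
        grow = poll_limit_ns <= block_ns < GUEST_HALT_POLL_NS
    else:
        # Original condition: block_ns > poll_limit_ns && block_ns <= guest_halt_poll_ns
        grow = poll_limit_ns < block_ns <= GUEST_HALT_POLL_NS
    if grow:
        val = max(poll_limit_ns * GROW, GROW_START)
        return min(val, GUEST_HALT_POLL_NS)
    if block_ns > GUEST_HALT_POLL_NS:
        # Shrink branch as in the unpatched code (not shown in the hunk above).
        return poll_limit_ns // SHRINK
    return poll_limit_ns

def run(delays, patched):
    """Apply adjust() over a delay sequence, returning poll_limit_ns after each halt."""
    limit = 0
    trace = []
    for d in delays:
        limit = adjust(limit, d, patched)
        trace.append(limit)
    return trace

# The corner case under discussion: repeated block times exactly equal
# to guest_halt_poll_ns.
delays = [GUEST_HALT_POLL_NS] * 3
print(run(delays, patched=False))  # grows: 50000 -> 100000 -> 200000 (the cap)
print(run(delays, patched=True))   # never grows: equality is excluded from the grow check
```

With the original condition the limit ramps up to the cap and keeps polling for the full window; with the patched condition this particular sequence never triggers the grow branch at all.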

It looks hard to generate a sequence of delays with the same value in nanoseconds, which is also CPU-cycle granularity.

I think this patch doesn't help much in a real scenario, so please ignore it.

Thanks

Zhenzhong