[RT] should pm_qos_resume_latency_us on one CPU affect latency on another?

From: Chris Friesen
Date: Tue Aug 13 2019 - 17:04:52 EST



Hi all,

Just wondering if what I'm seeing is expected. I'm using the CentOS 7 RT kernel with boot args of "skew_tick=1 irqaffinity=0 rcu_nocbs=1-27 nohz_full=1-27" among others.

Normally if I run cyclictest it sets /dev/cpu_dma_latency to zero. This gives worst-case latency around 6usec.
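
For reference, a /dev/cpu_dma_latency request only lasts while the file descriptor stays open, so a tool has to write the value and then hold the descriptor for the whole run. A minimal Python sketch of that pattern follows. The helper names and the `path` parameter are hypothetical and exist only for illustration; this is not cyclictest's actual code.

```python
import os
import struct


def encode_latency(latency_us):
    """Pack the latency target as a native-endian signed 32-bit integer."""
    return struct.pack("i", latency_us)


def hold_cpu_dma_latency(latency_us=0, path="/dev/cpu_dma_latency"):
    """Write the latency target and return the still-open file descriptor.

    The kernel drops the request when the descriptor is closed, so the
    caller must keep it open for as long as the constraint is needed.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(latency_us))
    except OSError:
        # Do not leak the descriptor if the write is rejected.
        os.close(fd)
        raise
    return fd
```

Calling hold_cpu_dma_latency(0) as root and closing the returned descriptor only after the measurement finishes should correspond to the "set to zero" case above.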

If I set /dev/cpu_dma_latency to something large and then set /sys/devices/system/cpu/cpu${num}/power/pm_qos_resume_latency_us to "2" for the CPUs that cyclictest is running on, the worst-case latency jumps to around 16usec.

If I set pm_qos_resume_latency_us to "2" for all CPUs on the system, then the worst-case latency comes back down. It's not sufficient to set it for all CPUs on the same socket as cyclictest.
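
The per-CPU setting described above can be scripted along these lines. This is only a sketch: the function name and the `sysfs_root` parameter are made up for illustration, and it has to run as root on a kernel that exposes the attribute.

```python
import os


def set_resume_latency(cpus, latency_us, sysfs_root="/sys/devices/system/cpu"):
    """Write latency_us to pm_qos_resume_latency_us for each CPU in cpus.

    Returns the list of CPU numbers whose attribute could not be written.
    """
    failed = []
    for cpu in cpus:
        path = os.path.join(
            sysfs_root, f"cpu{cpu}", "power", "pm_qos_resume_latency_us"
        )
        try:
            with open(path, "w") as attr:
                attr.write(str(latency_us))
        except OSError:
            # Missing CPU, missing attribute, or insufficient permissions.
            failed.append(cpu)
    return failed
```

For example, set_resume_latency(range(1, 28), 2) would cover CPUs 1-27 from the nohz_full boot argument; the "all CPUs" case needs the full range of CPU numbers on the machine.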

It does not seem to make any difference in the worst-case latency to set cpuset.sched_load_balance to zero for the cpuset containing cyclictest. (All cpusets but one have cpuset.sched_load_balance set to zero, and that one doesn't include the CPUs that cyclictest runs on.)

Looking at the latency traces, there does not appear to be any single culprit. I've seen cases where it appears to take extra time in migrate_task_rq_fair(), tick_do_update_jiffies64(), rcu_irq_enter(), and enqueue_entity().

I'm trying to dynamically isolate CPUs from the system for running RT tasks, but it seems like the rest of the system still affects the isolated CPUs.

Any comments/suggestions would be appreciated.

Thanks,
Chris