Re: [RFC PATCH 0/5] sched: cpu parked and push current task mechanism

From: Shrikanth Hegde
Date: Mon Jun 02 2025 - 00:26:08 EST




Hi.


----------------------------

vCPU - Virtual CPUs - CPU in VM world.
pCPU - Physical CPUs - CPU in baremetal world.

A hypervisor manages these vCPUs across the different VMs. When a vCPU requests CPU time, the hypervisor
schedules it on a pCPU.

The issue occurs when there are more vCPUs (combined across all VMs) than pCPUs. When *all* vCPUs
request CPU time, the hypervisor can run only some of them and the rest are preempted (waiting for a pCPU).
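
To put a number on the overcommit case above, here is a minimal userspace sketch. It is only a model
for illustration: the function name and parameters are hypothetical and are not part of the patch series
or of any hypervisor interface.

```c
#include <stddef.h>

/*
 * Hypothetical model: given how many vCPUs each VM wants to run right now,
 * return how many of them cannot get a pCPU and are left waiting (preempted).
 */
static unsigned int waiting_vcpus(const unsigned int *runnable_per_vm,
                                  size_t nr_vms, unsigned int nr_pcpus)
{
    unsigned int total = 0;

    /* Sum the runnable vCPUs across all VMs. */
    for (size_t i = 0; i < nr_vms; i++)
        total += runnable_per_vm[i];

    /* Only the excess over the physical CPU count has to wait. */
    return total > nr_pcpus ? total - nr_pcpus : 0;
}
```

For example, two VMs with 8 busy vCPUs each on a host with 8 pCPUs leave 8 vCPUs waiting at any time;
if each VM limits itself to 4 vCPUs, none have to wait.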


Take two VMs. When the hypervisor preempts a vCPU from VM1 to run a vCPU from VM2, it has to
save/restore VM context. If the VMs can instead coordinate with each other and request only a *limited* number of vCPUs,
that overhead is avoided and the context switching happens within a vCPU (less expensive). Even when the hypervisor
preempts one vCPU to run another within the same VM, that is still more expensive than task preemption within
a vCPU. So the *basic* aim is to avoid vCPU preemption.


To achieve this, use this parking concept (we need a better name for sure), where it is better
if workloads avoid some vCPUs at the moment. (The vCPUs stay online; we don't want the overhead of a sched domain rebuild.)


Contention is dynamic in nature. Detecting when there is contention for pCPUs is left to the
architecture. Archs need to update the mask regularly.

When there is contention, use only the limited set of vCPUs indicated by the arch.
When there is no contention, use all vCPUs.
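
The policy above can be sketched in plain userspace C. This is only a model under stated assumptions:
vCPUs are bits in a 64-bit mask, and the names (vcpu_mask_t, usable_vcpus, pick_vcpu) are hypothetical
and are not the interfaces used in the patches. The fallback to a parked vCPU when a task's affinity
leaves no other choice is also an assumption of this sketch, not something the series defines here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: one bit per vCPU, at most 64 vCPUs. */
typedef uint64_t vcpu_mask_t;

/*
 * Hypothetical helper: the set of vCPUs workloads should use right now.
 * 'parked' stands in for the mask the arch updates regularly.
 */
static vcpu_mask_t usable_vcpus(vcpu_mask_t online, vcpu_mask_t parked,
                                bool contention)
{
    /* Parked vCPUs stay online in this model. */
    assert((parked & ~online) == 0);

    if (!contention)
        return online;          /* no contention: use all vCPUs */
    return online & ~parked;    /* contention: avoid parked vCPUs */
}

/*
 * Hypothetical helper: lowest-numbered vCPU a task may run on,
 * preferring usable vCPUs. Returns -1 if the task has no allowed vCPU.
 */
static int pick_vcpu(vcpu_mask_t task_allowed, vcpu_mask_t usable)
{
    vcpu_mask_t candidates = task_allowed & usable;

    /* Assumption: affinity wins over parking if nothing else is left. */
    if (!candidates)
        candidates = task_allowed;

    for (int cpu = 0; cpu < 64; cpu++)
        if (candidates & ((vcpu_mask_t)1 << cpu))
            return cpu;
    return -1;
}
```

With 8 online vCPUs and the upper 4 parked, a task allowed on vCPUs 2-3 gets vCPU 2, while a task
pinned to vCPUs 4-7 still runs on vCPU 4 in this model.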


I hope this helps set the problem context. I am trying to get feedback on whether the approach makes sense.
I will go through the other push mechanisms we have (for example in rt/dl).