Though I could see some gains in overcommit, it hurt undercommit
in some workloads :(.

The gcc 4.4.7 compiler that I used on my test machine tends to
allocate stack space for variables instead of using registers when
a loop is present. So I try to avoid having a loop in the fast path. Also,
the count itself is rather arbitrary. For the first pass, I would like
to keep things simple. We can always enhance it once it is accepted and

Yes, agreed.

I have not yet tested on a bigger machine. I hope that a bigger machine
will see significant undercommit improvements.

Thanks for running the test. I am a bit confused about the terminology.
What exactly do undercommit and overcommit mean?

By undercommit I meant total #vcpus < #pcpus in a virtual environment, so
overcommit should not be an issue on bare metal.
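
To make the terminology concrete, here is a minimal sketch of that
definition in C. The enum and the helper classify_commit() are
hypothetical names used only for illustration, not anything from the
patch. Treating total #vcpus > #pcpus as overcommit, and the equal case
as its own category, is an assumption on my part.

```c
/* Hypothetical classification of a virtualization host's load. */
enum commit_level {
    UNDERCOMMIT,      /* total #vcpus <  #pcpus */
    FULLY_COMMITTED,  /* total #vcpus == #pcpus (assumed separate case) */
    OVERCOMMIT        /* total #vcpus >  #pcpus (assumed definition) */
};

/*
 * total_vcpus: sum of the vCPUs of all guests running on the host
 * pcpus:       number of physical CPUs on the host
 */
static enum commit_level classify_commit(unsigned int total_vcpus,
                                         unsigned int pcpus)
{
    if (total_vcpus < pcpus)
        return UNDERCOMMIT;
    if (total_vcpus == pcpus)
        return FULLY_COMMITTED;
    return OVERCOMMIT;
}
```

For example, two 8-vCPU guests on a 32-CPU host would be undercommitted
(16 < 32), while five such guests would be overcommitted (40 > 32).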

