Re: [PATCH v2 1/1] x86,sched: On AMD EPYC set freq_max = max_boost in schedutil invariant formula

From: Giovanni Gherdovich
Date: Tue Jan 26 2021 - 13:04:42 EST


On Mon, 2021-01-25 at 11:06 +0100, Peter Zijlstra wrote:
> On Fri, Jan 22, 2021 at 09:40:38PM +0100, Giovanni Gherdovich wrote:
> > 1. PROBLEM DESCRIPTION (over-utilization and schedutil)
> >
> > The problem happens on CPU-bound workloads spanning a large number of cores.
> > In this case schedutil won't select the maximum P-State. Actually, it's
> > likely that it will select the minimum one.
> >
> > A CPU-bound workload puts the machine in a state generally called
> > "over-utilization": an increase in CPU speed doesn't result in an increase of
> > capacity. The fraction of time tasks spend on CPU becomes constant regardless
> > of clock frequency (the tasks eat whatever we throw at them), and the PELT
> > invariant util goes up and down with the frequency (i.e. it's not invariant
> > anymore).
> >
> >                                      v5.10        v5.11-rc4
> >                                      ~~~~~~~~~~~~~~~~~~~~~~~~
> > CPU activity (mpstat)                80-90%       80-90%
> > schedutil requests (tracepoint)      always P0    mostly P2
> > CPU frequency (HW feedback)          ~2.2 GHz     ~1.5 GHz
> > PELT root rq util (tracepoint)       ~825         ~450
> >
> > mpstat shows that the workload is CPU-bound and usage doesn't change with
>
> So I'm having trouble with calling a 80%-90% workload CPU bound, because
> clearly there's a ton of idle time.

Yes, you're right. There is considerable idle time, and calling it CPU-bound is
a bit of a stretch.

Yet I don't think I'm completely off the mark. The busy time is the same with
the machine running at 1.5 GHz and at 2.2 GHz (it just takes longer to
finish). To me it seems like the CPU is the bottleneck, with some overhead on
top.
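
To make the feedback loop concrete, here is a toy model (a sketch only, not
kernel code). It assumes the busy fraction stays constant whatever the clock
is, as described above, and combines the two formulas from the schedutil
source comments: util_inv = util_raw * curr_freq / max_freq, and
next_freq = 1.25 * max_freq * util / max. The function names and the
parameters f_policy_max (max used in the schedutil request) and f_inv_max
(max used in the invariance scaling) are hypothetical; letting them differ
is an assumption made for illustration. The 1.5 and 2.2 GHz endpoints are
borrowed from the table, and the 3.0 GHz value is made up.

```python
def next_freq(f_cur, busy, f_policy_max, f_inv_max, f_min):
    """One schedutil-like step in the toy model (frequencies in GHz)."""
    # Frequency-invariant utilization on the 0..1024 scale:
    # raw busy fraction scaled by current frequency over the invariance max.
    util = min(1024.0, 1024.0 * busy * f_cur / f_inv_max)
    # schedutil-style request: 1.25 * max_freq * util / max.
    req = 1.25 * f_policy_max * util / 1024.0
    # Clamp to the range the (hypothetical) policy allows.
    return max(f_min, min(f_policy_max, req))


def settle(f_start, busy, f_policy_max, f_inv_max, f_min, steps=50):
    """Iterate the loop and return the frequency it ends up at."""
    f = f_start
    for _ in range(steps):
        f = next_freq(f, busy, f_policy_max, f_inv_max, f_min)
    return f


if __name__ == "__main__":
    # Same max in both formulas: per-step gain is 1.25 * 0.85 > 1.
    print(settle(1.5, 0.85, f_policy_max=2.2, f_inv_max=2.2, f_min=1.5))
    # Invariance max above the policy max (made-up 3.0 GHz): gain < 1.
    print(settle(2.2, 0.85, f_policy_max=2.2, f_inv_max=3.0, f_min=1.5))
```

In this model each step multiplies the frequency by
1.25 * busy * f_policy_max / f_inv_max. With busy = 0.85 and equal maxima the
factor is above 1 and the request climbs to the top; with the made-up 3.0 GHz
invariance max it is about 0.78 and the request sinks to the floor, even
though the busy fraction never changes.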

I will confirm what causes the idle time.


Giovanni