Re: Plumbers: Tweaking scheduler policy micro-conf RFP

From: Peter Zijlstra
Date: Tue May 15 2012 - 07:58:21 EST


On Tue, 2012-05-15 at 14:35 +0300, Pantelis Antoniou wrote:
>
> Throughput: MIPS(?), bogo-mips(?), some kind of performance counter?

Throughput is too generic a term to put a unit on. For some people it's
tnx/s, for others it's frames/s; neither is much (if at all) related to
MIPS (database tnx require lots of IO, video encoding likes FPU/SIMD
stuff, etc.).

> Latency: usecs(?)

nsec (chips are really really fast and only getting faster), but nsecs
of what :-) That is, which latency are we going to measure?
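
To make the "nsecs of what" question concrete, here is a rough sketch of one
candidate: how late a sleeping task wakes up, as seen from userspace. This is
only one of several possible latencies, and the function name
sleep_overshoot_ns is made up for illustration. The numbers include timer
granularity and interpreter overhead, not just the scheduler.

```python
import time


def sleep_overshoot_ns(requested_ns=1_000_000, samples=20):
    """Return, per sample, how far the actual sleep differed from the request (ns)."""
    results = []
    for _ in range(samples):
        start = time.monotonic_ns()
        time.sleep(requested_ns / 1e9)          # ask to sleep for requested_ns
        elapsed = time.monotonic_ns() - start   # how long we were really away
        results.append(elapsed - requested_ns)  # difference from the request
    return results


if __name__ == "__main__":
    deltas = sorted(sleep_overshoot_ns())
    print("min %d ns, median %d ns, max %d ns"
          % (deltas[0], deltas[len(deltas) // 2], deltas[-1]))
```

Picking a different definition (wakeup-to-run, interrupt-to-handler, ...)
would need a different probe, which is the point of the question.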

> Power: Now that's a tricky one, we can't measure power directly, it's a
> function of the cpu load we run in a period of time, along with any
> history of the cstates & pstates of that period. How can we collect
> information about that? Also we need to take into account peripheral device
> power; GPUs are particularly power hungry.

Intel provides some measure of CPU power drain on recent chips (iirc),
but yeah, that doesn't include GPUs and other peripherals.
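
As a sketch of what "some measure of CPU power drain" could turn into:
assuming the hardware exposes a cumulative energy counter in microjoules
(e.g. the energy_uj files under the powercap sysfs interface, on kernels that
have it), average power over an interval is just the counter delta divided by
the elapsed time. The function name and the wrap_uj parameter are
hypothetical; the caller is assumed to know the value at which the counter
wraps.

```python
def average_power_watts(e0_uj, e1_uj, dt_s, wrap_uj=None):
    """Average power in watts between two cumulative energy readings.

    e0_uj, e1_uj: counter readings in microjoules, taken dt_s seconds apart.
    wrap_uj: modulus of the counter (assumption: supplied by the caller).
    """
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    delta_uj = e1_uj - e0_uj
    if delta_uj < 0:
        if wrap_uj is None:
            raise ValueError("counter went backwards and wrap_uj is unknown")
        delta_uj %= wrap_uj  # assume the counter wrapped exactly once
    return delta_uj / 1e6 / dt_s  # microjoules -> joules -> watts
```

This only covers whatever the counter covers; GPU and peripheral power would
still need their own source.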

> Thermal management: How to distribute load to the processors in such
> a way that the temperature of the die doesn't increase too much that
> we have to either go to a lower OPP or shut down the core all-together.
> This is in direct conflict with throughput since we'd have better performance
> if we could keep the same warmed-up cpu going.

Core-hopping.. yay! We have the whole sensors framework that provides an
interface to such hardware, the question is, do chips have enough
sensors spread on them to be useful?
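
One way to get a feel for the "enough sensors" question on a given box is to
enumerate what hwmon exposes through sysfs. A minimal sketch, assuming the
usual hwmon layout where temp*_input files hold millidegrees Celsius; the
function name read_temps_celsius is made up, and which sensor sits where on
the die is not something this can tell you.

```python
import glob
import os


def read_temps_celsius(root="/sys/class/hwmon"):
    """Map 'hwmonN/tempM_input' -> temperature in degrees Celsius."""
    temps = {}
    pattern = os.path.join(root, "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or non-numeric sensor: skip it
        temps[os.path.relpath(path, root)] = millideg / 1000.0
    return temps


if __name__ == "__main__":
    for name, temp in read_temps_celsius().items():
        print("%s: %.1f C" % (name, temp))
```

If this prints one or two entries for a multi-core package, per-core thermal
balancing has very little to work with.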

> Memory I/O: Some workloads are memory bandwidth hungry but do not need
> much CPU power. In the case of asymmetric cores it would make sense to move
> the memory bandwidth hog to a lower performance CPU without any impact.
> Probably need to use some kind of performance counter for that; not going
> to be very generic.

You're assuming the slower cores have the same memory bandwidth; isn't
that a dangerous assumption?

Anyway, the 'problem' with using PMCs from within the scheduler is
that 1) they're ass backwards slow on some chips (x86 anyone?), and 2) some
userspace gets 'upset' if it can't get at all of them.

So it has to be optional at best, and I hate knobs :-) Also, the more
information you're going to feed this load-balancer thing, the harder
all that becomes; you don't want to do the full nm! m-dimensional bin
fit.. :-)
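
To illustrate why the full fit is unattractive: with n tasks and p CPUs there
are p**n possible placements before the m resource dimensions are even
checked. The sketch below contrasts that count with a cheap greedy heuristic.
It is a generic textbook-style approach, not what the kernel load-balancer
does, and all names (exhaustive_assignments, greedy_place) are made up.

```python
def exhaustive_assignments(n_tasks, n_cpus):
    """Number of task-to-CPU placements an exhaustive search would visit."""
    return n_cpus ** n_tasks


def greedy_place(tasks, n_cpus, capacity):
    """Greedy m-dimensional placement.

    tasks: list of m-tuples (demand per resource dimension).
    capacity: m-tuple, identical for every CPU (a simplifying assumption).
    Returns a list with the chosen CPU per task, or None if nothing fits.
    """
    m = len(capacity)
    load = [[0] * m for _ in range(n_cpus)]
    placement = [None] * len(tasks)
    # Handle the most demanding tasks first (normalised total demand).
    order = sorted(range(len(tasks)),
                   key=lambda i: sum(d / c for d, c in zip(tasks[i], capacity)),
                   reverse=True)
    for i in order:
        best_cpu, best_peak = None, None
        for cpu in range(n_cpus):
            after = [load[cpu][k] + tasks[i][k] for k in range(m)]
            if any(after[k] > capacity[k] for k in range(m)):
                continue  # would overcommit some dimension on this CPU
            peak = max(after[k] / capacity[k] for k in range(m))
            if best_peak is None or peak < best_peak:
                best_cpu, best_peak = cpu, peak
        if best_cpu is not None:
            load[best_cpu] = [load[best_cpu][k] + tasks[i][k] for k in range(m)]
            placement[i] = best_cpu
    return placement
```

The greedy pass costs roughly n * p * m steps per balance, versus p**n
candidate placements for the exhaustive version, and it gives no optimality
guarantee in return.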


