Re: [RFC v3 00/12] DRM scheduling cgroup controller

From: Tvrtko Ursulin
Date: Thu Jan 26 2023 - 13:16:47 EST



On 26/01/2023 17:57, Tvrtko Ursulin wrote:
> On 26/01/2023 17:04, Tejun Heo wrote:
>> driver folks think about the current RFC tho. Is at least AMD on board with
>> the approach?
>
> Yes I am keenly awaiting comments from the DRM colleagues as well.

Forgot to mention one thing on this point which may interest AMD.

Some time ago I tested the super-primitive "throttling via lowering the scheduling priority" approach on a GuC-based i915 GPU, which supports only three priority levels, and FWIW it can be somewhat effective.

It certainly was effective for my main use case, which is "run this GPU workload in the background while I use the GPU for something else".

The actual test was along the lines of running a GPU hog in parallel with an interactive client that can measure dropped frames.

With equal drm.weight values the interactive client was seeing ~10 (i915 pre-GuC) or ~27 (i915 GuC) dropped frames per second (60 fps target). With the GPU hog's drm.weight lowered to a 1:10 ratio, that fell to ~3 dropped frames per second (all 3 occurring before the controller noticed the over-budget condition).
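To put the 1:10 ratio in perspective, here is a rough sketch of what it would mean if the GPU time budget were split among sibling groups in proportion to their weights, as cgroup v2 weight knobs usually work. The proportional split, the helper name weight_share_permille() and the concrete weight values are assumptions for illustration only, not code from the series.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the RFC: returns the share of the GPU
 * time budget, in permille, that sibling `idx` would receive if the budget
 * were divided strictly in proportion to the weights. The proportional
 * model itself is an assumption here.
 */
static unsigned int weight_share_permille(const unsigned int *weights,
					  size_t n, size_t idx)
{
	unsigned long sum = 0;
	size_t i;

	assert(weights && idx < n);

	/* Sum the weights of all siblings competing for the budget. */
	for (i = 0; i < n; i++)
		sum += weights[i];
	assert(sum > 0);

	/* Integer division, so the result is rounded down. */
	return (unsigned int)((1000UL * weights[idx]) / sum);
}
```

Under that assumption, a hog at weight 10 next to an interactive client at weight 100 would be entitled to roughly 9% of the GPU time, versus 50% with equal weights.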

The main takeaway is that an improved user experience is possible even with this primitive throttling method, and even on GPUs which support only three scheduling priority levels.

The main point, though, is that individual drivers are completely free to improve how they handle the over-budget signal. Nothing in the controller itself should preclude that.
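For illustration, the primitive method could be modelled in plain userspace C roughly as below. This is only a sketch under stated assumptions: struct client, enum prio and client_set_over_budget() are made-up names rather than i915 or DRM interfaces, and dropping straight to the lowest level is just one possible policy.

```c
#include <assert.h>
#include <stdbool.h>

/* Three levels, mirroring the "only three priority levels" case. */
enum prio { PRIO_LOW = 0, PRIO_NORMAL = 1, PRIO_HIGH = 2 };

/* Hypothetical per-client state; not an i915 or DRM structure. */
struct client {
	enum prio requested;	/* priority the client asked for */
	enum prio effective;	/* priority currently applied */
};

/*
 * Hypothetical handler for the controller's over budget signal.
 * While over budget the client runs at the lowest priority; once the
 * condition clears, the requested priority is restored.
 */
static void client_set_over_budget(struct client *c, bool over_budget)
{
	assert(c);

	if (over_budget)
		c->effective = PRIO_LOW;
	else
		c->effective = c->requested;
}
```

A driver with a more capable scheduler could replace the body of such a handler with something finer grained, which is exactly the freedom mentioned above.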

Regards,

Tvrtko