Re: [RFC 10/12] cgroup/drm: Introduce weight based drm cgroup control

From: Tvrtko Ursulin
Date: Fri Jan 27 2023 - 10:21:32 EST



On 27/01/2023 14:11, Michal Koutný wrote:
On Fri, Jan 27, 2023 at 01:31:54PM +0000, Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
I think you missed the finish_suspend_scanning() part:

	if (root_drmcs.suspended_period_us)
		cancel_delayed_work_sync(&root_drmcs.scan_work);

So if scanning was in progress migration will wait until it finishes.

Indeed, I've missed that. Thank you!

Not claiming I did not miss something because I was totally new to cgroup
internals when I started working on this. So it is definitely useful to have
more eyes looking.

The custom with (especially v2, especially horizontal) migrations
is that they're treated leniently to avoid performance costs.

I'm afraid waiting for scan in can_attach() can propagate globally (via
cgroup_update_dfl_csses() and cgroup_attach_lock()) sometimes.

That something along those lines might be a concern was indeed worrying me when coming up with the scheme. Good inside knowledge hint, thank you. I will have a deeper look.

OTOH, unless I misunderstood, you need to cover explicit (not task but
resource, when sending client FD around) migration anyway?

Correct. So far that was handled outside the cgroup controller, in the drm layer, and any lock dependency propagation was hidden behind RCU.
But that will likely change, and so become relevant, once I try your suggestion of eliminating the struct pid based client tracking.

(I.e. my suggestion would be to mutually exclude scanning and explicit
migration but not scanning and task migration in order to avoid possible
global propagation.)

Thanks, I will look into this all hopefully shortly. Perhaps what you suggest will come naturally with the removal of struct pid based tracking.
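
To make the suggested exclusion rule concrete, here is a minimal userspace sketch, not the controller code. It assumes a single lock shared by the scanner and the explicit (client FD) migration path, while task migration deliberately takes no lock. All names (drm_scan_sketch, scan_begin, scan_end, try_migrate_client_explicit, migrate_task) are hypothetical, and pthreads stand in for whatever locking the kernel side would end up using.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller state; not the real root_drmcs. */
struct drm_scan_sketch {
	pthread_mutex_t scan_lock;	/* scanning vs. explicit migration only */
	int explicit_migrations;
	int task_migrations;
};

static struct drm_scan_sketch sketch = {
	.scan_lock = PTHREAD_MUTEX_INITIALIZER,
};

/* The scanner holds the lock for the duration of one scan pass. */
static void scan_begin(void)
{
	pthread_mutex_lock(&sketch.scan_lock);
}

static void scan_end(void)
{
	pthread_mutex_unlock(&sketch.scan_lock);
}

/*
 * Explicit migration (a client FD handed to another cgroup) is excluded
 * from scanning. This sketch uses trylock so the exclusion is observable;
 * a real path would more likely block or retry.
 */
static bool try_migrate_client_explicit(void)
{
	if (pthread_mutex_trylock(&sketch.scan_lock) != 0)
		return false;	/* a scan is in progress */
	sketch.explicit_migrations++;
	pthread_mutex_unlock(&sketch.scan_lock);
	return true;
}

/*
 * Task migration never touches scan_lock, so it cannot end up waiting on
 * a scan. The counter uses a GCC/Clang atomic builtin instead.
 */
static void migrate_task(void)
{
	__atomic_fetch_add(&sketch.task_migrations, 1, __ATOMIC_RELAXED);
}
```

In this sketch, calling migrate_task() between scan_begin() and scan_end() goes straight through, while try_migrate_client_explicit() returns false until scan_end() has run.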

Regards,

Tvrtko