Re: [RFC v5 9/9] sched/deadline: also reclaim bandwidth not used by dl tasks

From: Peter Zijlstra
Date: Mon Mar 27 2017 - 10:05:25 EST


On Fri, Mar 24, 2017 at 04:53:02AM +0100, luca abeni wrote:

> +static inline
> +void __dl_update(struct dl_bw *dl_b, s64 bw)
> +{
> +	struct root_domain *rd = container_of(dl_b, struct root_domain, dl_bw);
> +	int i;
> +
> +	RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(),
> +			 "sched RCU must be held");
> +	for_each_cpu_and(i, rd->span, cpu_active_mask) {
> +		struct rq *rq = cpu_rq(i);
> +
> +		rq->dl.extra_bw += bw;
> +	}

So this is unfortunate (and we already have one such instance).

It effectively does a for_each_online_cpu() with IRQs disabled, and on
SGI class hardware that takes _forever_.
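
To make the cost concrete, here is a minimal userspace model of the quoted
loop. It is plain C, not kernel code: model_rq, model_dl_update() and the
boolean active[] array are hypothetical stand-ins for struct rq,
__dl_update() and cpu_active_mask. The point it illustrates is that every
bandwidth change does one write per active CPU in the span, so the work done
inside the IRQs-off section grows linearly with the CPU count.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU runqueue field the patch touches. */
struct model_rq {
	int64_t extra_bw;
};

/*
 * Model of the quoted loop: apply bw to every active CPU in the span.
 * Returns the number of runqueues written, i.e. the work done per call.
 */
static size_t model_dl_update(struct model_rq *rqs, const bool *active,
			      size_t nr_cpus, int64_t bw)
{
	size_t touched = 0;

	for (size_t i = 0; i < nr_cpus; i++) {
		if (!active[i])
			continue;	/* inactive CPUs are skipped, as with cpu_active_mask */
		rqs[i].extra_bw += bw;
		touched++;
	}
	return touched;
}
```

In this model the return value equals the number of active CPUs, which is
what makes the pattern painful on very large machines.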

This is also what I got stuck on when trying to rewrite AC (admission
control) to use Tommaso's recoverable thing. In the end I had to do a
2-stage try/commit variant, which ended up being a pain and I didn't finish.
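
The two-stage variant is not shown in this thread. Purely as a generic
illustration of what a try/commit split looks like (this is not the
unfinished AC rewrite, and every name below is hypothetical), a userspace
sketch might be: stage one reserves bandwidth against a shared total, and
stage two either commits the reservation or aborts it. The sketch assumes
the caller serializes access to the pool, for example with a lock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bandwidth pool; invariant: committed + reserved <= cap. */
struct ac_pool {
	uint64_t cap;		/* total admissible bandwidth */
	uint64_t committed;	/* bandwidth of admitted tasks */
	uint64_t reserved;	/* bandwidth held by in-flight tries */
};

/* Stage 1: reserve bw if it still fits; nothing is admitted yet. */
static bool ac_try(struct ac_pool *p, uint64_t bw)
{
	if (bw > p->cap - p->committed - p->reserved)
		return false;
	p->reserved += bw;
	return true;
}

/* Stage 2a: turn a successful reservation into an admission. */
static void ac_commit(struct ac_pool *p, uint64_t bw)
{
	p->reserved -= bw;
	p->committed += bw;
}

/* Stage 2b: drop the reservation because a later step failed. */
static void ac_abort(struct ac_pool *p, uint64_t bw)
{
	p->reserved -= bw;
}
```

A caller would run ac_try(), do whatever other work can fail, and then call
exactly one of ac_commit() or ac_abort() with the same bw.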

I'm not saying this patch is bad, but this is something we need to think
about.