Re: [PATCH] irqchip/gic-v3-its: Balance initial LPI affinity across CPUs

From: John Garry
Date: Mon Jan 20 2020 - 13:21:08 EST


On 20/01/2020 17:42, Marc Zyngier wrote:

Hi Marc,

static u64 its_irq_get_msi_base(struct its_device *its_dev)
@@ -2773,28 +2829,34 @@ static int its_irq_domain_activate(struct irq_domain *domain,
 {
         struct its_device *its_dev = irq_data_get_irq_chip_data(d);
         u32 event = its_get_event_id(d);
-        const struct cpumask *cpu_mask = cpu_online_mask;
-        int cpu;
+        int ret = 0, cpu = nr_cpu_ids;
+        const struct cpumask *reqmask;
+        cpumask_var_t mask;

-        /* get the cpu_mask of local node */
-        if (its_dev->its->numa_node >= 0)
-                cpu_mask = cpumask_of_node(its_dev->its->numa_node);
+        if (irqd_affinity_is_managed(d))
+                reqmask = irq_data_get_affinity_mask(d);
+        else
+                reqmask = cpu_online_mask;

-        /* Bind the LPI to the first possible CPU */
-        cpu = cpumask_first_and(cpu_mask, cpu_online_mask);
-        if (cpu >= nr_cpu_ids) {
-                if (its_dev->its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_23144)
-                        return -EINVAL;
+        if (!alloc_cpumask_var(&mask, GFP_KERNEL))
+                return -ENOMEM;

-                cpu = cpumask_first(cpu_online_mask);
+        its_compute_affinity(d, reqmask, mask);
+        cpu = its_pick_target_cpu(mask);
+        if (cpu >= nr_cpu_ids) {
+                ret = -EINVAL;
+                goto out;
         }

+        atomic_inc(per_cpu_ptr(&cpu_lpi_count, cpu));
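
For anyone following along, my reading of the new accounting is roughly the below: a per-cpu count of routed LPIs, with the target picked as the least-loaded online cpu in the computed mask. This is only a sketch of what I take cpu_lpi_count and its_pick_target_cpu to be doing, not the actual patch:

static DEFINE_PER_CPU(atomic_t, cpu_lpi_count);

/* Sketch: pick the online CPU in @mask with the fewest LPIs routed to it */
static int its_pick_target_cpu(const struct cpumask *mask)
{
        unsigned int cpu, best = nr_cpu_ids, min_count = UINT_MAX;

        for_each_cpu_and(cpu, mask, cpu_online_mask) {
                unsigned int count = atomic_read(per_cpu_ptr(&cpu_lpi_count, cpu));

                if (count < min_count) {
                        min_count = count;
                        best = cpu;
                }
        }

        return best;
}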

I wonder if we should only consider managed interrupts in this
accounting?

So cpu0 is effectively going to be excluded from the balancing, as it
will have so many lpis targeted at it.

Maybe, but only if the provided managed affinity gives you the
opportunity of placing the LPI somewhere else.

Of course, if there's no other cpu in the mask then so be it.

If the managed
affinity says CPU0 only, then that's where you end up.


If my debug code is correct (with the above fix), cpu0 had 763 interrupts targeted on my D06 initially :)

But it's not just cpu0. I find that the initial non-managed interrupt affinity masks are generally set to cpu cluster/numa node masks, so the first cpus in those masks end up a bit over-subscribed, and then we may be spreading the managed interrupts over fewer cpus in the mask.
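
To show what I meant about only counting managed interrupts: something like the below (just an illustration, untested, and its_inc_lpi_count is a made-up name here), so the pile of non-managed LPIs parked on cpu0 and on the first cpus of each cluster mask would not skew where the managed ones land.

static void its_inc_lpi_count(struct irq_data *d, int cpu)
{
        /* Illustration only: non-managed LPIs don't take part in the accounting */
        if (irqd_affinity_is_managed(d))
                atomic_inc(per_cpu_ptr(&cpu_lpi_count, cpu));
}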

This is a taste of the lpi distribution on my 96-core system:
cpu0 763
cpu1 2
cpu3 1
cpu4 2
cpu5 2
cpu6 0
cpu7 0
cpu8 2
cpu9 1
cpu10 0
...
cpu16 2
...
cpu24 8
...
cpu48 10 (numa node boundary)
...


And, for the others, even if we balance all the LPIs, won't irqbalance
(if running, obviously) come along and fiddle with these non-managed
interrupt affinities anyway?

Of course, irqbalance will move things around. But that should be to
CPUs that do not have too many screaming interrupts.

Thanks,

M.

Cheers,
John