Re: [PATCH 2/3] x86: x2apic/cluster: Make use of lowest priority delivery mode

From: Alexander Gordeev
Date: Mon May 21 2012 - 10:48:34 EST


On Mon, May 21, 2012 at 02:40:26PM +0200, Ingo Molnar wrote:
> But that is not 'perfectly balanced' in many cases.
>
> When the hardware round-robins the interrupts then each
> interrupt will go to a 'cache cold' CPU in essence. This is
> pretty much the worst possible thing to do in most cases:
> while it's "perfectly balanced" in the sense of distributing
> cycles evenly between CPUs, each interrupt handler execution
> will generate an avalanche of cache misses, for cachelines that
> were modified in the previous invocation of the irq.

Absolutely.

There are at least two more offenders :) exercising lowest priority + logical
addressing similarly. So in this regard the patch is nothing new:


static inline unsigned int
default_cpu_mask_to_apicid(const struct cpumask *cpumask)
{
	return cpumask_bits(cpumask)[0] & APIC_ALL_CPUS;
}

static unsigned int summit_cpu_mask_to_apicid(const struct cpumask *cpumask)
{
	unsigned int round = 0;
	int cpu, apicid = 0;

	/*
	 * The cpus in the mask must all be on the apic cluster.
	 */
	for_each_cpu(cpu, cpumask) {
		int new_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);

		if (round && APIC_CLUSTER(apicid) != APIC_CLUSTER(new_apicid)) {
			printk("%s: Not a valid mask!\n", __func__);
			return BAD_APICID;
		}
		apicid |= new_apicid;
		round++;
	}
	return apicid;
}
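
To make the OR-ing above concrete, here is a minimal userspace sketch of the
same idea. It is not kernel code: the names CLUSTER_MASK, CLUSTER_OF, BAD_ID
and or_logical_ids() are made up for illustration, and it assumes the xAPIC
cluster-mode layout where the high nibble of a logical ID selects the cluster
and the low nibble carries one bit per CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: high nibble = cluster, low nibble = one bit per CPU. */
#define CLUSTER_MASK	0xF0u
#define CLUSTER_OF(id)	((id) & CLUSTER_MASK)
#define BAD_ID		0xFFu	/* hypothetical stand-in for BAD_APICID */

/*
 * OR together the logical IDs of all requested CPUs. The result addresses
 * every one of them at once, which is what lets lowest priority delivery
 * pick any CPU in the set. IDs from different clusters cannot be combined.
 */
static unsigned int or_logical_ids(const uint8_t *ids, size_t n)
{
	unsigned int dest = 0;
	size_t i;

	assert(ids != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (i && CLUSTER_OF(dest) != CLUSTER_OF(ids[i]))
			return BAD_ID;
		dest |= ids[i];
	}
	return dest;
}
```

With three CPUs of cluster 1 (logical IDs 0x11, 0x12 and 0x14) the combined
destination is 0x17, while mixing 0x11 with 0x22 is rejected.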

> One notable exception is when the CPUs are SMT/Hyperthreading
> siblings, in that case they are sharing even the L1 cache, so
> there's very little cost to round-robining the IRQs within the
> CPU mask.
>
> But AFAICS irqbalanced will spread irqs on wider masks than SMT
> sibling boundaries, exposing us to the above performance
> problem.

I would speculate that it is irqbalanced that should be cluster-agnostic (in
the case of x86) and ask for a mask, while the apic layer should just execute
the request or at least report what was actually set. But that is a different
topic.

Considering the bigger picture, it appears strange to me that the apic is the
layer that decides whether to make a CPU a target or not. This is especially
true when one wants a particular cpumask to be set, but still is not able to
do that due to the current limitation.

> So I think we need to tread carefully here.

Kernel parameter? IRQ line flag? Totally opposed? :)

> Thanks,
>
> Ingo

--
Regards,
Alexander Gordeev
agordeev@xxxxxxxxxx