Re: [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writesusing cluster groups

From: Cyrill Gorcunov
Date: Mon Feb 14 2011 - 10:10:18 EST


On 02/14/2011 02:45 PM, Ingo Molnar wrote:

* Cyrill Gorcunov<gorcunov@xxxxxxxxx> wrote:

In the case of x2apic cluster mode we can group IPI register writes based on the
cluster group instead of individual per-cpu destiantion messages. This reduces the
apic register writes and reduces the amount of IPI messages (in the best case we
can reduce it by a factor of 16).

With this change, microbenchmark measuring the cost of flush_tlb_others(), with
the flush tlb IPI being sent from a cpu in the socket-1 to all the logical cpus in
socket-2 (on a Westmere-EX system that has 20 logical cpus in a socket) is 3x
times better now (compared to the former 'send one-by-one' algorithm).

Pretty nice!

I have a few structural and nitpicking comments:

Thanks a lot for review, Ingo! I'll address all the nits during this week.

...

+void x2apic_init_cpu_notifier(void)
+{
+ int cpu = smp_processor_id();

+ zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
+ zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
+ BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu));

Such a BUG_ON() is not particularly user friendly - and this could trigger during
CPU hotplug events, i.e. while the system is fully booted up, right?

Thanks,

Ingo

Yup is not that much friendly but it's called during system bootup,
hotplug events are handled by

+static int __cpuinit
+cluster_setup(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+ unsigned int cpu = (unsigned long)hcpu;
+ int err = 0;
+
+ switch (action) {
+ case CPU_UP_PREPARE:
+ zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
+ zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
+ if (!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu)) {
+ free_cpumask_var(per_cpu(cpus_in_cluster, cpu));
+ free_cpumask_var(per_cpu(ipi_mask, cpu));
+ err = -ENOMEM;
+ }
+ break;

so it returns -ENOMEM if failed. And btw just noted that we forgot to make
x2apic_init_cpu_notifier being in __init section.

Or I miss something?

--
Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/