Re: [patch] genapic: optimize & fix APIC mode setup

From: Ingo Molnar
Date: Mon Nov 13 2006 - 10:05:47 EST



* Andi Kleen <ak@xxxxxxx> wrote:

> On the two CPU case it is basically always the same anyways because
> the loop is very cheap compared to the IPIs (IPIs tend to be thousands
> of cycles). [...]

the basic thing that you are missing is that some of the most important
IPI deliveries are /asynchronous/! For those it does not matter at all
that the IPI delivery itself might be thousands of cycles ...
Programming the IPI itself is cheap.

> for_each_cpu_mask is essentially just BSF with some glue.

yes - it's a minimum of 2 BSF scans, and that isn't exactly the cheapest
of instructions. It should be tens of cycles or more, combined with the
nonzero cost of passing a cpumask (which is 32 bytes) into the function
by value ...
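
For illustration only, here is a rough user-space sketch of what such a
mask walk involves. It is not the kernel's for_each_cpu_mask: every
sketch_* name is hypothetical, the 256-CPU size is an assumption chosen
only because it gives a 32-byte mask, and the GCC/Clang builtin
__builtin_ctzll stands in for BSF.

```c
#include <stdint.h>

/* Assumption: 256 possible CPUs, i.e. 4 x 64-bit words = 32 bytes. */
#define SKETCH_NR_CPUS 256
#define SKETCH_WORDS   (SKETCH_NR_CPUS / 64)

struct sketch_cpumask {
	uint64_t bits[SKETCH_WORDS];
};

/* Return the first set bit at or after 'start', or SKETCH_NR_CPUS if none. */
static int sketch_next_cpu(const struct sketch_cpumask *m, int start)
{
	int cpu = start;

	while (cpu < SKETCH_NR_CPUS) {
		int word = cpu / 64;
		uint64_t w = m->bits[word] >> (cpu % 64);

		if (w)
			return cpu + __builtin_ctzll(w); /* the bit scan */
		cpu = (word + 1) * 64;               /* skip to next word */
	}
	return SKETCH_NR_CPUS;
}

/*
 * Walk the mask and call a per-CPU hook (may be a null pointer).
 * The mask is deliberately taken by value, so all 32 bytes are copied
 * on every call. Returns the number of CPUs visited.
 */
static int sketch_send_to_mask(struct sketch_cpumask mask,
			       void (*send)(int cpu))
{
	int n = 0;
	int cpu;

	for (cpu = sketch_next_cpu(&mask, 0);
	     cpu < SKETCH_NR_CPUS;
	     cpu = sketch_next_cpu(&mask, cpu + 1)) {
		if (send)
			send(cpu);
		n++;
	}
	return n;
}
```

With only CPUs 0 and 1 set, this loop does one bit scan per target CPU
and then a final pass over the remaining words to discover that the
mask is exhausted, on top of the by-value copy of the mask.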

and your argument, again, is to hide a hotplug bug via extra cost (and
by causing additional regressions) rather than fix the hotplug bug? Your
position makes no sense to me.

> For the > 2 CPU case it is not that obvious -- in theory the hardware
> could optimize it to be more efficient, but it doesn't seem to. Or at
> least not in a good enough way to show significant differences.

how did you measure it?

> > you are still not getting it i think. The IO-APIC is still in
> > logical delivery mode on small systems,
>
> The IO-APIC delivery mode that is configured comes from genapic, and
> should be physical for physflat unless I'm totally confused about the
> code.

indeed, you are right. Btw., this is another change to io_apic.c that i
would never have ACKed ;-) This whole thing is hidden via something that
looks like a macro value:

entry.delivery_mode = INT_DELIVERY_MODE;

but is defined behind the curtain as:

#define INT_DELIVERY_MODE (genapic->int_delivery_mode)

same goes for TARGET_CPUS:

#define TARGET_CPUS (genapic->target_cpus())

ugh!
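
To make the complaint concrete, here is a small stand-alone sketch of
the pattern. Only the two #define lines are taken from the text above;
the struct layout, the sketch_* names and the values 1 and 0x3 are
placeholders invented for this example and are not the real genapic
definitions.

```c
/* Hypothetical stand-in for the real genapic structure. */
struct sketch_genapic {
	int int_delivery_mode;
	unsigned long (*target_cpus)(void);
};

static int sketch_calls; /* counts hidden function calls */

static unsigned long sketch_target_cpus(void)
{
	sketch_calls++;
	return 0x3UL; /* placeholder mask: CPUs 0 and 1 */
}

static struct sketch_genapic sketch_flat = {
	.int_delivery_mode = 1, /* placeholder value */
	.target_cpus       = sketch_target_cpus,
};

static struct sketch_genapic *genapic = &sketch_flat;

/* The two macros as quoted in the mail. */
#define INT_DELIVERY_MODE (genapic->int_delivery_mode)
#define TARGET_CPUS (genapic->target_cpus())

/*
 * Reads like two constant assignments, but it is really a pointer
 * dereference plus an indirect function call.
 */
static unsigned long sketch_setup_entry(int *delivery_mode)
{
	*delivery_mode = INT_DELIVERY_MODE;
	return TARGET_CPUS;
}
```

Nothing at the call site of sketch_setup_entry() hints that the
all-caps names depend on whichever genapic happens to be selected at
run time.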

physical delivery is bad on the IO-APIC anyway - today LowestPrio
delivery mode works again and the hardware can help us. In physical mode
there's no LowestPrio delivery mode...

> > the right solution is to use pure physical mode (both local APIC and
> > IO-APIC) only on large systems, and to use pure logical mode on
> > small systems - maybe with the combination of clustered mode as
> > well.
>
> I disagree. I think we should just use physical mode everywhere,
> except on the old i386 systems.

you are risking the introduction of the kind of regressions that i'm
seeing on my box: that physical delivery mysteriously doesn't work
occasionally. AFAIK Windows defaults to logical APIC delivery too on
small systems. (it in fact uses the TPR IIRC which can only work with
logical delivery)

So your suggestion would have us use an uncommon (and more
expensive ...) mode of IPI delivery - instead of limiting that rare
delivery mode to large SMP systems only ...

and again, please remind me of what you are trying to argue: that we
should keep the slower and less common delivery mode just to not have to
fix the hotplug bug?

Ingo