Re: [PATCH -v3] use per cpu data for single cpu ipi calls

From: Peter Zijlstra
Date: Fri Jan 30 2009 - 11:16:36 EST


On Fri, 2009-01-30 at 08:04 -0800, Linus Torvalds wrote:

> My only question is whether we might even drop the kmalloc() some day:
> I suspect that the CSD_FLAG_LOCK is essentially never a contention point,
> and the cost (and occasional synchronization) of kmalloc() quite possibly
> overwhelms any theoretical scaling ability.

IIRC the recent SL*B numbers posted showed that a kmalloc could be as
cheap as ~100 cycles or something. IPIs are sadly still a bit more
expensive.

> If another CPU hasn't even received its IPI before the same CPU sends the
> next one, I'm not sure we _want_ to send one, in fact.

I think the intent was to re-route IO-completion interrupts to whatever
cpu/node issued the IO with the idea that that cpu/node has the page
hottest etc. and transferring the completion is cheaper than bouncing
the page.

Since that would be relaying hardware interrupts, there isn't much you
can do about the rate; that's up to the firmware on the $$$ SCSI
thing.

But Jens already said that that path was using the __ variant and
providing its own csds, so the kmalloc isn't needed there and it might
all be moot.
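
For reference, a rough user-space model of what "per-cpu csd guarded by
CSD_FLAG_LOCK, no kmalloc" could look like. This is only a sketch using
C11 atomics, not the kernel code: struct csd_model, NR_CPUS_MODEL,
queue_single_call(), run_single_call() and count_hit() are names made up
for illustration, and the real IPI send is reduced to a comment.

```c
/* User-space model only; names are hypothetical, not the kernel API. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4        /* assumed CPU count for the model */
#define CSD_FLAG_LOCK 0x1u     /* set while a call is still in flight */

struct csd_model {
	atomic_uint flags;
	void (*func)(void *info);
	void *info;
};

/* One statically allocated slot per modelled CPU: no allocation needed. */
static struct csd_model per_cpu_csd[NR_CPUS_MODEL];

/* Claim the slot; returns 1 on success, 0 if the previous call is pending. */
static int csd_trylock(struct csd_model *csd)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong_explicit(&csd->flags, &expected,
			CSD_FLAG_LOCK, memory_order_acquire,
			memory_order_relaxed);
}

/* Sender spins until the target has finished with the previous call. */
static void csd_lock(struct csd_model *csd)
{
	while (!csd_trylock(csd))
		; /* busy-wait; contention is expected to be rare */
}

/* Release the slot so the owning CPU can reuse it. */
static void csd_unlock(struct csd_model *csd)
{
	unsigned int old = atomic_fetch_and_explicit(&csd->flags,
			~CSD_FLAG_LOCK, memory_order_release);

	assert(old & CSD_FLAG_LOCK);
	(void)old;
}

/* Sender side: fill in this CPU's slot instead of allocating one. */
static struct csd_model *queue_single_call(int this_cpu,
					   void (*func)(void *), void *info)
{
	struct csd_model *csd = &per_cpu_csd[this_cpu];

	csd_lock(csd);
	csd->func = func;
	csd->info = info;
	/* the real thing would queue csd on the target and send the IPI */
	return csd;
}

/* Receiver side: run the callback, then hand the slot back. */
static void run_single_call(struct csd_model *csd)
{
	void (*func)(void *) = csd->func;
	void *info = csd->info;

	func(info);
	csd_unlock(csd);
}

/* Example callback: bump the counter that info points to. */
static void count_hit(void *info)
{
	++*(int *)info;
}
```

The point of the model: a second queue_single_call() from the same CPU
spins in csd_lock() until the receiver has run the first one, which is
the "has the other CPU even received its IPI yet" case from above.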

> But that's a secondary issue, and isn't a correctness thing, just a "do we
> really need three different allocations?" musing..

Nick, Jens, I was under the presumption that the kmalloc was needed for
something other than avoiding the deadlock; do you happen to remember what?
