RE: >10% performance degradation since 2.6.18

From: Chetan . Loke
Date: Mon Jul 06 2009 - 17:59:14 EST


> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx
> [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of
> Daniel J Blueman
> Sent: Sunday, July 05, 2009 7:01 AM
> To: Matthew Wilcox; Andi Kleen
> Cc: Linux Kernel; Jens Axboe; Arjan van de Ven
> Subject: Re: >10% performance degradation since 2.6.18
>
> On Jul 3, 9:10 pm, Arjan van de Ven <ar...@xxxxxxxxxxxxx> wrote:
> > On Fri, 3 Jul 2009 21:54:58 +0200
> >
> > Andi Kleen <a...@xxxxxxxxxxxxxx> wrote:
> > > > That would seem to be a fruitful avenue of investigation --
> > > > whether limiting the cards to a single RX/TX interrupt would be
> > > > advantageous, or whether spreading the eight interrupts out over
> > > > the CPUs would be advantageous.
> >
> > > The kernel should really do the per cpu binding of MSIs by default.
> >
> > ... so that you can't do power management on a per socket basis?
> > hardly a good idea.
> >
> > just need to use a new enough irqbalance and it will spread out the
> > interrupts unless your load is low enough to go into low power mode.
>
> I was finding newer kernels (>~2.6.24) would set the
> Redirection Hint bit in the MSI address vector, allowing the
> processors to deliver the interrupt to the lowest interrupt
> priority (eg idle, no powersave) core
> (http://www.intel.com/Assets/PDF/manual/253668.pdf pp10-66)
> and older irqbalance daemons would periodically naively
> rewrite the bitmask of cores, delivering the interrupt to a
> static one.
>
> Thus, it may be worth checking if disabling any older
> irqbalance daemon gives any win.
>
> Perhaps there is value in writing different subsets of cores
> to the MSI address vector core bitmask (with the redirection
> hint enabled) for different I/O queues on heavy interrupt
> sources? By default, it's all cores.
>

Possible enhancement:

1) Drain the responses in the xmit_frame() path. That is, post the TX
request and, just before returning, check whether there are any more
responses in the RX queue. This will reduce the interrupt load, but only
if the NIC firmware coalesces interrupts.
The network core should drain the responses, rather than the adapter's
xmit_frame() handler calling the drain routine. That way there is no need
to modify the individual xmit_frame() handlers.
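
To make the idea concrete, here is a rough toy model in Python. It is not
kernel code and not any real driver API: ToyNic, core_xmit(), the "tick"
coalescing timer and the event list are all made-up names for illustration.
It only assumes that the firmware raises an interrupt when its coalescing
timer expires while responses are still pending.

```python
from collections import deque


class ToyNic:
    """Hypothetical NIC with interrupt coalescing (not a real driver API)."""

    def __init__(self):
        self.rx_queue = deque()  # responses waiting for the host
        self.sent = []           # TX requests posted by the host
        self.handled = []        # responses processed by the host
        self.interrupts = 0      # interrupts actually raised

    def receive(self, response):
        # Firmware queues the response; no interrupt yet (coalescing).
        self.rx_queue.append(response)

    def drain(self):
        # Host-side poll: process everything currently queued.
        while self.rx_queue:
            self.handled.append(self.rx_queue.popleft())

    def coalescing_timer(self):
        # Assumed firmware behaviour: interrupt only if work is pending.
        if self.rx_queue:
            self.interrupts += 1
            self.drain()  # the interrupt handler drains the queue


def core_xmit(nic, frame, drain_on_xmit):
    """Stand-in for the network core's transmit path (assumed name)."""
    nic.sent.append(frame)  # post the TX request
    if drain_on_xmit:
        nic.drain()         # reap pending responses before returning


def run(events, drain_on_xmit):
    nic = ToyNic()
    for kind, payload in events:
        if kind == "rx":
            nic.receive(payload)
        elif kind == "tx":
            core_xmit(nic, payload, drain_on_xmit)
        elif kind == "tick":
            nic.coalescing_timer()
    return nic


# Made-up interleaving of receives, transmits and coalescing-timer ticks.
EVENTS = [
    ("rx", "r1"), ("tx", "t1"), ("tick", None),
    ("rx", "r2"), ("rx", "r3"), ("tx", "t2"), ("tick", None),
    ("rx", "r4"), ("tick", None),
]

baseline = run(EVENTS, drain_on_xmit=False)
drained = run(EVENTS, drain_on_xmit=True)
print(baseline.interrupts, drained.interrupts)
```

In this model the drain sits in core_xmit() rather than in the NIC object,
which mirrors the point above: the per-adapter transmit handlers stay
untouched. Whether the saving shows up on real hardware depends on how the
firmware actually coalesces.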


PS - I'm not familiar with the networking code.



Chetan Loke