Re: Bad Apache performance with Linux SMP

Andi Kleen (ak@muc.de)
Tue, 18 May 1999 20:25:20 +0200


On Tue, May 18, 1999 at 07:44:25PM +0200, Juergen Schmidt wrote:
> Andi Kleen wrote:
> > One culprit is most likely that the data copy for TCP sending runs completely
> > serialized. This can be fixed by replacing the
> >
> >     skb->csum = csum_and_copy_from_user(from,
> >                     skb_put(skb, copy), copy, 0, &err);
> >
> > in tcp.c:tcp_do_sendmsg with
> >
> >     unlock_kernel();
> >     skb->csum = csum_and_copy_from_user(from,
> >                     skb_put(skb, copy), copy, 0, &err);
> >     lock_kernel();
>
> Bingo !!!
>
> This raised performance from 270 rps to 802 rps when 64 clients were
> pulling a 4k HTML page. Single-CPU performance is at 890 rps -- but the
> new numbers are just from a very short run. (BTW: NT/IIS on 4 CPUs
> delivers 840 rps :-)

Cool :)

>
> > The patch does not violate any locking requirements in the kernel, because
> > the kernel lock could have been dropped at any time anyway when the copy_from_user
> > slept to swap a page in.
>
> I'd like to hear some comments from other people on this. Is this a
> proper patch or is it dangerous in any way?
>
> Linus, Alan, would you recommend running a machine with that patch?

They are both away (in Finland and at LinuxExpo).

Various more extensive versions of this idea (it is originally from Stephen
Tweedie, I think) have been tested and developed by Ingo Molnar, David Miller
and others. Unfortunately some versions caused crashes, but these were
explained by compiler overoptimizations (egcs 1.1 moved some instructions
across the optimization barrier in the locking macro, which caused races
under heavy load; gcc 2.7.2 should be fine). For the tcp_do_sendmsg case
there were no problems AFAIK.
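
To make "optimization barrier in the locking macro" a bit more concrete, here is a minimal single-threaded sketch in GCC-style C. The names toy_lock, toy_unlock, lock_depth and shared_counter are made up for illustration and are not the kernel's lock_kernel() implementation; only the barrier() idiom (an empty asm statement with a "memory" clobber) is the usual GCC construct. It shows where the barrier sits; it cannot reproduce the race itself.

```c
#include <assert.h>

/* Compiler-only barrier (GCC syntax): the "memory" clobber tells the
 * compiler not to move or cache memory accesses across this point.
 * It emits no CPU instruction. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical toy state, for illustration only. */
static int lock_depth;
static int shared_counter;

static void toy_lock(void)
{
    lock_depth++;
    barrier();      /* accesses in the critical section stay after this */
}

static void toy_unlock(void)
{
    assert(lock_depth > 0);
    barrier();      /* accesses in the critical section stay before this */
    lock_depth--;
}

/* Update shared state inside the toy critical section. */
static int bump_shared(void)
{
    int value;

    toy_lock();
    value = ++shared_counter;
    toy_unlock();
    return value;
}
```

If a compiler moves the access to shared_counter past one of the barrier() calls, the update effectively happens outside the lock, which is the kind of problem described above.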

If you don't believe me :) you can verify it:

csum_and_copy_from_user
  does user access
  hits a page that is not present
  CPU raises page not present exception
    calls arch/i386/mm/fault.c:do_page_fault()
      calls mm/memory.c:handle_mm_fault()
        calls handle_pte_fault (same file)
          finally calls do_swap_page (same file)
          and there at the end is:
              unlock_kernel();
              return 1;
  now the exception processing ends.
  in the interrupt return code (arch/i386/kernel/entry.S:ret_with_reschedule)
  it may switch to another process etc.
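
For anyone who wants to play with the idea outside the kernel, here is a rough user-space sketch of the pattern in the patch: drop the global lock around the copy-and-checksum step and take it again afterwards. Everything in it is an assumption for illustration: big_lock is a plain C11 spinlock standing in for the kernel lock, and copy_and_sum is a simple byte sum, not the real csum_and_copy_from_user.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the global kernel lock. */
static atomic_flag big_lock = ATOMIC_FLAG_INIT;

static void big_lock_acquire(void)      /* plays the role of lock_kernel() */
{
    while (atomic_flag_test_and_set_explicit(&big_lock, memory_order_acquire))
        ;                               /* spin until the lock is free */
}

static void big_lock_release(void)      /* plays the role of unlock_kernel() */
{
    atomic_flag_clear_explicit(&big_lock, memory_order_release);
}

/* Copy len bytes and return a running byte sum (toy checksum). */
static uint32_t copy_and_sum(uint8_t *dst, const uint8_t *src,
                             size_t len, uint32_t sum)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];
        sum += src[i];
    }
    return sum;
}

/* The caller holds big_lock on entry and holds it again on return.
 * The copy itself runs without the lock, so other CPUs are not
 * serialized behind it. */
static uint32_t send_chunk(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint32_t sum;

    assert(dst != NULL && src != NULL);
    big_lock_release();
    sum = copy_and_sum(dst, src, len, 0);
    big_lock_acquire();
    return sum;
}
```

This is only safe under the same condition as in the kernel case: the code around the copy must not rely on the lock being held continuously across it.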

> > Work to fix all these problems is underway.
>
> Will it come into 2.2 or 2.3 only ?

I expect the work will first be tested in 2.3, and then after some time
backported to a 2.2 "enterprise" release (similar to the 2.0.30 release,
which backported the 2.1 socket hash optimizations to make Linux scale
to a really huge number of sockets).

-Andi

-- 
This is like TV. I don't like TV.
