Re: [PATCH] Re: Negative scalability by removal of

From: Andrew Morton (andrewm@uow.edu.au)
Date: Mon Nov 06 2000 - 09:27:07 EST


Andrea Arcangeli wrote:
>
> On Sat, Nov 04, 2000 at 09:22:58AM -0800, Linus Torvalds wrote:
> > We don't need to backport of the full exclusive wait queues: we could do
> > the equivalent of the semaphore inside the kernel around just accept(). It
> > wouldn't be a generic thing, but it would fix the specific case of
> > accept().
>
> The first wake-one patch floating around was against 2.2.x waitqueues and
> it's a very simple patch and it fixes the problem (it also gives LIFO
> to accept with the downside that it needs to do an O(N) browse on the
> waitqueue before doing the exclusive wakeup compared to 2.4.x that does
> the wake-one task selection in O(1) if everybody is sleeping in accept, but it
> does that FIFO unfortunately).

Andrea, the 2.4 wakeup code is only O(1) if called from process
context. For networking, where it is called from softirq context
it is O(N).

The waitqueue patch which I did for kumon@fujitsu does O(1) from
all contexts as well as LIFO and the results are fairly clear.

Here the 'server' just forks a bunch of children which sit
in accept(). The 'client' opens and closes the connection
as fast as it can, 10,000 times. All processes run on localhost:

2.4.0-test10-SMP on UP
======================

Servers Unpatched Patched
                        (secs) (secs)

  1 6 6
  2 6 6
  3 6 6
  4 6 6
  100 7 6
  200 7 6
  400 9 6
  600 10 6
  1000 13 6.6

2.4.0-test10-SMP on SMP (2 x x86)
=================================

Servers Unpatched Patched
  1 1.1 1.1
  2 1.2 1.3
  3 1.2 1.3
  4 1.2 1.3
  100 1.8 1.3
  200 2.3 1.3
  400 3.2 1.3
  600 4.1 1.3
  1000 6.0 1.3

Easy to explain. 5 seconds / 1000 servers / 10,000 iterations
is 500nSecs per loop in __wake_up_common. I figure that loop
has three misses and probably a TLB reload.

However the Apache performance was increased by 10% irrespective
of the number of servers - from 3 to 150. Presumably the effect
of the LIFO.

Frankly, the fact that we fall from 9100 conn/sec to 5500 when
there are 100 processes waiting on a socket doesn't seem
particularly important to me - this is a mistuned server.
I suggest that we not sweat it.

But the patch is a better implementation, so expect to hear more
about it when the time is appropriate :) (It currently does
LIFO for semaphores as well which, as you have explained is
wrong. Will fix.)

> The real problem that DaveM knows well is that TCP in 2.2.x will end doing
> three wakeups every time the socket moves from LISTEN to ESTABLISHED state, so
> it was really doing a wake-three not a wake-one :). So the brainer part
> is to fix TCP and not the scheduler/waitqueue part.

hmm.. Has this been changed in 2.4? The performance is the same.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Nov 07 2000 - 21:00:19 EST