Re: [PATCH] Re: Negative scalability by removal of

From: Andrew Morton (andrewm@uow.edu.au)
Date: Mon Nov 06 2000 - 09:28:23 EST


Alan Cox wrote:
>
> > Even 2.2.x can be fixed to do the wake-one for accept(), if required.
>
> Do we really want to retrofit wake_one to 2.2. I know Im not terribly keen to
> try and backport all the mechanism. I think for 2.2 using the semaphore is a
> good approach. Its a hack to fix an old OS kernel. For 2.4 its not needed

It's a 16-liner! I'll cheerfully admit that this patch
may be completely broken, but hey, it's free. I suggest
that _something_ has to be done for 2.2 now, because
Apache has switched to unserialised accept().

More columns of numbers...

Apache connection rate. 2.2.18-pre19-UP, uniprocessor:

Servers Unpatched (conn/sec) Patched (conn/sec)
  3 371 375
  10 361 374
  30 361 374
  150 112 351

Same test as the last email - minimal connect/close to
localhost:

Servers Unpatched Patched
(waiters in accept()) conn/sec conn/sec

  1 1670 2000
  2 670 670
  3 470 470
  4 470 1670
  5 470 470
  6 1670 1670
  7 1670 1670
  8 1430 1670
  9 1430 1670
  10 1430 1670
  50 590 1430
  100 250 1250
  200 117 1110

The fact that the throughput is 3-4 time worse for 2, 3, 4 and 5
server processes is completely wierd. Perhaps some strange miss
pattern, but it doesn't do it on 2.4. I'll dump this problem
onto the netdev list, see if anyone has any bright ideas.

--- linux-2.2.18-pre19/include/linux/sched.h Sun Nov 5 11:46:54 2000
+++ linux-akpm/include/linux/sched.h Mon Nov 6 22:01:51 2000
@@ -79,6 +79,7 @@
 #define TASK_ZOMBIE 4
 #define TASK_STOPPED 8
 #define TASK_SWAPPING 16
+#define TASK_EXCLUSIVE 32
 
 /*
  * Scheduling policies
@@ -251,6 +252,7 @@
         struct task_struct *next_task, *prev_task;
         struct task_struct *next_run, *prev_run;
 
+ unsigned int task_exclusive; /* task wants wake-one semantics in __wake_up() */
 /* task state */
         struct linux_binfmt *binfmt;
         int exit_code, exit_signal;
@@ -370,6 +372,7 @@
 /* counter */ DEF_PRIORITY,DEF_PRIORITY,0, \
 /* SMP */ 0,0,0,-1, \
 /* schedlink */ &init_task,&init_task, &init_task, &init_task, \
+/* task_exclusive */ 0, \
 /* binfmt */ NULL, \
 /* ec,brk... */ 0,0,0,0,0,0, \
 /* pid etc.. */ 0,0,0,0,0, \
@@ -496,8 +499,8 @@
                                                     signed long timeout));
 extern void FASTCALL(wake_up_process(struct task_struct * tsk));
 
-#define wake_up(x) __wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE)
-#define wake_up_interruptible(x) __wake_up((x),TASK_INTERRUPTIBLE)
+#define wake_up(x) __wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE | TASK_EXCLUSIVE)
+#define wake_up_interruptible(x) __wake_up((x),TASK_INTERRUPTIBLE | TASK_EXCLUSIVE)
 
 #define __set_current_state(state_value) do { current->state = state_value; } while (0)
 #ifdef __SMP__
--- linux-2.2.18-pre19/kernel/sched.c Sun Nov 5 11:46:54 2000
+++ linux-akpm/kernel/sched.c Mon Nov 6 21:51:57 2000
@@ -892,6 +892,7 @@
 {
         struct task_struct *p;
         struct wait_queue *head, *next;
+ unsigned int done_exclusive, do_exclusive;
 
         if (!q)
                 goto out;
@@ -906,10 +907,18 @@
         if (!next)
                 goto out_unlock;
 
+ done_exclusive = 0;
+ do_exclusive = mode & TASK_EXCLUSIVE;
         while (next != head) {
                 p = next->task;
                 next = next->next;
                 if (p->state & mode) {
+ if (do_exclusive && p->task_exclusive) {
+ if (done_exclusive)
+ continue;
+ done_exclusive = 1;
+ }
+
                         /*
                          * We can drop the read-lock early if this
                          * is the only/last process.
--- linux-2.2.18-pre19/net/ipv4/tcp.c Sun Nov 5 11:46:54 2000
+++ linux-akpm/net/ipv4/tcp.c Mon Nov 6 21:26:48 2000
@@ -1619,6 +1619,7 @@
         struct wait_queue wait = { current, NULL };
         struct open_request *req;
 
+ current->task_exclusive = 1;
         add_wait_queue(sk->sleep, &wait);
         for (;;) {
                 current->state = TASK_INTERRUPTIBLE;
@@ -1632,6 +1633,8 @@
                         break;
         }
         current->state = TASK_RUNNING;
+ wmb();
+ current->task_exclusive = 0;
         remove_wait_queue(sk->sleep, &wait);
         return req;
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Nov 07 2000 - 21:00:19 EST