[patch] softirq performance fixes, cleanups, 2.4.10.

From: Ingo Molnar (mingo@elte.hu)
Date: Wed Sep 26 2001 - 11:44:03 EST


the Linux softirq code still has a number of performance and latency
problems as of 2.4.10.

one issue is that there are still places in the kernel that disable/enable
softirq processing, but do not restart softirqs. This creates softirq
processing latencies, which can show up eg. as 'stuttering' packet
processing. Longer latencies between hard interrupts and soft interrupt
processing also decrease caching efficiency - if eg. a socket buffer was
touched in a network driver, it might get dropped from the cache by the
time the skb is processed by its softirq handler.

another problem is increased scheduling and softirq handling overhead due
to ksoftirqd, and related performance degradation in high-speed network
environments. (Performance drops of more than 10% were reported with
certain gigabit cards.) Under various multi-process networking loads
ksoftirqd is very active.

the attached softirq-2.4.10-A5 patch solves these two main problems and
also cleans up softirq.c.

main changes in softirq handling:

 - softirq handling can now be restarted N times within do_softirq(), if a
   softirq gets reactivated while it's being handled.

 - implemented a new scheduler mechanism, 'unwakeup()', to undo ksoftirqd
   wakeups if softirqs happen to be fully handled before ksoftirqd runs.
   (unwakeup() does not touch the runqueue lock if the task in question is
   already running.)

 - cpu_raise_softirq() used to wake ksoftirqd up - instead of handling
   softirqs immediately. All softirq users are using __cpu_raise_softirq()
   now, and have to call rerun_softirqs() after the softirq-atomic section
   has finished.
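
A minimal userspace sketch of the restart idea in the first item above; it
is not the patch itself. Pending work is a bitmask, each pass snapshots and
clears it, and the loop repeats while handlers re-raise bits. All names
(model_raise, model_do_softirq, MAX_PASSES, demo_handler) are hypothetical,
and handing leftover work to ksoftirqd once the pass limit is reached is an
assumption of this sketch, not something stated above.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SOFTIRQS 4   /* hypothetical: softirq slots in this model */
#define MAX_PASSES  4   /* hypothetical bound: first pass plus restarts */

typedef void (*softirq_fn)(unsigned int nr);

static unsigned int pending;               /* bit n set => softirq n raised */
static softirq_fn   handlers[NR_SOFTIRQS]; /* NULL => nothing registered */
static int          ksoftirqd_woken;       /* model of "wake the helper" */

/* Mark softirq nr as pending without running anything. */
static void model_raise(unsigned int nr)
{
    assert(nr < NR_SOFTIRQS);
    pending |= 1u << nr;
}

/* Run pending handlers, restarting while new bits appear.
 * Returns the number of passes that were made. */
static int model_do_softirq(void)
{
    int passes = 0;

    while (pending != 0 && passes < MAX_PASSES) {
        unsigned int todo = pending;   /* snapshot this pass's work */
        unsigned int nr;

        pending = 0;                   /* handlers may set bits again */
        passes++;
        for (nr = 0; nr < NR_SOFTIRQS; nr++) {
            if ((todo & (1u << nr)) && handlers[nr] != NULL)
                handlers[nr](nr);
        }
    }
    if (pending != 0)
        ksoftirqd_woken = 1;           /* assumption: defer the remainder */
    return passes;
}

static int rearm_budget;   /* how often the demo handler re-raises itself */
static int handler_runs;   /* how often the demo handler has run */

/* Demo handler: re-raises its own softirq while the budget lasts. */
static void demo_handler(unsigned int nr)
{
    handler_runs++;
    if (rearm_budget > 0) {
        rearm_budget--;
        model_raise(nr);
    }
}
```

In this model a handler that re-raises itself twice is drained in three
passes without the helper thread being involved; only a handler that keeps
re-raising past MAX_PASSES leaves work behind for it.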

none of these changes results in any change of tasklet or bottom-half
semantics.
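
The calling convention from the third item above (mark the softirq pending
inside the softirq-atomic section, rerun once the section has finished) can
be pictured with a small userspace model. This is only a sketch: the model_*
names are hypothetical, the real signatures of __cpu_raise_softirq() and
rerun_softirqs() are not shown here, and treating a rerun inside the atomic
section as a no-op is an assumption.

```c
#include <assert.h>

static unsigned int pending_mask;  /* model of the pending softirq bits */
static int atomic_depth;           /* >0 while inside a softirq-atomic section */
static int handled_count;          /* pending bits processed so far */

static void model_section_begin(void)
{
    atomic_depth++;
}

static void model_section_end(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Stand-in for the "raise only" step: sets the bit, runs nothing. */
static void model_raise_only(unsigned int nr)
{
    pending_mask |= 1u << nr;
}

/* Stand-in for the rerun step: processes pending bits, but only
 * outside of a softirq-atomic section (assumption of this model). */
static void model_rerun(void)
{
    if (atomic_depth > 0)
        return;
    while (pending_mask != 0) {
        /* isolate and clear the lowest set bit */
        unsigned int bit = pending_mask & (~pending_mask + 1u);

        pending_mask &= ~bit;
        handled_count++;
    }
}
```

The point of the pattern is that the raise itself stays cheap, and the
caller decides where the pending work gets picked up.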

the HTTP load situation i tested shows the following changes in scheduling
frequency:

                         context switches per second
                         (measured over a period of 10 seconds,
                          repeated 10 times and averaged.)

 2.4.10-vanilla: 39299

 2.4.10-softirq-A6: 35552

a 10.5% improvement. HTTP performance increased by 2%, but the system had
idle time left. Kernels with the softirq-A6 patch applied show almost no
ksoftirqd activity, while vanilla 2.4.10 shows frequent ksoftirqd
activation.
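
For reference, the two context switch counts above can be compared in two
ways, and the short sketch below computes both. The helper names are
hypothetical. Relative to the patched kernel, vanilla does about 10.5% more
context switches; relative to vanilla, the patched kernel does about 9.5%
fewer.

```c
#include <assert.h>

/* Percentage by which `after` is lower than `before`, relative to `before`. */
static double percent_reduction(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}

/* Percentage by which `before` exceeds `after`, relative to `after`. */
static double percent_excess(double before, double after)
{
    assert(after > 0.0);
    return (before - after) / after * 100.0;
}
```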

other fixes/cleanups to softirq.c:

 - removed 'mask' handling from do_softirq() - it's unnecessary due to the
   restarts. this further simplifies the code.

 - tasklet_hi_schedule() and tasklet_lo_schedule() are now rerunning
   softirqs, instead of just kicking ksoftirqd.

 - removed raise_softirq() and cpu_raise_softirq(), they are not used by
   any other code anymore. unexported them.

 - simplified argument passing between spawn_ksoftirqd() and ksoftirqd(),
   passing an argument by pointer and waiting for ksoftirqd tasks to start
   up is unnecessary.

 - it's unnecessary to spin scheduling in ksoftirqd() startup, waiting for
   the process to migrate - it's enough to call schedule() once, the
   scheduler will not run the task on the wrong CPU.

 - '[ksoftirqd_CPU0]' is confusing on UP systems, changed it to
   '[ksoftirqd]' instead.

 - simplified ksoftirqd()'s loop, it's both shorter and faster by a few
   instructions now.

 - __netif_schedule() is using __cpu_raise_softirq(), instead of
   cpu_raise_softirq() [which did not restart softirq handling, it only
   woke ksoftirqd up].

 - dev_kfree_skb_irq() ditto. (although this function is mostly called
   from IRQ contexts, where softirq restarts are not possible - but the
   IRQ code will restart them nevertheless, on IRQ exit.)

 - the generic definition of __cpu_raise_softirq() used to override
   any lowlevel definitions done in asm/softirq.h. It's now conditional so
   the architecture definitions should actually be used.
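
The last item is the usual "generic fallback only if nothing was defined
earlier" preprocessor pattern. A minimal single-file sketch, with a
hypothetical macro model_cpu_raise standing in for __cpu_raise_softirq()
and both "headers" collapsed into one file; the macro bodies are
illustrative and not taken from the patch:

```c
#include <assert.h>

#define MODEL_NR_CPUS 2

static unsigned long model_pending[MODEL_NR_CPUS]; /* per-CPU pending bits */
static int arch_version_used;                      /* set by the arch variant */

/* --- stand-in for an architecture header (seen first) --- */
#define model_cpu_raise(cpu, nr) \
    do { arch_version_used = 1; model_pending[(cpu)] |= 1UL << (nr); } while (0)

/* --- stand-in for the generic header (seen second) --- */
#ifndef model_cpu_raise
/* Generic fallback, only compiled in when no arch definition exists. */
#define model_cpu_raise(cpu, nr) \
    do { model_pending[(cpu)] |= 1UL << (nr); } while (0)
#endif

/* Raise softirq nr on cpu; returns 1 if the arch variant was the one used. */
static int model_raise_on(int cpu, int nr)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    model_cpu_raise(cpu, nr);
    return arch_version_used;
}
```

Without the #ifndef guard the second #define would redefine the macro, which
is the "generic definition overrides the lowlevel one" situation described
in the item.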

i've tested the patch both on UP and SMP systems, and saw no problems at
all. The changes decrease the size of softirq object code by ~8%. Network
packet handling appears to be smoother. (this is subjective, it's hard to
measure it). Ben, does this patch fix gigabit performance in your test, or
is something else still going on as well?

(The patch also applies cleanly to the -ac tree.)

        Ingo


