Re: Linux Interrupt Latency benchmark v4 avaible -> detected irq bottleneck before do_bottom_half()

Andrea Arcangeli (arcangeli@mbox.queen.it)
Fri, 28 Aug 1998 15:29:44 +0200 (CEST)


On Thu, 27 Aug 1998, Andrea Arcangeli wrote:

>This new feature has raised that the _only_ point where the kernel is not
>interrupt friendly is before the __sti() in do_bottom_half().
>
>And here the numbers:
>
>root@dragon:/home/andrea/devel/irq-latency/lil# insmod ./lil.o irq=3 num=2000000 detect=1
>Linux Interrupt Latency benchmark - Copyright (C) 1998 Andrea Arcangeli
>detected slow IRQ at: cycle 252593, latency 1653, eip c0117b05, CPU 0
>detected slow IRQ at: cycle 254190, latency 2009, eip c0117b05, CPU 0
>detected slow IRQ at: cycle 1560602, latency 1914, eip c017c00c, CPU 0
>detected slow IRQ at: cycle 1560659, latency 2100, eip c017c00c, CPU 0
>latency cycles: mean 542, max 2100, min 542

With an hacked version of lil + patch-ikd with kernel tracer enabled I
exactly discovered what is stalling the irq. It' s the timer irq.

Here the relevant part of the trace:

------------------------------------------------------------------------

Here lil is generating the irq. The way lil generate the irq is this:

cli()
gen irq() (the time of _one_ outb())
sti()

Unfortunately while the irq remains disabled also the timer interrupt
happens and due it s major priority it' s served before.

c2850079 _end +<2663d09/3fe13c8e> (1.88) pid(226) # generating the irq and sti()

c010c0f1 do_IRQ +<d/54> (0.41) pid(226) # run the first irq
c010bfe7 do_8259A_IRQ +<f/9c> (1.76) pid(226)
c010bf00 handle_IRQ_event +<10/64> (0.51) pid(226)
c010f866 pentium_timer_interrupt +<e/d8> (0.48) pid(226) # it' s the timer irq
c0114b11 do_timer +<d/58> (1.24) pid(226)
c010bc55 enable_8259A_irq +<d/38> (0.98) pid(226)
c010c119 do_IRQ +<35/54> (0.35) pid(226) # it' s finisched irq 0
c011b8b3 do_bottom_half +<f/7c> (2.25) pid(226) # irq 0 serve bh handlers

In do_bottom_half() run from irq 0 there is the first __sti() and so
the CPU run the pending irq 3 (lil_irq).

c010c0f1 do_IRQ +<d/54> (0.41) pid(226) # second irq 3 lower priority
c010bfe7 do_8259A_IRQ +<f/9c> (1.76) pid(226)
c010bf00 handle_IRQ_event +<10/64> (0.52) pid(226)
c28500f2 _end +<2663d82/3fe13c8e> (0.64) pid(226) # lil_irq()
c28500c5 _end +<2663d55/3fe13c8e> (1.05) pid(226)
c010bc55 enable_8259A_irq +<d/38> (0.98) pid(226)
c010c119 do_IRQ +<35/54> (0.39) pid(226) # finisched lil_irq()
c011b8b3 do_bottom_half +<f/7c> (0.38) pid(226) # do_bottom_half for irq 3
c010c133 do_IRQ +<4f/54> (0.63) pid(226) # exit irq3 and return in
do_bottom_half() of irq0

Here irq 0 continue to run the pending bh handlers...

c0114793 timer_bh +<13/384> (0.81) pid(226)
c01145d2 update_wall_time +<e/50> (0.33) pid(226)
c01144f5 update_wall_time_one_tick +<d/dc> (0.62) pid(226)
c0114723 update_process_times +<f/6c> (0.46) pid(226)
c0114627 update_one_process +<13/100> (1.20) pid(226)
c0124bf4 swap_tick +<10/ac> (0.32) pid(226)
c012522d free_memory_available +<d/4c> (1.83) pid(226)
c010c133 do_IRQ +<4f/54> (0.70) pid(226) # exit irq 0

c0113c4a del_timer +<e/44> (1.10) pid(226) # return to lil kernel code
c2850226 _end +<2663eb6/3fe13c8e> (1.11) pid(226)
c0116c33 printk +<13/178> (0.88) pid(226) # lil printk that say that lil
has detected a slow irq

So I think there' s really no way to reduce the maximal irq latency (that
happens meanly _once_ after a continuous train of 200000 irq). I don' t
think it' s a good idea to allow the timer interrupt to run with interrupt
enabled.... A _maximal_ latency of ~2557 cycles on a 166Mhz P5noMMX
(15,4uSec) is a very good time I think though... ;-). The mean is 5usec on
the same hardware.

Andrea[s] Arcangeli

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.altern.org/andrebalsa/doc/lkml-faq.html