Re: Unusually high system CPU usage with recent kernels

From: Tibor Billes
Date: Fri Aug 30 2013 - 10:47:03 EST


> From: Paul E. McKenney Sent: 08/30/13 03:24 AM
> To: Tibor Billes
> Subject: Re: Unusually high system CPU usage with recent kernels
>
> On Tue, Aug 27, 2013 at 10:05:42PM +0200, Tibor Billes wrote:
> > > From: Paul E. McKenney Sent: 08/26/13 06:28 AM
> > > On Sun, Aug 25, 2013 at 09:50:21PM +0200, Tibor Billes wrote:
> > > > From: Paul E. McKenney Sent: 08/24/13 11:03 PM
> > > > > On Sat, Aug 24, 2013 at 09:59:45PM +0200, Tibor Billes wrote:
> > > > > > From: Paul E. McKenney Sent: 08/22/13 12:09 AM
> > > > > > > On Wed, Aug 21, 2013 at 11:05:51PM +0200, Tibor Billes wrote:
> > > > > > > > > From: Paul E. McKenney Sent: 08/21/13 09:12 PM
> > > > > > > > > On Wed, Aug 21, 2013 at 08:14:46PM +0200, Tibor Billes wrote:
> > > > > > > > > > > From: Paul E. McKenney Sent: 08/20/13 11:43 PM
> > > > > > > > > > > On Tue, Aug 20, 2013 at 10:52:26PM +0200, Tibor Billes wrote:
> > > > > > > > > > > > > From: Paul E. McKenney Sent: 08/20/13 04:53 PM
> > > > > > > > > > > > > On Tue, Aug 20, 2013 at 08:01:28AM +0200, Tibor Billes wrote:
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I was using the 3.9.7 stable release and tried to upgrade to the 3.10.x series.
> > > > > > > > > > > > > > The 3.10.x series was showing unusually high (>75%) system CPU usage in some
> > > > > > > > > > > > > > situations, making things really slow. The latest stable I tried is 3.10.7.
> > > > > > > > > > > > > > I also tried 3.11-rc5, they both show this behaviour. This behaviour doesn't
> > > > > > > > > > > > > > show up when the system is idling, only when doing some CPU intensive work,
> > > > > > > > > > > > > > like compiling with multiple threads. Compiling with only one thread seems not
> > > > > > > > > > > > > > to trigger this behaviour.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > To be more precise I did a `perf record -a` while compiling a large C++ program
> > > > > > > > > > > > > > with scons using 4 threads, the result is appended at the end of this email.
> > > > > > > > > > > > >
> > > > > > > > > > > > > New one on me! You are running a mainstream system (x86_64), so I am
> > > > > > > > > > > > > surprised no one else noticed.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Could you please send along your .config file?
> > > > > > > > > > > >
> > > > > > > > > > > > Here it is
> > > > > > > > > > >
> > > > > > > > > > > Interesting. I don't see RCU stuff all that high on the list, but
> > > > > > > > > > > the items I do see lead me to suspect RCU_FAST_NO_HZ, which has some
> > > > > > > > > > > relevance to the otherwise inexplicable group of commits you located
> > > > > > > > > > > with your bisection. Could you please rerun with CONFIG_RCU_FAST_NO_HZ=n?
> > > > > > > > > > >
> > > > > > > > > > > If that helps, there are some things I could try.
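> > > > > > > > > > > (If it is easier than going through menuconfig, something like
> > > > > > > > > > > the following from the top of the source tree should flip just
> > > > > > > > > > > that one option; a rough sketch, adjust to your build setup:
> > > > > > > > > > >
> > > > > > > > > > > scripts/config --disable RCU_FAST_NO_HZ
> > > > > > > > > > > make oldconfig
> > > > > > > > > > >
> > > > > > > > > > > and then rebuild as usual.)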
> > > > > > > > > >
> > > > > > > > > > It did help. I didn't notice anything unusual when running with CONFIG_RCU_FAST_NO_HZ=n.
> > > > > > > > >
> > > > > > > > > Interesting. Thank you for trying this -- and we at least have a
> > > > > > > > > short-term workaround for this problem. I will put a patch together
> > > > > > > > > for further investigation.
> > > > > > > >
> > > > > > > > I don't specifically need this config option so I'm fine without it in
> > > > > > > > the long term, but I guess it's not supposed to behave like that.
> > > > > > >
> > > > > > > OK, good, we have a long-term workaround for your specific case,
> > > > > > > even better. ;-)
> > > > > > >
> > > > > > > But yes, there are situations where RCU_FAST_NO_HZ needs to work
> > > > > > > a bit better. I hope you will bear with me with a bit more
> > > > > > > testing...
> > > > > > >
> > > > > > > > > In the meantime, could you please tell me how you were measuring
> > > > > > > > > performance for your kernel builds? Wall-clock time required to complete
> > > > > > > > > one build? Number of builds completed per unit time? Something else?
> > > > > > > >
> > > > > > > > Actually, I wasn't being that sophisticated. I have a system monitor
> > > > > > > > applet on my top panel (using MATE, Linux Mint), four little graphs,
> > > > > > > > one of which shows CPU usage. Different colors indicate different kinds
> > > > > > > > of CPU usage. Blue shows user space usage, red shows system usage, and
> > > > > > > > two more show nice and iowait. During a normal compile it's almost
> > > > > > > > completely filled with blue user space CPU usage, only the top few
> > > > > > > > pixels show some iowait and system usage. With CONFIG_RCU_FAST_NO_HZ
> > > > > > > > set, about 3/4 of the graph was red system CPU usage and the rest was
> > > > > > > > blue. So I just looked for a pile of red on my graphs when I tested
> > > > > > > > different kernel builds. But the compile speed was also horrible; I
> > > > > > > > couldn't wait for the build to finish. Even the UI got unresponsive.
> > > > > > >
> > > > > > > We have been having problems with CPU accounting, but this one looks
> > > > > > > quite real.
> > > > > > >
> > > > > > > > Now I did some measuring. In the normal case a compile finished in 36
> > > > > > > > seconds, compiled 315 object files. Here are some output lines from
> > > > > > > > dstat -tasm --vm during compile:
> > > > > > > > ----system---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----swap--- ------memory-usage----- -----virtual-memory----
> > > > > > > >   time   |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw | used  free| used  buff  cach  free|majpf minpf alloc  free
> > > > > > > > 21-08 21:48:05| 91   8   2   0   0   0|   0  5852k|   0     0 |   0     0 |1413  1772 |   0  7934M| 581M 58.0M  602M 6553M|   0   71k   46k   54k
> > > > > > > > 21-08 21:48:06| 93   6   1   0   0   0|   0  2064k| 137B  131B|   0     0 |1356  1650 |   0  7934M| 649M 58.0M  604M 6483M|   0   72k   47k   28k
> > > > > > > > 21-08 21:48:07| 86  11   4   0   0   0|   0  5872k|   0     0 |   0     0 |2000  2991 |   0  7934M| 577M 58.0M  627M 6531M|   0   99k   67k   79k
> > > > > > > > 21-08 21:48:08| 87   9   3   0   0   0|   0  2840k|   0     0 |   0     0 |2558  4164 |   0  7934M| 597M 58.0M  632M 6507M|   0   96k   57k   51k
> > > > > > > > 21-08 21:48:09| 93   7   1   0   0   0|   0  3032k|   0     0 |   0     0 |1329  1512 |   0  7934M| 641M 58.0M  626M 6469M|   0   61k   48k   39k
> > > > > > > > 21-08 21:48:10| 93   6   0   0   0   0|   0  4984k|   0     0 |   0     0 |1160  1146 |   0  7934M| 572M 58.0M  628M 6536M|   0   50k   40k   57k
> > > > > > > > 21-08 21:48:11| 86   9   6   0   0   0|   0  2520k|   0     0 |   0     0 |2947  4760 |   0  7934M| 605M 58.0M  631M 6500M|   0  103k   55k   45k
> > > > > > > > 21-08 21:48:12| 90   8   2   0   0   0|   0  2840k|   0     0 |   0     0 |2674  4179 |   0  7934M| 671M 58.0M  635M 6431M|   0   84k   59k   42k
> > > > > > > > 21-08 21:48:13| 90   9   1   0   0   0|   0  4656k|   0     0 |   0     0 |1223  1410 |   0  7934M| 643M 58.0M  638M 6455M|   0   90k   62k   68k
> > > > > > > > 21-08 21:48:14| 91   8   1   0   0   0|   0  3572k|   0     0 |   0     0 |1432  1828 |   0  7934M| 647M 58.0M  641M 6447M|   0   81k   59k   57k
> > > > > > > > 21-08 21:48:15| 91   8   1   0   0   0|   0  5116k| 116B    0 |   0     0 |1194  1295 |   0  7934M| 605M 58.0M  644M 6487M|   0   69k   54k   64k
> > > > > > > > 21-08 21:48:16| 87  10   3   0   0   0|   0  5140k|   0     0 |   0     0 |1761  2586 |   0  7934M| 584M 58.0M  650M 6502M|   0  105k   64k   68k
> > > > > > > >
> > > > > > > > The abnormal case compiled only 182 object files in 6 and a half
> > > > > > > > minutes, then I stopped it. The same dstat output for this case:
> > > > > > > > ----system---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----swap--- ------memory-usage----- -----virtual-memory----
> > > > > > > >   time   |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw | used  free| used  buff  cach  free|majpf minpf alloc  free
> > > > > > > > 21-08 22:10:49| 27  62   0   0  11   0|   0     0 | 210B    0 |   0     0 |1414  3137k|   0  7934M| 531M 57.6M  595M 6611M|   0  1628  1250   322
> > > > > > > > 21-08 22:10:50| 25  60   4   0  11   0|   0    88k| 126B    0 |   0     0 |1337  3110k|   0  7934M| 531M 57.6M  595M 6611M|   0    91   128   115
> > > > > > > > 21-08 22:10:51| 26  63   0   0  11   0|   0   184k| 294B    0 |   0     0 |1411  3147k|   0  7934M| 531M 57.6M  595M 6611M|   0  1485   814   815
> > > > > > > > 21-08 22:10:52| 26  63   0   0  11   0|   0     0 | 437B  239B|   0     0 |1355  3160k|   0  7934M| 531M 57.6M  595M 6611M|   0    24    94    97
> > > > > > > > 21-08 22:10:53| 26  63   0   0  11   0|   0     0 | 168B    0 |   0     0 |1397  3155k|   0  7934M| 531M 57.6M  595M 6611M|   0   479   285   273
> > > > > > > > 21-08 22:10:54| 26  63   0   0  11   0|   0  4096B| 396B  324B|   0     0 |1346  3154k|   0  7934M| 531M 57.6M  595M 6611M|   0    27   145   145
> > > > > > > > 21-08 22:10:55| 26  63   0   0  11   0|   0    60k|   0     0 |   0     0 |1353  3148k|   0  7934M| 531M 57.6M  595M 6610M|   0    93   117    36
> > > > > > > > 21-08 22:10:56| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1341  3172k|   0  7934M| 531M 57.6M  595M 6610M|   0   158    87    74
> > > > > > > > 21-08 22:10:57| 26  62   1   0  11   0|   0     0 |  42B   60B|   0     0 |1332  3162k|   0  7934M| 531M 57.6M  595M 6610M|   0    56    82    78
> > > > > > > > 21-08 22:10:58| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1334  3178k|   0  7934M| 531M 57.6M  595M 6610M|   0    26    56    56
> > > > > > > > 21-08 22:10:59| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1336  3179k|   0  7934M| 531M 57.6M  595M 6610M|   0     3    33    32
> > > > > > > > 21-08 22:11:00| 26  63   0   0  11   0|   0    24k|  90B  108B|   0     0 |1347  3172k|   0  7934M| 531M 57.6M  595M 6610M|   0    41    73    71
> > > > > > > >
> > > > > > > > I have four logical cores, so 25% corresponds to one core. I don't know
> > > > > > > > whether the ~26% user CPU usage has anything to do with this or is just a
> > > > > > > > coincidence. The rest is ~63% system and ~11% hardware interrupt. Do these
> > > > > > > > numbers support what you suspect?
> > > > > > >
> > > > > > > The massive increase in context switches does come as a bit of a surprise!
> > > > > > > It does rule out my initial suspicion of lock contention, but then again
> > > > > > > the fact that you have only four CPUs made that pretty unlikely to begin
> > > > > > > with.
> > > > > > >
> > > > > > > 2.4k average context switches in the good case for the full run vs. 3,156k
> > > > > > > for about half of a run in the bad case. That is an increase of more
> > > > > > > than three orders of magnitude!
> > > > > > >
> > > > > > > Yow!!!
> > > > > > >
> > > > > > > Page faults are actually -higher- in the good case. You have about 6.5GB
> > > > > > > free in both cases, so you are not running out of memory. Lots more disk
> > > > > > > writes in the good case, perhaps consistent with its getting more done.
> > > > > > > Networking is negligible in both cases.
> > > > > > >
> > > > > > > Lots of hardware interrupts in the bad case as well. Would you be willing
> > > > > > > to take a look at /proc/interrupts before and after to see which one you
> > > > > > > are getting hit with? (Or whatever interrupt tracking tool you prefer.)
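> > > > > > > For example, a simple way to capture it would be something like this
> > > > > > > (the file names are just placeholders):
> > > > > > >
> > > > > > > cat /proc/interrupts > interrupts.before
> > > > > > > # ... run the build ...
> > > > > > > cat /proc/interrupts > interrupts.after
> > > > > > > diff interrupts.before interrupts.after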
> > > > > >
> > > > > > Here are the results.
> > > > > >
> > > > > > Good case before:
> > > > > >           CPU0       CPU1       CPU2       CPU3
> > > > > >  0:     17      0      0      0  IO-APIC-edge    timer
> > > > > >  1:    356      1     68      4  IO-APIC-edge    i8042
> > > > > >  8:      0      0      1      0  IO-APIC-edge    rtc0
> > > > > >  9:    330     14    449     71  IO-APIC-fasteoi  acpi
> > > > > > 12:     10    108    269   2696  IO-APIC-edge    i8042
> > > > > > 16:     36     10    111      2  IO-APIC-fasteoi  ehci_hcd:usb1
> > > > > > 17:     20      3     25      4  IO-APIC-fasteoi  mmc0
> > > > > > 21:      3      0     34      0  IO-APIC-fasteoi  ehci_hcd:usb2
> > > > > > 40:      0      1     12     11  PCI-MSI-edge    mei_me
> > > > > > 41:  10617    173   9959    292  PCI-MSI-edge    ahci
> > > > > > 42:    862     11    186     26  PCI-MSI-edge    xhci_hcd
> > > > > > 43:    107     77     27    102  PCI-MSI-edge    i915
> > > > > > 44:   5322     20    434     22  PCI-MSI-edge    iwlwifi
> > > > > > 45:    180      0    183     86  PCI-MSI-edge    snd_hda_intel
> > > > > > 46:      0      3      0      0  PCI-MSI-edge    eth0
> > > > > > NMI:          1          0          0          0   Non-maskable interrupts
> > > > > > LOC:      16312      15177      10840       8995   Local timer interrupts
> > > > > > SPU:          0          0          0          0   Spurious interrupts
> > > > > > PMI:          1          0          0          0   Performance monitoring interrupts
> > > > > > IWI:       1160        523       1031        481   IRQ work interrupts
> > > > > > RTR:          3          0          0          0   APIC ICR read retries
> > > > > > RES:      14976      16135       9973      10784   Rescheduling interrupts
> > > > > > CAL:        482        457        151        370   Function call interrupts
> > > > > > TLB:         70        106        352        230   TLB shootdowns
> > > > > > TRM:          0          0          0          0   Thermal event interrupts
> > > > > > THR:          0          0          0          0   Threshold APIC interrupts
> > > > > > MCE:          0          0          0          0   Machine check exceptions
> > > > > > MCP:          2          2          2          2   Machine check polls
> > > > > > ERR:          0
> > > > > > MIS:          0
> > > > > >
> > > > > > Good case after:
> > > > > >           CPU0       CPU1       CPU2       CPU3
> > > > > >  0:     17      0      0      0  IO-APIC-edge    timer
> > > > > >  1:    367      1     81      4  IO-APIC-edge    i8042
> > > > > >  8:      0      0      1      0  IO-APIC-edge    rtc0
> > > > > >  9:    478     14    460     71  IO-APIC-fasteoi  acpi
> > > > > > 12:     10    108    269   2696  IO-APIC-edge    i8042
> > > > > > 16:     36     10    111      2  IO-APIC-fasteoi  ehci_hcd:usb1
> > > > > > 17:     20      3     25      4  IO-APIC-fasteoi  mmc0
> > > > > > 21:      3      0     34      0  IO-APIC-fasteoi  ehci_hcd:usb2
> > > > > > 40:      0      1     12     11  PCI-MSI-edge    mei_me
> > > > > > 41:  16888    173   9959    292  PCI-MSI-edge    ahci
> > > > > > 42:   1102     11    186     26  PCI-MSI-edge    xhci_hcd
> > > > > > 43:    107    132     27    136  PCI-MSI-edge    i915
> > > > > > 44:   6943     20    434     22  PCI-MSI-edge    iwlwifi
> > > > > > 45:    180      0    183     86  PCI-MSI-edge    snd_hda_intel
> > > > > > 46:      0      3      0      0  PCI-MSI-edge    eth0
> > > > > > NMI:          4          3          3          3   Non-maskable interrupts
> > > > > > LOC:      26845      24780      19025      17746   Local timer interrupts
> > > > > > SPU:          0          0          0          0   Spurious interrupts
> > > > > > PMI:          4          3          3          3   Performance monitoring interrupts
> > > > > > IWI:       1637        751       1287        695   IRQ work interrupts
> > > > > > RTR:          3          0          0          0   APIC ICR read retries
> > > > > > RES:      26511      26673      18791      20194   Rescheduling interrupts
> > > > > > CAL:        510        480        151        370   Function call interrupts
> > > > > > TLB:        361        292        575        461   TLB shootdowns
> > > > > > TRM:          0          0          0          0   Thermal event interrupts
> > > > > > THR:          0          0          0          0   Threshold APIC interrupts
> > > > > > MCE:          0          0          0          0   Machine check exceptions
> > > > > > MCP:          2          2          2          2   Machine check polls
> > > > > > ERR:          0
> > > > > > MIS:          0
> > > > > >
> > > > > > Bad case before:
> > > > > >           CPU0       CPU1       CPU2       CPU3
> > > > > >  0:     17      0      0      0  IO-APIC-edge    timer
> > > > > >  1:    172      3     78      3  IO-APIC-edge    i8042
> > > > > >  8:      0      1      0      0  IO-APIC-edge    rtc0
> > > > > >  9:   1200    148    395     81  IO-APIC-fasteoi  acpi
> > > > > > 12:   1625      2    348     10  IO-APIC-edge    i8042
> > > > > > 16:     26     23    115      0  IO-APIC-fasteoi  ehci_hcd:usb1
> > > > > > 17:     16      3     12     21  IO-APIC-fasteoi  mmc0
> > > > > > 21:      2      2     33      0  IO-APIC-fasteoi  ehci_hcd:usb2
> > > > > > 40:      0      0     14     10  PCI-MSI-edge    mei_me
> > > > > > 41:  15776    374   8497    687  PCI-MSI-edge    ahci
> > > > > > 42:   1297    829    115     24  PCI-MSI-edge    xhci_hcd
> > > > > > 43:    103    149      9    212  PCI-MSI-edge    i915
> > > > > > 44:  13151    101    511     91  PCI-MSI-edge    iwlwifi
> > > > > > 45:    153    159      0    122  PCI-MSI-edge    snd_hda_intel
> > > > > > 46:      0      1      1      0  PCI-MSI-edge    eth0
> > > > > > NMI:         32         31         31         31   Non-maskable interrupts
> > > > > > LOC:      82504      82732      74172      75985   Local timer interrupts
> > > > > > SPU:          0          0          0          0   Spurious interrupts
> > > > > > PMI:         32         31         31         31   Performance monitoring interrupts
> > > > > > IWI:      17816      16278      13833      13282   IRQ work interrupts
> > > > > > RTR:          3          0          0          0   APIC ICR read retries
> > > > > > RES:      18784      21084      13313      12946   Rescheduling interrupts
> > > > > > CAL:        393        422        306        356   Function call interrupts
> > > > > > TLB:        231        176        235        191   TLB shootdowns
> > > > > > TRM:          0          0          0          0   Thermal event interrupts
> > > > > > THR:          0          0          0          0   Threshold APIC interrupts
> > > > > > MCE:          0          0          0          0   Machine check exceptions
> > > > > > MCP:          3          3          3          3   Machine check polls
> > > > > > ERR:          0
> > > > > > MIS:          0
> > > > > >
> > > > > > Bad case after:
> > > > > >           CPU0       CPU1       CPU2       CPU3
> > > > > >  0:     17      0      0      0  IO-APIC-edge    timer
> > > > > >  1:    415      3     85      3  IO-APIC-edge    i8042
> > > > > >  8:      0      1      0      0  IO-APIC-edge    rtc0
> > > > > >  9:   1277    148    428     81  IO-APIC-fasteoi  acpi
> > > > > > 12:   1625      2    348     10  IO-APIC-edge    i8042
> > > > > > 16:     26     23    115      0  IO-APIC-fasteoi  ehci_hcd:usb1
> > > > > > 17:     16      3     12     21  IO-APIC-fasteoi  mmc0
> > > > > > 21:      2      2     33      0  IO-APIC-fasteoi  ehci_hcd:usb2
> > > > > > 40:      0      0     14     10  PCI-MSI-edge    mei_me
> > > > > > 41:  17814    374   8497    687  PCI-MSI-edge    ahci
> > > > > > 42:   1567    829    115     24  PCI-MSI-edge    xhci_hcd
> > > > > > 43:    103    177      9    242  PCI-MSI-edge    i915
> > > > > > 44:  14956    101    511     91  PCI-MSI-edge    iwlwifi
> > > > > > 45:    153    159      0    122  PCI-MSI-edge    snd_hda_intel
> > > > > > 46:      0      1      1      0  PCI-MSI-edge    eth0
> > > > > > NMI:         36         35         34         34   Non-maskable interrupts
> > > > > > LOC:      92429      92708      81714      84071   Local timer interrupts
> > > > > > SPU:          0          0          0          0   Spurious interrupts
> > > > > > PMI:         36         35         34         34   Performance monitoring interrupts
> > > > > > IWI:      22594      19658      17439      14257   IRQ work interrupts
> > > > > > RTR:          3          0          0          0   APIC ICR read retries
> > > > > > RES:      21491      24670      14618      14569   Rescheduling interrupts
> > > > > > CAL:        441        439        306        356   Function call interrupts
> > > > > > TLB:        232        181        274        465   TLB shootdowns
> > > > > > TRM:          0          0          0          0   Thermal event interrupts
> > > > > > THR:          0          0          0          0   Threshold APIC interrupts
> > > > > > MCE:          0          0          0          0   Machine check exceptions
> > > > > > MCP:          3          3          3          3   Machine check polls
> > > > > > ERR:          0
> > > > > > MIS:          0
> > > > >
> > > > > Lots more local timer interrupts, which is consistent with the higher
> > > > > time in interrupt handlers for the bad case.
> > > > >
> > > > > > > One hypothesis is that your workload and configuration are interacting
> > > > > > > with RCU_FAST_NO_HZ to force very large numbers of RCU grace periods.
> > > > > > > Could you please check for this by building with CONFIG_RCU_TRACE=y,
> > > > > > > mounting debugfs somewhere and dumping rcu/rcu_sched/rcugp before and
> > > > > > > after each run?
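> > > > > > > If debugfs is not already mounted, something like the following should
> > > > > > > do it (the mount point is just the usual convention):
> > > > > > >
> > > > > > > mount -t debugfs nodev /sys/kernel/debug
> > > > > > > cat /sys/kernel/debug/rcu/rcu_sched/rcugp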
> > > > > >
> > > > > > Good case before:
> > > > > > completed=8756  gpnum=8757  age=0  max=21
> > > > > > after:
> > > > > > completed=14686  gpnum=14687  age=0  max=21
> > > > > >
> > > > > > Bad case before:
> > > > > > completed=22970  gpnum=22971  age=0  max=21
> > > > > > after:
> > > > > > completed=26110  gpnum=26111  age=0  max=21
> > > > >
> > > > > In the good case, (14686-8756)/40 = 148.25 grace periods per second, which
> > > > > is a fast but reasonable rate given your HZ=250. The bad case works out to
> > > > > (26110-22970)/40 = 78.5 grace periods per second, so not a large difference
> > > > > in the number of grace periods; extrapolating for the longer runtime, the
> > > > > bad case would accumulate maybe ten times as many in total, but there is
> > > > > not much change in grace-period rate per unit time.
> > > > >
> > > > > > The test scenario was the following in both cases (mixed English and pseudo-bash):
> > > > > > reboot, login, start terminal
> > > > > > cd project
> > > > > > rm -r build
> > > > > > cat /proc/interrupts >> somefile ; cat /sys/kernel/debug/rcu/rcu_sched/rcugp >> somefile
> > > > > > scons -j4
> > > > > > wait ~40 sec (good case finished, Ctrl-C in bad case)
> > > > > > cat /proc/interrupts >> somefile ; cat /sys/kernel/debug/rcu/rcu_sched/rcugp >> somefile
> > > > > >
> > > > > > I stopped the build in the bad case after about as much time as the good
> > > > > > case took to finish, so extra interrupts and RCU grace periods from a
> > > > > > longer runtime don't skew the results.
> > > > >
> > > > > That procedure works for me, thank you for laying it out carefully.
> > > > >
> > > > > I believe I see what is going on and how to fix it, though it may take
> > > > > me a bit to work things through and get a good patch.
> > > > >
> > > > > Thank you very much for your testing efforts!
> > > >
> > > > I'm glad I can help. I've been using Linux for many years; now I have a
> > > > chance to help the community and do something in return. I'm quite
> > > > enjoying this :)
> > >
> > > ;-)
> > >
> > > Here is a patch that is more likely to help. I am testing it in parallel,
> > > but figured I should send you a sneak preview.
> >
> > I tried it, but I don't see any difference in overall performance. The dstat
> > also shows the same as before.
> >
> > But I did notice something. Occasionally there is an increase in userspace
> > CPU usage, interrupts and context switches drop, and it really gets
> > more work done (scons printed commands more frequently). I checked that
> > this behaviour is also present without your patch; I just didn't notice it
> > before. Maybe you can make some sense out of it.
> >
> > ----system---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----swap--- ------memory-usage----- -----virtual-memory----
> >   time   |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw | used  free| used  buff  cach  free|majpf minpf alloc  free
> > 27-08 20:51:53| 23  62   5   0  11   0|   0     0 |   0     0 |   0     0 |1274  3102k|   0  7934M| 549M 56.0M  491M 6698M|   0    28   156   159
> > 27-08 20:51:54| 24  64   1   0  11   0|   0     0 |   0     0 |   0     0 |1317  3165k|   0  7934M| 549M 56.0M  491M 6698M|   0    53   189   182
> > 27-08 20:51:55| 33  50   6   2   9   0| 192k 1832k|   0     0 |   0     0 |1371  2442k|   0  7934M| 544M 56.0M  492M 6702M|   0   30k   17k   17k
> > 27-08 20:51:56| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1313  3220k|   0  7934M| 544M 56.0M  492M 6701M|   0    21   272   232
> > 27-08 20:51:57| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1319  3226k|   0  7934M| 544M 56.0M  492M 6701M|   0     8    96   112
> > 27-08 20:51:58| 25  63   0   0  12   0|   0     0 |   0     0 |   0     0 |1317  3224k|   0  7934M| 544M 56.0M  492M 6701M|   0    12   145   141
> > 27-08 20:51:59| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1317  3223k|   0  7934M| 544M 56.0M  492M 6701M|   0    54   193   191
> > 27-08 20:52:00| 25  63   0   0  12   0|   0    24k|   0     0 |   0     0 |1336  3216k|   0  7934M| 544M 56.0M  492M 6701M|   0    36   161   172
> > 27-08 20:52:01| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1313  3225k|   0  7934M| 544M 56.0M  492M 6701M|   0     9   107   107
> > 27-08 20:52:02| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1327  3224k|   0  7934M| 545M 56.0M  492M 6701M|   0    13   193   200
> > 27-08 20:52:03| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1311  3226k|   0  7934M| 545M 56.0M  492M 6701M|   0    13   114   114
> > 27-08 20:52:04| 25  63   0   0  12   0|   0     0 |   0     0 |   0     0 |1331  3223k|   0  7934M| 544M 56.0M  492M 6701M|   0    41   190   178
> > 27-08 20:52:05| 24  64   0   0  12   0|   0  8192B|   0     0 |   0     0 |1315  3222k|   0  7934M| 544M 56.0M  492M 6701M|   0    30   123   122
> > 27-08 20:52:06| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1314  3223k|   0  7934M| 544M 56.0M  492M 6701M|   0    16   187   195
> > 27-08 20:52:07| 25  63   1   0  12   0|2212k  192k|   0     0 |   0     0 |1637  3194k|   0  7934M| 544M 56.2M  494M 6699M|   0  1363  2590  1947
> > 27-08 20:52:08| 17  33  18  26   6   0|3208k    0 |   0     0 |   0     0 |1351  1766k|   0  7934M| 561M 56.3M  499M 6678M|   4   10k  7620  2055
> > 27-08 20:52:09| 47  21  16  13   4   0|4332k  628k|   0     0 |   0     0 |1400  1081k|   0  7934M| 647M 56.3M  504M 6587M|  10   24k   25k  1151
> > 27-08 20:52:10| 36  34  13  11   6   0|2636k 2820k|   0     0 |   0     0 |1451  1737k|   0  7934M| 598M 56.3M  507M 6632M|   5   19k   16k   28k
> > 27-08 20:52:11| 46  17  10  25   3   0|4288k  536k|   0     0 |   0     0 |1386   868k|   0  7934M| 613M 56.3M  513M 6611M|  24   13k  8908  3616
> > 27-08 20:52:12| 53  33   5   4   5   0|4740k 3992k|   0     0 |   0     0 |1773  1464k|   0  7934M| 562M 56.6M  521M 6654M|   0   36k   29k   40k
> > 27-08 20:52:13| 60  34   0   1   6   0|4228k 1416k|   0     0 |   0     0 |1642  1670k|   0  7934M| 593M 56.6M  526M 6618M|   0   36k   26k   17k
> > 27-08 20:52:14| 53  37   1   3   7   0|3008k 1976k|   0     0 |   0     0 |1513  1972k|   0  7934M| 541M 56.6M  529M 6668M|   0   10k  9986   23k
> > 27-08 20:52:15| 55  34   1   4   6   0|3636k 1284k|   0     0 |   0     0 |1645  1688k|   0  7934M| 581M 56.6M  535M 6622M|   0   43k   30k   18k
> > 27-08 20:52:16| 57  30   5   2   5   0|4404k 2320k|   0     0 |   0     0 |1715  1489k|   0  7934M| 570M 56.6M  541M 6627M|   0   39k   24k   26k
> > 27-08 20:52:17| 50  35   3   7   6   0|2520k 1972k|   0     0 |   0     0 |1699  1598k|   0  7934M| 587M 56.6M  554M 6596M|   0   65k   40k   32k
> > 27-08 20:52:18| 52  40   2   1   7   0|1556k 1732k|   0     0 |   0     0 |1582  1865k|   0  7934M| 551M 56.6M  567M 6619M|   0   35k   26k   33k
> > 27-08 20:52:19| 26  62   0   0  12   0|   0     0 |   0     0 |   0     0 |1351  3240k|   0  7934M| 551M 56.6M  568M 6619M|   0    86   440   214
> > 27-08 20:52:20| 26  63   0   0  11   0|   0   108k|   0     0 |   0     0 |1392  3162k|   0  7934M| 555M 56.6M  560M 6623M|   0  1801  1490  2672
> > 27-08 20:52:21| 25  63   0   0  12   0|   0     0 |   0     0 |   0     0 |1332  3198k|   0  7934M| 555M 56.6M  560M 6623M|   0    50   245   255
> > 27-08 20:52:22| 25  63   1   0  12   0|   0     0 |   0     0 |   0     0 |1350  3220k|   0  7934M| 556M 56.6M  560M 6622M|   0   755   544   286
> > 27-08 20:52:23| 27  62   1   0  11   0|   0   272k|   0     0 |   0     0 |1323  3092k|   0  7934M| 551M 56.6M  558M 6628M|   0   341  1464  3085
> > 27-08 20:52:24| 25  63   0   0  12   0|   0     0 |   0     0 |   0     0 |1334  3197k|   0  7934M| 551M 56.6M  559M 6627M|   0    63   514   273
> > 27-08 20:52:25| 25  63   0   0  12   0|   0    40k|   0     0 |   0     0 |1329  3243k|   0  7934M| 546M 56.6M  558M 6633M|   0   321   160  1679
> > 27-08 20:52:26| 39  50   1   1   9   0|  48k  644k|   0     0 |   0     0 |1500  2556k|   0  7934M| 552M 56.6M  560M 6625M|   0   30k   14k   12k
> > 27-08 20:52:27| 26  62   1   0  11   0|   0   192k|   0     0 |   0     0 |1380  3152k|   0  7934M| 553M 56.6M  560M 6624M|   0  2370   808   718
> > 27-08 20:52:28| 23  55  12   0  10   0|   0     0 |   0     0 |   0     0 |1247  2993k|   0  7934M| 553M 56.6M  561M 6624M|   0  1060   428   241
> > 27-08 20:52:29| 25  63   1   0  11   0|   0     0 |   0     0 |   0     0 |1318  3142k|   0  7934M| 554M 56.6M  561M 6623M|   0   663   442   198
> > 27-08 20:52:30| 25  64   0   0  12   0|   0   100k|   0     0 |   0     0 |1316  3212k|   0  7934M| 554M 56.6M  561M 6623M|   0    42   187   186
> > 27-08 20:52:31| 24  64   0   0  12   0|   0  4096B|   0     0 |   0     0 |1309  3222k|   0  7934M| 554M 56.6M  561M 6623M|   0     9    85    85
> > 27-08 20:52:32| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1297  3219k|   0  7934M| 554M 56.6M  561M 6623M|   0    23    95    84
> > 27-08 20:52:33| 25  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1339  3101k|   0  7934M| 555M 56.6M  557M 6625M|   0   923  1566  2176
> > 27-08 20:52:34| 25  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1333  3095k|   0  7934M| 555M 56.6M  559M 6623M|   0   114   701   195
> > 27-08 20:52:35| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1342  3036k|   0  7934M| 557M 56.6M  568M 6613M|   0   957  3225   516
> > 27-08 20:52:36| 26  60   2   0  11   0|   0    28k|   0     0 |   0     0 |1340  3091k|   0  7934M| 555M 56.6M  568M 6614M|   0  5304  5422  5609
> > 27-08 20:52:37| 25  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1348  3260k|   0  7934M| 556M 56.6M  565M 6617M|   0    30   156  1073
> > 27-08 20:52:38| 24  64   0   0  12   0|   0     0 |   0     0 |   0     0 |1314  3211k|   0  7934M| 556M 56.6M  549M 6633M|   0    11   105  4285
> > 27-08 20:52:39| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1353  3031k|   0  7934M| 558M 56.6M  559M 6620M|   0   847  3866   357
> > 27-08 20:52:40| 26  63   0   0  12   0|   0     0 |   0     0 |   0     0 |1309  3135k|   0  7934M| 569M 56.6M  566M 6602M|   0  3940  5727  1288
>
> Interesting. The number of context switches drops during the time
> that throughput improves. It would be good to find out what task(s)
> are doing all of the context switches. One way to find the pids of the
> top 20 context-switching tasks should be something like this:
>
> grep ctxt /proc/*/status | sort -k2nr | head -20
>
> You could use any number of methods to map back to the command.
> When generating my last patch, I was assuming that ksoftirqd would be
> the big offender. Of course, if it is something else, I should be
> taking a different approach.
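>
> For instance, one rough way to list those top 20 with their names might
> be something like the following (just a sketch; /proc/<pid>/comm holds
> the command name, and short-lived tasks may vanish between the reads):
>
> for p in $(grep ctxt /proc/*/status | sort -k2nr | head -20 |
>            cut -d/ -f3 | sort -un); do
>     printf '%6s %s\n' "$p" "$(cat /proc/$p/comm 2>/dev/null)"
> done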

I'll measure it after the weekend.

Tibor