[PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead

From: Waiman Long
Date: Wed Jan 09 2019 - 14:21:01 EST


v1: https://lkml.org/lkml/2019/1/7/899
v2: Fix a minor bug in patch 4 & update the cover-letter.

As newer systems have more and more IRQs and CPUs, reading /proc/stat
frequently is getting slower and slower.

It appears that the v1 idea of caching the IRQ counts to reduce the
frequency of percpu summation, with a sysctl parameter to control it,
was not well received.

I have looked into the use of percpu counters for counting interrupts.
However, I don't think percpu counters are the right choice, for the
following reasons.

1) There is a raw spinlock in the percpu_counter structure that may
need to be acquired in the update path. This can be a performance
drag especially if lockdep is enabled.

2) The percpu_counter structure is 40 bytes in size on 64-bit
systems, compared with just 8 bytes for the percpu count pointer plus
the additional 4 bytes that I introduced in patch 2, which may not
actually increase the size of the IRQ descriptor. With thousands
of IRQ descriptors, percpu counters can consume quite a lot more
memory. Memory consumption was a point that had been brought up in
the v1 patch review.

3) As the patch 4 commit log shows, quite a lot of CPU cycles are spent
looking up the radix tree to locate the IRQ descriptor for each
interrupt. That overhead will still be there even if I use percpu
counters, so using percpu counters alone won't be as performant as
this patch or my previous v1 patch. Patch 4 optimizes the descriptor
lookup process, which is independent of the percpu counter choice.

4) Patches 2 and 3 are the patches that modify the percpu counting aspect
of the IRQ counts. The number of changed lines of code is only 14. So
they are very simple changes.
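
As a rough illustration of point 1 above, here is a small userspace
model (not kernel code, and not taken from the patches) contrasting a
plain per-CPU count with a batched counter that folds its per-CPU deltas
into a shared total. The struct names, MODEL_NR_CPUS, MODEL_BATCH and the
lock_acquisitions field are all assumptions made for the sketch; the
field only counts how often a real implementation would have to take its
lock in the update path.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count for this model */
#define MODEL_BATCH   32  /* hypothetical fold threshold */

/* Plain per-CPU count: an update touches only the local slot. */
struct plain_count {
    unsigned int percpu[MODEL_NR_CPUS];
};

/* Batched counter: per-CPU deltas are folded into a shared total. */
struct batched_count {
    int64_t total;                   /* shared; lock-protected in a real design */
    int32_t percpu[MODEL_NR_CPUS];   /* local deltas not yet folded */
    unsigned long lock_acquisitions; /* how often the lock would be taken */
};

static void plain_inc(struct plain_count *c, int cpu)
{
    c->percpu[cpu]++;
}

static void batched_inc(struct batched_count *c, int cpu)
{
    int32_t local = c->percpu[cpu] + 1;

    if (local >= MODEL_BATCH) {
        /* Slow path: a real implementation would take the lock here. */
        c->lock_acquisitions++;
        c->total += local;
        c->percpu[cpu] = 0;
    } else {
        c->percpu[cpu] = local;
    }
}

/* Exact sum of a plain per-CPU count: walk every CPU slot. */
static uint64_t plain_sum(const struct plain_count *c)
{
    uint64_t sum = 0;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += c->percpu[cpu];
    return sum;
}

/* Exact sum of the batched counter: shared total plus unfolded deltas. */
static int64_t batched_sum(const struct batched_count *c)
{
    int64_t sum = c->total;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += c->percpu[cpu];
    return sum;
}
```

In this model both variants report the same exact sum, but the batched
one enters its slow path once per MODEL_BATCH increments on a given CPU,
while the plain one never touches shared state on update.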

This new patchset optimizes the way the IRQ counts are retrieved and gets
rid of the sysctl parameter altogether, while achieving a performance gain
that is close to that of the v1 patch. This is based on the idea that while
many IRQs can be supported by a system, only a handful of them are actually
being used in most cases. We can save a lot of time by focusing on
those active IRQs only and ignoring the rest.
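
The idea can be sketched with a small userspace model. This is only an
illustration under my own assumptions, not the code in the patches: the
model_irq and model_irq_table types, the active flag, the nr_active
counter and the table sizes are all hypothetical. The reader skips
descriptors that have never fired and stops as soon as it has visited
every active one.

```c
#include <assert.h>

#define MODEL_NR_CPUS 8     /* hypothetical CPU count */
#define MODEL_NR_IRQS 1024  /* hypothetical number of IRQ descriptors */

struct model_irq {
    unsigned int percpu[MODEL_NR_CPUS]; /* per-CPU interrupt counts */
    unsigned int active;                /* nonzero once the IRQ has fired */
};

struct model_irq_table {
    struct model_irq irqs[MODEL_NR_IRQS];
    unsigned int nr_active;             /* number of IRQs that have fired */
};

/* Update path: bump the local count, marking the IRQ active on first use. */
static void model_irq_fire(struct model_irq_table *t, unsigned int irq, int cpu)
{
    struct model_irq *d = &t->irqs[irq];

    if (!d->active) {
        d->active = 1;
        t->nr_active++;
    }
    d->percpu[cpu]++;
}

/*
 * Read path: do the per-CPU summation only for active IRQs.
 * *summed reports how many descriptors actually needed a summation.
 */
static unsigned long model_sum_active(const struct model_irq_table *t,
                                      unsigned int *summed)
{
    unsigned long total = 0;
    unsigned int n = 0;

    for (unsigned int irq = 0; irq < MODEL_NR_IRQS && n < t->nr_active; irq++) {
        const struct model_irq *d = &t->irqs[irq];

        if (!d->active)
            continue;   /* never fired: its count is known to be zero */
        for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
            total += d->percpu[cpu];
        n++;
    }
    *summed = n;
    return total;
}
```

With only two of the 1024 modeled IRQs ever firing, the reader performs
two per-CPU summations instead of 1024, and with no active IRQs the loop
exits immediately.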

Patch 1 is the same as that in v1 while the other 3 patches are new.

Waiman Long (4):
  /proc/stat: Extract irqs counting code into show_stat_irqs()
  /proc/stat: Only do percpu sum of active IRQs
  genirq: Track the number of active IRQs
  /proc/stat: Call kstat_irqs_usr() only for active IRQs

 fs/proc/stat.c          | 123 ++++++++++++++++++++++++++++++++++++++++++++----
 include/linux/irqdesc.h |   1 +
 kernel/irq/internals.h  |   6 ++-
 kernel/irq/irqdesc.c    |   7 ++-
 4 files changed, 125 insertions(+), 12 deletions(-)

--
1.8.3.1