[PATCH v6 0/7] arm_pmu: Use NMI for perf interrupt

From: Alexandru Elisei
Date: Wed Aug 19 2020 - 09:35:52 EST


The series makes the arm_pmu driver use NMIs for the perf interrupt when
NMIs are available on the platform (currently, only arm64 + GICv3). To make
it easier to play with the patches, I've pushed a branch at [1]:

$ git clone -b pmu-nmi-v6 git://linux-arm.org/linux-ae

I've tested the series on an espressobin v7*. These are the results of
running perf record -a -- sleep 60:

1. Without the patches:

16.73% [k] arch_local_irq_enable
12.20% [k] arch_cpu_idle
8.61% [k] _raw_spin_unlock_irqrestore
4.09% [k] arch_local_irq_enable
2.25% [k] arch_local_irq_enable
1.82% [k] arch_counter_get_cntpct
[..]

2. Using NMIs:

3.37% [k] arch_counter_get_cntpct
2.62% [.] _IO_fwrite
1.62% [.] __gconv_transform_ascii_internal
1.49% [.] __mbrtowc
1.44% [k] el0_svc_common
1.31% [.] strchr
[..]

When running perf record -a -- iperf3 -c 127.0.0.1 -t 60:

1. Without the patches:
24.25% [k] __arch_copy_from_user
20.94% [k] __arch_copy_to_user
5.71% [k] arch_local_irq_enable
3.12% [k] _raw_spin_unlock_irqrestore
2.01% [k] __free_pages_ok
1.48% [k] arch_cpu_idle
[..]

2. Using NMIs:

23.15% [k] __arch_copy_from_user
21.68% [k] __arch_copy_to_user
1.23% [k] tcp_ack
1.08% [k] tcp_sendmsg_locked
0.97% [k] rmqueue
0.91% [k] __free_pages_ok
[..]

I've ran the same tests in a VM when both host+guest use NMIs, and when
neither use them. All of these tests were also ran on the model. Similar
results in all cases.

* All the firmware versions for espressobin v7 that I've tried clear
SCR_EL3.FIQ, which means that NMIs don't work. To make them work on the
board, I modified the GICv3 driver. That's why I would really appreciate
someone testing this series on a board where NMIs work without any GIC
changes. For people who want to test the series, but don't have a board
with firmware that sets SCR_EL3.FIQ, I've pushed a branch [2] with the
GICv3 drivers changes necessary to make NMIs work:

$ git clone -b pmu-nmi-v6-nmi-fiq-clear-v2 git://linux-arm.org/linux-ae

Summary of the patches:
* Patch 1 is a fix for a bug that Julien found during the review for v4.
* Patches 2 and 3 remove locking from arm64 perf event code.
* Patches 4 and 5 makes the arm64 PMU interrupt handler NMI safe.
* Patches 6 and 7 enable the use of NMIs on arm64 with a GICv3 irqchip.

Changes since v5 [3]:
- Rebased on top of v5.9-rc1.
- Typo fixes.
- Added comments to the ISB added by patches #1 and #2.
- Reworded message for patch #4, as per Mark's excellent suggestion.

Changes since v4 [4]:
- Rebased on top of v5.8-rc1 and dropped the Tested-by tags because it's
been almost a year since the series has been tested.
- Dropped patch 3 because I couldn't find any instance where
armv7pmu_read_counter() was called with interrupts enabled. I've also
tested this by running several instances of perf for a few hours, and the
function was called every time with interrupts disabled.
- Dropped patches 4 and 5 because the tradeoff wasn't worth it in my
opinion: the irq handler was slower all the time (because it
saved/restored the counter select register), in exchange for being
slightly faster on the rare ocassions when it triggered at the beginning
of the critical sections.
- Minor changes here and there to address review comments.

Changes since v3 [5]:
- Added tags
- Fix build issue for perf_event_v6
- Don't disable preemption in pmu->enable()
- Always rely on IPI_IRQ_WORK to run the queued work
- Fixed typos + cleanups

Changes since v2 [6]:
- Rebased on recent linux-next (next-20190708)
- Fixed a number of bugs with indices (reported by Wei)
- Minor style fixes

Changes since v1 [7]:
- Rebased on v5.1-rc1
- Pseudo-NMI has changed a lot since then, use the (now merged) NMI API
- Remove locking from armv7 perf_event
- Use locking only in armv6 perf_event
- Use direct counter/type registers insted of selector register for armv8

[1] http://www.linux-arm.org/git?p=linux-ae.git;a=shortlog;h=refs/heads/pmu-nmi-v6
[2] http://www.linux-arm.org/git?p=linux-ae.git;a=shortlog;h=refs/heads/pmu-nmi-v6-nmi-fiq-clear-v2
[3] https://www.spinics.net/lists/kernel/msg3554236.html
[4] https://lists.infradead.org/pipermail/linux-arm-kernel/2019-July/666824.html
[5] https://lists.infradead.org/pipermail/linux-arm-kernel/2019-July/665339.html
[6] https://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/640536.html
[7] https://lists.infradead.org/pipermail/linux-arm-kernel/2018-January/554611.html

Alexandru Elisei (1):
arm64: perf: Add missing ISB in armv8pmu_enable_event()

Julien Thierry (5):
arm64: perf: Remove PMU locking
arm64: perf: Defer irq_work to IPI_IRQ_WORK
KVM: arm64: pmu: Make overflow handler NMI safe
arm_pmu: Introduce pmu_irq_ops
arm_pmu: arm64: Use NMIs for PMU

Mark Rutland (1):
arm64: perf: Avoid PMXEV* indirection

arch/arm64/kernel/perf_event.c | 145 +++++++++++++++++++++------------
arch/arm64/kvm/pmu-emul.c | 25 +++++-
drivers/perf/arm_pmu.c | 142 +++++++++++++++++++++++++++-----
include/kvm/arm_pmu.h | 1 +
4 files changed, 241 insertions(+), 72 deletions(-)

--
2.28.0