Re: [PATCH v6 0/7] arm_pmu: Use NMI for perf interrupt

From: Sumit Garg
Date: Fri Sep 04 2020 - 04:58:34 EST


Hi Alex,

On Wed, 19 Aug 2020 at 19:03, Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote:
>
> The series makes the arm_pmu driver use NMIs for the perf interrupt when
> NMIs are available on the platform (currently, only arm64 + GICv3). To make
> it easier to play with the patches, I've pushed a branch at [1]:
>
> $ git clone -b pmu-nmi-v6 git://linux-arm.org/linux-ae
>
> I've tested the series on an espressobin v7*. These are the results of
> running perf record -a -- sleep 60:
>
> 1. Without the patches:
>
> 16.73% [k] arch_local_irq_enable
> 12.20% [k] arch_cpu_idle
> 8.61% [k] _raw_spin_unlock_irqrestore
> 4.09% [k] arch_local_irq_enable
> 2.25% [k] arch_local_irq_enable
> 1.82% [k] arch_counter_get_cntpct
> [..]
>
> 2. Using NMIs:
>
> 3.37% [k] arch_counter_get_cntpct
> 2.62% [.] _IO_fwrite
> 1.62% [.] __gconv_transform_ascii_internal
> 1.49% [.] __mbrtowc
> 1.44% [k] el0_svc_common
> 1.31% [.] strchr
> [..]
>
> When running perf record -a -- iperf3 -c 127.0.0.1 -t 60:
>
> 1. Without the patches:
> 24.25% [k] __arch_copy_from_user
> 20.94% [k] __arch_copy_to_user
> 5.71% [k] arch_local_irq_enable
> 3.12% [k] _raw_spin_unlock_irqrestore
> 2.01% [k] __free_pages_ok
> 1.48% [k] arch_cpu_idle
> [..]
>
> 2. Using NMIs:
>
> 23.15% [k] __arch_copy_from_user
> 21.68% [k] __arch_copy_to_user
> 1.23% [k] tcp_ack
> 1.08% [k] tcp_sendmsg_locked
> 0.97% [k] rmqueue
> 0.91% [k] __free_pages_ok
> [..]
>
> I've ran the same tests in a VM when both host+guest use NMIs, and when
> neither use them. All of these tests were also ran on the model. Similar
> results in all cases.
>
> * All the firmware versions for espressobin v7 that I've tried clear
> SCR_EL3.FIQ, which means that NMIs don't work. To make them work on the
> board, I modified the GICv3 driver. That's why I would really appreciate
> someone testing this series on a board where NMIs work without any GIC
> changes.

This series works perfectly fine for me without any GIC changes on
Developerbox. FWIW:

Tested-by: Sumit Garg <sumit.garg@xxxxxxxxxx> (Developerbox)

-Sumit

> For people who want to test the series, but don't have a board
> with firmware that sets SCR_EL3.FIQ, I've pushed a branch [2] with the
> GICv3 drivers changes necessary to make NMIs work:
>
> $ git clone -b pmu-nmi-v6-nmi-fiq-clear-v2 git://linux-arm.org/linux-ae
>
> Summary of the patches:
> * Patch 1 is a fix for a bug that Julien found during the review for v4.
> * Patches 2 and 3 remove locking from arm64 perf event code.
> * Patches 4 and 5 makes the arm64 PMU interrupt handler NMI safe.
> * Patches 6 and 7 enable the use of NMIs on arm64 with a GICv3 irqchip.
>
> Changes since v5 [3]:
> - Rebased on top of v5.9-rc1.
> - Typo fixes.
> - Added comments to the ISB added by patches #1 and #2.
> - Reworded message for patch #4, as per Mark's excellent suggestion.
>
> Changes since v4 [4]:
> - Rebased on top of v5.8-rc1 and dropped the Tested-by tags because it's
> been almost a year since the series has been tested.
> - Dropped patch 3 because I couldn't find any instance where
> armv7pmu_read_counter() was called with interrupts enabled. I've also
> tested this by running several instances of perf for a few hours, and the
> function was called every time with interrupts disabled.
> - Dropped patches 4 and 5 because the tradeoff wasn't worth it in my
> opinion: the irq handler was slower all the time (because it
> saved/restored the counter select register), in exchange for being
> slightly faster on the rare ocassions when it triggered at the beginning
> of the critical sections.
> - Minor changes here and there to address review comments.
>
> Changes since v3 [5]:
> - Added tags
> - Fix build issue for perf_event_v6
> - Don't disable preemption in pmu->enable()
> - Always rely on IPI_IRQ_WORK to run the queued work
> - Fixed typos + cleanups
>
> Changes since v2 [6]:
> - Rebased on recent linux-next (next-20190708)
> - Fixed a number of bugs with indices (reported by Wei)
> - Minor style fixes
>
> Changes since v1 [7]:
> - Rebased on v5.1-rc1
> - Pseudo-NMI has changed a lot since then, use the (now merged) NMI API
> - Remove locking from armv7 perf_event
> - Use locking only in armv6 perf_event
> - Use direct counter/type registers insted of selector register for armv8
>
> [1] http://www.linux-arm.org/git?p=linux-ae.git;a=shortlog;h=refs/heads/pmu-nmi-v6
> [2] http://www.linux-arm.org/git?p=linux-ae.git;a=shortlog;h=refs/heads/pmu-nmi-v6-nmi-fiq-clear-v2
> [3] https://www.spinics.net/lists/kernel/msg3554236.html
> [4] https://lists.infradead.org/pipermail/linux-arm-kernel/2019-July/666824.html
> [5] https://lists.infradead.org/pipermail/linux-arm-kernel/2019-July/665339.html
> [6] https://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/640536.html
> [7] https://lists.infradead.org/pipermail/linux-arm-kernel/2018-January/554611.html
>
> Alexandru Elisei (1):
> arm64: perf: Add missing ISB in armv8pmu_enable_event()
>
> Julien Thierry (5):
> arm64: perf: Remove PMU locking
> arm64: perf: Defer irq_work to IPI_IRQ_WORK
> KVM: arm64: pmu: Make overflow handler NMI safe
> arm_pmu: Introduce pmu_irq_ops
> arm_pmu: arm64: Use NMIs for PMU
>
> Mark Rutland (1):
> arm64: perf: Avoid PMXEV* indirection
>
> arch/arm64/kernel/perf_event.c | 145 +++++++++++++++++++++------------
> arch/arm64/kvm/pmu-emul.c | 25 +++++-
> drivers/perf/arm_pmu.c | 142 +++++++++++++++++++++++++++-----
> include/kvm/arm_pmu.h | 1 +
> 4 files changed, 241 insertions(+), 72 deletions(-)
>
> --
> 2.28.0
>