Re: [PATCH] riscv: Optimize switch_mm by passing "cpu" to flush_icache_deferred()

From: Palmer Dabbelt
Date: Wed May 26 2021 - 01:30:19 EST


On Tue, 11 May 2021 10:42:31 PDT (-0700), jszhang3@xxxxxxxxxxxxxxxx wrote:
From: Jisheng Zhang <jszhang@xxxxxxxxxx>

Pass the cpu directly to flush_icache_deferred() rather than calling
smp_processor_id() again.

Here are some performance numbers:

With a run of hackbench 30 times on a single core riscv64 Qemu instance
with 1GB memory:

without this patch: mean 36.934
with this patch: mean 36.104 (improved by 2.24%)

I don't really put any stock in QEMU performance numbers for this sort of thing, but for something like this, where we're just skipping an expensive call, I don't really see a reason to need performance numbers at all: we can just consider it an obvious optimization.

Signed-off-by: Jisheng Zhang <jszhang@xxxxxxxxxx>
---
arch/riscv/mm/context.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 68aa312fc352..6d445f2888ec 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -281,10 +281,9 @@ static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
* actually performs that local instruction cache flush, which implicitly only
* refers to the current hart.
*/
-static inline void flush_icache_deferred(struct mm_struct *mm)
+static inline void flush_icache_deferred(struct mm_struct *mm, unsigned int cpu)
{
#ifdef CONFIG_SMP
- unsigned int cpu = smp_processor_id();
cpumask_t *mask = &mm->context.icache_stale_mask;

if (cpumask_test_cpu(cpu, mask)) {

This function proceeds to perform only a local icache flush, which means it will break if any caller passes a CPU number other than the current one. The large comment at the top alludes to this behavior, but I went ahead and added a line to make it more explicit.

@@ -320,5 +319,5 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,

set_mm(next, cpu);

- flush_icache_deferred(next);
+ flush_icache_deferred(next, cpu);
}

Thanks, this is on for-next.