Re: [PATCH v2 02/11] ARC: send ipi to all cpus sharing task mm in case of page fault

From: Vineet Gupta
Date: Tue May 30 2017 - 12:40:57 EST


On 05/27/2017 11:51 PM, Noam Camus wrote:
From: Noam Camus <noamca@xxxxxxxxxxxx>

This patch addresses a performance issue.
The use case is a page fault in a task whose mm is shared by more CPUs than the local one.
Broadcasting to all CPUs degrades performance,
so we avoid it by sending the IPI only to the relevant CPUs.

Signed-off-by: Noam Camus <noamca@xxxxxxxxxxxx>
Reviewed-by: Alexey Brodkin <abrodkin@xxxxxxxxxxxx>

This indeed looks like a nice optimization. Do you have any performance numbers, say from running hackbench or other multi-threaded workloads?

-Vineet

---
arch/arc/include/asm/cacheflush.h | 3 ++-
arch/arc/mm/cache.c | 12 ++++++++++--
arch/arc/mm/tlb.c | 2 +-
3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/arc/include/asm/cacheflush.h b/arch/arc/include/asm/cacheflush.h
index fc662f4..716dba1 100644
--- a/arch/arc/include/asm/cacheflush.h
+++ b/arch/arc/include/asm/cacheflush.h
@@ -33,7 +33,8 @@
void flush_icache_range(unsigned long kstart, unsigned long kend);
void __sync_icache_dcache(phys_addr_t paddr, unsigned long vaddr, int len);
-void __inv_icache_page(phys_addr_t paddr, unsigned long vaddr);
+void __inv_icache_page(struct vm_area_struct *vma,
+ phys_addr_t paddr, unsigned long vaddr);
void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr);
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c
index 7d3e79b..e1ea57f 100644
--- a/arch/arc/mm/cache.c
+++ b/arch/arc/mm/cache.c
@@ -934,9 +934,17 @@ void __sync_icache_dcache(phys_addr_t paddr, unsigned long vaddr, int len)
}
/* wrapper to compile time eliminate alignment checks in flush loop */
-void __inv_icache_page(phys_addr_t paddr, unsigned long vaddr)
+void __inv_icache_page(struct vm_area_struct *vma,
+ phys_addr_t paddr, unsigned long vaddr)
{
- __ic_line_inv_vaddr(paddr, vaddr, PAGE_SIZE);
+ struct ic_inv_args ic_inv = {
+ .paddr = paddr,
+ .vaddr = vaddr,
+ .sz = PAGE_SIZE
+ };
+
+ on_each_cpu_mask(mm_cpumask(vma->vm_mm),
+ __ic_line_inv_vaddr_helper, &ic_inv, 1);
}
/*
diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c
index c5e70d8..a095608 100644
--- a/arch/arc/mm/tlb.c
+++ b/arch/arc/mm/tlb.c
@@ -626,7 +626,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr_unaligned,
/* invalidate any existing icache lines (U-mapping) */
if (vma->vm_flags & VM_EXEC)
- __inv_icache_page(paddr, vaddr);
+ __inv_icache_page(vma, paddr, vaddr);
}
}
}
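
For readers following the thread, the difference between broadcasting and targeting only the CPUs in mm_cpumask() can be modelled in plain user-space C. This is only a rough sketch, not kernel code: every name ending in _sim is hypothetical, the cpumask is reduced to a bitmask in an unsigned int, and the "IPI" is just a direct function call that bumps a per-CPU counter.

```c
#define NR_CPUS_SIM 8	/* hypothetical CPU count for this model */

/* Same shape as the ic_inv_args initialised in the patch (assumed layout). */
struct ic_inv_args_sim {
	unsigned long paddr;
	unsigned long vaddr;
	int sz;
};

/* How many invalidate requests each simulated CPU has handled. */
static int inv_count[NR_CPUS_SIM];

/* Stand-in for "which CPU is running the callback right now". */
static int current_cpu_sim;

/* Stand-in for __ic_line_inv_vaddr_helper: only records that it ran. */
static void ic_inv_helper_sim(void *info)
{
	(void)info;
	inv_count[current_cpu_sim]++;
}

/*
 * Stand-in for on_each_cpu_mask(): run fn once for every CPU whose bit
 * is set in mask and return the number of CPUs that were reached.
 */
static int on_each_cpu_mask_sim(unsigned int mask, void (*fn)(void *), void *info)
{
	int cpu, reached = 0;

	for (cpu = 0; cpu < NR_CPUS_SIM; cpu++) {
		if (mask & (1u << cpu)) {
			current_cpu_sim = cpu;
			fn(info);
			reached++;
		}
	}
	return reached;
}
```

In this model, a mask with every bit set stands for the broadcast case and reaches all NR_CPUS_SIM CPUs, while a mask with only the bits of the CPUs sharing the mm reaches just those. Whether that reduction translates into a measurable gain on real hardware is exactly what the performance numbers requested above would show.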