[PATCH] mm: add describable comment for TLB batch race

From: Minchan Kim
Date: Sun Aug 13 2017 - 21:16:56 EST


[1] fixes a rather subtle and complicated bug that is hard to
understand from the limited code comment alone.

This patch adds a sequence diagram that, I hope, makes the problem
easier to understand.

[1] 99baac21e458, mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem

Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Nadav Amit <namit@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
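Not for the commit log: the interleaving in the diagram can be replayed
with a small userspace toy model. This is only a sketch under loud
assumptions: struct toy_mm, struct toy_gather and every toy_* function
are hypothetical names invented for illustration, not kernel code. The
model has a single PTE and one TLB entry per CPU, and it stands in for
mm_tlb_flush_nested() with a plain counter of open batches.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): one PTE, one TLB entry per CPU. */
struct toy_mm {
	bool pte_present;	/* page-table entry for the address 'a' */
	bool tlb_valid[2];	/* cached translation on CPU 0 and CPU 1 */
	int flush_pending;	/* number of open gather/finish batches */
};

struct toy_gather {
	struct toy_mm *mm;
	bool need_flush;	/* set only if this batch cleared a PTE */
};

static void toy_gather_mmu(struct toy_gather *tlb, struct toy_mm *mm)
{
	tlb->mm = mm;
	tlb->need_flush = false;
	mm->flush_pending++;
}

/* Zap step: clear the PTE if it is present, otherwise record nothing. */
static void toy_zap(struct toy_gather *tlb)
{
	if (tlb->mm->pte_present) {
		tlb->mm->pte_present = false;
		tlb->need_flush = true;
	}
}

/* detect_nested models forcing the flush when another batch is still open. */
static void toy_finish_mmu(struct toy_gather *tlb, bool detect_nested)
{
	bool force;

	assert(tlb->mm->flush_pending > 0);
	force = detect_nested && tlb->mm->flush_pending > 1;
	if (tlb->need_flush || force) {
		tlb->mm->tlb_valid[0] = false;
		tlb->mm->tlb_valid[1] = false;
	}
	tlb->mm->flush_pending--;
}

/*
 * Replay the diagram. Returns true if CPU 0 can still reach the page
 * through a stale TLB entry after its own MADV_DONTNEED has returned.
 */
bool toy_replay(bool detect_nested)
{
	/* "*a = 1" on CPU 0 has already populated CPU 0's TLB entry. */
	struct toy_mm mm = { .pte_present = true, .tlb_valid = { true, false } };
	struct toy_gather t0, t1;
	bool stale;

	toy_gather_mmu(&t1, &mm);		/* CPU 1 */
	toy_gather_mmu(&t0, &mm);		/* CPU 0 */
	toy_zap(&t1);				/* CPU 1 clears the PTE, defers flush */
	toy_zap(&t0);				/* CPU 0 finds the PTE is none */
	toy_finish_mmu(&t0, detect_nested);	/* CPU 0 */
	stale = mm.tlb_valid[0] && !mm.pte_present;	/* "*a = 2" on CPU 0 */
	toy_finish_mmu(&t1, detect_nested);	/* CPU 1 flushes, too late */
	return stale;
}
```

In this model toy_replay(false) reports the stale access, while
toy_replay(true) does not, because CPU 0 sees a second open batch and
flushes even though it cleared nothing itself.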
mm/memory.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index bcbe56f52163..f571b0eb9816 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -413,12 +413,37 @@ void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
void tlb_finish_mmu(struct mmu_gather *tlb,
unsigned long start, unsigned long end)
{
+
+
/*
* If there are parallel threads are doing PTE changes on same range
* under non-exclusive lock(e.g., mmap_sem read-side) but defer TLB
* flush by batching, a thread has stable TLB entry can fail to flush
* the TLB by observing pte_none|!pte_dirty, for example so flush TLB
* forcefully if we detect parallel PTE batching threads.
+ *
+ * Example: MADV_DONTNEED stale TLB problem on same range
+ *
+ *      CPU 0                                   CPU 1
+ *      *a = 1;
+ *                                              MADV_DONTNEED
+ *      MADV_DONTNEED                           tlb_gather_mmu
+ *      tlb_gather_mmu
+ *      down_read(mmap_sem)                     down_read(mmap_sem)
+ *                                              pte_lock
+ *                                              pte_get_and_clear
+ *                                              tlb_remove_tlb_entry
+ *                                              pte_unlock
+ *      pte_lock
+ *      found out the pte is none
+ *      pte_unlock
+ *      tlb_finish_mmu doesn't flush
+ *
+ *      Access the address with stale TLB
+ *      *a = 2; i.e., success without segfault
+ *                                              tlb_finish_mmu flush on range
+ *                                              but it is too late.
+ *
*/
bool force = mm_tlb_flush_nested(tlb->mm);

--
2.7.4