Re: [RFC PATCH 1/3] sched: fix exit_mm vs membarrier (v3)

From: Peter Zijlstra
Date: Wed Oct 07 2020 - 10:30:03 EST


On Thu, Sep 24, 2020 at 01:25:06PM -0400, Mathieu Desnoyers wrote:
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 733e80f334e7..0767a2dbf245 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -475,7 +475,19 @@ static void exit_mm(void)
> BUG_ON(mm != current->active_mm);
> /* more a memory barrier than a real lock */
> task_lock(current);
> + /*
> + * When a thread stops operating on an address space, the loop
> + * in membarrier_private_expedited() may not observe that
> + * tsk->mm, and the loop in membarrier_global_expedited() may
> + * not observe a MEMBARRIER_STATE_GLOBAL_EXPEDITED
> + * rq->membarrier_state, so those would not issue an IPI.
> + * Membarrier requires a memory barrier after accessing
> + * user-space memory, before clearing tsk->mm or the
> + * rq->membarrier_state.
> + */
> + smp_mb__after_spinlock();
> current->mm = NULL;
> + membarrier_update_current_mm(NULL);
> mmap_read_unlock(mm);
> enter_lazy_tlb(mm, current);
> task_unlock(current);

This site seems to be lacking in IRQ disabling. As proposed it will
explode on RT.

Something like so to match kthread_unuse_mm().

--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -486,11 +486,13 @@ static void exit_mm(void)
* rq->membarrier_state.
*/
smp_mb__after_spinlock();
+ local_irq_disable()
current->mm = NULL;
membarrier_update_current_mm(NULL);
- mmap_read_unlock(mm);
enter_lazy_tlb(mm, current);
+ local_irq_enable();
task_unlock(current);
+ mmap_read_unlock(mm);
mm_update_next_owner(mm);
mmput(mm);
if (test_thread_flag(TIF_MEMDIE))