Re: [PATCH 26/31] sched, numa, mm: Add fault driven placement and migration policy
From: Ingo Molnar
Date: Fri Oct 26 2012 - 09:50:21 EST
* Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> [
> task_numa_work() performance side note:
>
> We are also *very* close to being able to use down_read() instead
> of down_write() in the sampling-unmap code in
> task_numa_work(), as it should be safe in theory to call
> change_protection(PROT_NONE) in parallel - but there's one
> regression that disagrees with this theory so we use
> down_write() at the moment.
>
> Maybe you could help us there: can you see a reason why the
> change_prot_none()->change_protection() call in
> task_numa_work() can not occur in parallel to a page fault in
> another thread on another CPU? It should be safe - yet if we
> change it I can see occasional corruption of user-space state:
> segfaults and register corruption.
> ]
Oh, just found the reason:
the ptep_modify_prot_start()/modify()/commit() sequence is
SMP-unsafe - it has to be done with the mmap_sem write-locked.
It is safe against *hardware* updates to the PTE, but not safe
against itself.
This is apparently a hidden cost of paravirt: it forces that
weird sequence and thus the down_write() ...
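
To make "not safe against itself" concrete, here is a rough user-space
sketch of one way two updaters could collide. It is only a model, not
kernel code: the pte_t typedef, the PTE_* macros and the helper names
are made up for illustration, and it assumes that the start step
atomically fetches the PTE and leaves it cleared while the commit step
is a plain store. The interleaving is replayed by hand in one thread so
the outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t pte_t;              /* hypothetical stand-in, not the kernel type */

#define PTE_PRESENT  ((pte_t)0x1)
#define PTE_PFN(x)   ((pte_t)(x) << 12)
#define PTE_INITIAL  (PTE_PFN(0x1234) | PTE_PRESENT)

/* Assumed model of "start": fetch the PTE atomically and leave it cleared. */
static pte_t modify_prot_start(_Atomic pte_t *ptep)
{
    return atomic_exchange(ptep, (pte_t)0);
}

/* Assumed model of "commit": store the updated value back. */
static void modify_prot_commit(_Atomic pte_t *ptep, pte_t pte)
{
    atomic_store(ptep, pte);
}

/* Two updaters A and B overlap on the same PTE (the down_read() case). */
static pte_t replay_race(void)
{
    _Atomic pte_t pte = PTE_INITIAL;

    pte_t a = modify_prot_start(&pte);   /* A sees the real PTE, slot is now 0 */
    pte_t b = modify_prot_start(&pte);   /* B sees only the transient cleared slot */
    assert(b == 0);

    a &= ~PTE_PRESENT;                   /* A: drop the present bit */
    b &= ~PTE_PRESENT;                   /* B: same change, applied to 0 */

    modify_prot_commit(&pte, a);         /* A writes the correct new value */
    modify_prot_commit(&pte, b);         /* B overwrites it with 0: PFN is lost */

    return atomic_load(&pte);
}

/* A finishes before B starts (what the write lock guarantees). */
static pte_t run_serialized(void)
{
    _Atomic pte_t pte = PTE_INITIAL;

    pte_t a = modify_prot_start(&pte);
    a &= ~PTE_PRESENT;
    modify_prot_commit(&pte, a);

    pte_t b = modify_prot_start(&pte);
    b &= ~PTE_PRESENT;
    modify_prot_commit(&pte, b);

    return atomic_load(&pte);
}
```

In this model the overlapping run ends with an all-zero PTE, while the
serialized run keeps the page frame number and only loses the present
bit. Each step is atomic on its own, yet the three-step sequence as a
whole is not, which is the kind of failure that down_write() rules out.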
Thanks,
Ingo