Re: preempt bug in set_pmd_pfn?

From: Jeremy Fitzhardinge
Date: Wed Mar 05 2008 - 11:54:20 EST


Hugh Dickins wrote:
Please, Ingo, could you give an example of where such additional locking
is actually necessary?

I ask because I went over those places when splitting the page_table_lock
for userspace in 2.6.15. Some things took init_mm.page_table_lock and
some things didn't, and I concluded that actually none of them needed it.

With the userspace pagetables, we need to guard against racing threads
and vmscan/rmap. But with the kernel pagetables, we'd already be in
serious trouble if two cpus could be modifying the same pte at the
same time - there needs to be other serialization already e.g. vmalloc
has its own locking for parcelling out areas to different uses, so down
at the page table level there should be no conflict.

Allocation of new page tables, yes, that needs locking, and does use
the page_table_lock for kernel space just as for user space.

That was all two years ago, I may have been wrong then, or a lot may
have changed since. But I've heard of a grand total of 0 problems
from not having such locking.

It's not obvious to me what problems might arise, but the Xen set_pte operations do currently rely on preemption being inhibited. At worst I can add preempt_disable/enable calls to them.

And on the original topic of flush TLB without preemption disabled:
again I'm not sure there's a bug there, but it's less clear. Aren't
some of those __flush_tlb_ones just unnecessary, we're simply filling
a previously empty slot? And if there's a guarantee that preemption
will itself involve a TLB flush (maybe there's no such guarantee for
these kernel entries, it's quite a different case from the userspace
one, and you'll be worrying about the global bit), if, then it'd be
okay to __flush_tlb_one without disabling preemption.

If a thread goes from processor A -> B -> A, where A is first preempted between a pagetable update and a tlb flush, then the second time the thread runs on A may run with a stale tlb (if in the meantime A has either been idle or only running kernel threads).

J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/