operation ordering during pgd_alloc/pgd_free

From: Jan Beulich
Date: Thu Jun 05 2008 - 04:51:43 EST


At present, pgd_ctor() adds a new pgd to pgd_list solely based on
!SHARED_KERNEL_PMD. For PAE && !SHARED_KERNEL_PMD (i.e. Xen)
this doesn't seem correct, as the pgd is still empty, which will confuse
vmalloc_sync_all(). So in this case, list insertion should only happen at
the end of pgd_prepopulate_pmd().

Likewise, pgd_free() calls pgd_mop_up_pmds() *before* pgd_dtor(),
with the former zeroing pgd entries as it goes and only the latter
removing the pgd from the list. Just as above this can confuse
vmalloc_sync_all(), so here I would think that the two calls should just
be swapped. However, if they get swapped, careful inspection of the
interaction with save/restore will be needed - XenSource's Linux tree
has a comment specifically to that effect:

/*
* After this the pgd should not be pinned for the duration of this
* function's execution. We should never sleep and thus never race:
* 1. User pmds will not become write-protected under our feet due
* to a concurrent mm_pin_all().
* 2. The machine addresses in PGD entries will not become invalid
* due to a concurrent save/restore.
*/

Since that tree doesn't support preemption, this is perhaps fine, but
likely going to cause problems in the (preemptable) pv-ops code.

The issue with vmalloc_sync_all() would even go unnoticed, since the
patch to unify the pgd_list mechanism with x86-64 removed the
BUG_ON() that was meant to trigger on issues like this.

But maybe I'm missing something?

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/