Hm, do you recall what processors that might affect? As far as I know,
current processors will ignore non-present top-level entries. Anyway,
we can point them not present to empty_zero_page, so testing the present
bit will still be sufficient to tell if we need to allocate a new pmd,
but if the hardware decides to follow the page reference there's no harm
done. (Hm, unless the hardware decides it wants to set A or D bits in
empty_zero_page for some reason...)
What earlier CPU's did was to basically load all four values into the CPU when you loaded %cr3. There was no "three-level page table walker" at all: it was still a two-level page table walker, there were just for magic internal page tables that were indexed off the two high bits.
That just means we need to reload cr3 after populating the pgd with a
new pmd, right?