Re: [PATCH] i386: PAE entries must have their low word cleared first

From: Hugh Dickins
Date: Wed Apr 26 2006 - 11:57:48 EST


On Wed, 26 Apr 2006, Zachary Amsden wrote:
> Keir Fraser wrote:
> >
> > We cannot use pte_clear() unless we redefine it for PAE. Currently it
> > reduces to set_pte() which explicitly uses the wrong ordering (sets high
> > *then* low, because it's normally used to introduce a mapping).
>
> Yes, that means pte_clear() has the same bug. Why don't we redefine pte_clear
> for PAE? Then the ifdef can go away.

Good, we agree.

> > Also, I think wmb() should be used in PAE's set_pte(), rather than the
> > current smp_wmb(). They reduce to the same thing, but wmb() makes it clear
> > this is is not an issue specific to SMP systems.
>
> I agree with that. The problem is not related to SMP at all. I would propose
> this approach.

Attachment included...

> Proposed fix for ptep_get_and_clear_full PAE bug. Pte_clear had the same
> bug, so use the same fix for both.

You need to expand that comment with text from Jan & Keir's patch:
Andrew will want to know just what this bug was. The patch looks
good to me, except for the unnecessary do { } while (0) in the
definition of pte_clear: ah, you're only copying what was already
there, can't blame you for that, let's not worry about it.

Hugh

> Signed-off-by: Zachary Amsden <zach@xxxxxxxxxx>
>
> Index: linux-2.6.17-rc/include/asm-i386/pgtable.h
> ===================================================================
> --- linux-2.6.17-rc.orig/include/asm-i386/pgtable.h 2006-04-25 15:41:06.000000000 -0700
> +++ linux-2.6.17-rc/include/asm-i386/pgtable.h 2006-04-26 08:39:42.000000000 -0700
> @@ -204,7 +204,6 @@ extern unsigned long long __PAGE_KERNEL,
> extern unsigned long pg0[];
>
> #define pte_present(x) ((x).pte_low & (_PAGE_PRESENT | _PAGE_PROTNONE))
> -#define pte_clear(mm,addr,xp) do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
>
> /* To avoid harmful races, pmd_none(x) should check only the lower when PAE */
> #define pmd_none(x) (!(unsigned long)pmd_val(x))
> @@ -268,7 +267,7 @@ static inline pte_t ptep_get_and_clear_f
> pte_t pte;
> if (full) {
> pte = *ptep;
> - *ptep = __pte(0);
> + pte_clear(mm, addr, ptep);
> } else {
> pte = ptep_get_and_clear(mm, addr, ptep);
> }
> Index: linux-2.6.17-rc/include/asm-i386/pgtable-3level.h
> ===================================================================
> --- linux-2.6.17-rc.orig/include/asm-i386/pgtable-3level.h 2006-04-25 15:41:06.000000000 -0700
> +++ linux-2.6.17-rc/include/asm-i386/pgtable-3level.h 2006-04-26 08:38:57.000000000 -0700
> @@ -85,6 +85,13 @@ static inline void pud_clear (pud_t * pu
> #define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
> pmd_index(address))
>
> +static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> +{
> + ptep->pte_low = 0;
> + wmb();
> + ptep->pte_high = 0;
> +}
> +
> static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> {
> pte_t res;
> Index: linux-2.6.17-rc/include/asm-i386/pgtable-2level.h
> ===================================================================
> --- linux-2.6.17-rc.orig/include/asm-i386/pgtable-2level.h 2006-04-25 15:41:06.000000000 -0700
> +++ linux-2.6.17-rc/include/asm-i386/pgtable-2level.h 2006-04-26 08:37:34.000000000 -0700
> @@ -18,6 +18,8 @@
> #define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
> #define set_pmd(pmdptr, pmdval) (*(pmdptr) = (pmdval))
>
> +#define pte_clear(mm,addr,xp) do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
> +
> #define ptep_get_and_clear(mm,addr,xp) __pte(xchg(&(xp)->pte_low, 0))
> #define pte_same(a, b) ((a).pte_low == (b).pte_low)
> #define pte_page(x) pfn_to_page(pte_pfn(x))
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/