Re: [PATCH v17 11/26] x86/mm: Update ptep_set_wrprotect() and pmdp_set_wrprotect() for transition from _PAGE_DIRTY to _PAGE_COW

From: Yu, Yu-cheng
Date: Mon Jan 25 2021 - 16:30:10 EST


On 1/25/2021 10:27 AM, Borislav Petkov wrote:
On Tue, Dec 29, 2020 at 01:30:38PM -0800, Yu-cheng Yu wrote:

[...]

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 666c25ab9564..1c84f1ba32b9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1226,6 +1226,32 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
static inline void ptep_set_wrprotect(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
{
+ /*
+ * Some processors can start a write, but end up seeing a read-only
+ * PTE by the time they get to the Dirty bit. In this case, they
+ * will set the Dirty bit, leaving a read-only, Dirty PTE which
+ * looks like a shadow stack PTE.
+ *
+ * However, this behavior has been improved

Improved how?

Processors supporting Shadow Stack will not set a read-only PTE's dirty bit. I will revise the comments.

and will not occur on
+ * processors supporting Shadow Stack. Without this guarantee, a

Which guarantee? That it won't happen on CPUs which support SHSTK?


Yes.

+ * transition to a non-present PTE and flush the TLB would be

s/flush the TLB/TLB flush/

+ * needed.
+ *
+ * When changing a writable PTE to read-only and if the PTE has
+ * _PAGE_DIRTY set, move that bit to _PAGE_COW so that the PTE is
+ * not a shadow stack PTE.
+ */

This sentence doesn't belong here as it refers to what pte_wrprotect()
does. You could expand the comment in pte_wrprotect() with this here as
it is better.

I will move this paragraph to pte_wrprotect().


+ if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
+ pte_t old_pte, new_pte;
+
+ do {
+ old_pte = READ_ONCE(*ptep);
+ new_pte = pte_wrprotect(old_pte);

Maybe I'm missing something but those two can happen outside of the
loop, no? Or is *ptep somehow changing concurrently while the loop is
doing the CMPXCHG and you need to recreate it each time?

IOW, you can generate upfront and do the empty loop...

*ptep can change concurrently.


+
+ } while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte));
+
+ return;
+ }
clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
}

[...]