Re: [PATCH 1/2] mm/migrate_device.c: Copy pte dirty bit to page
From: huang ying
Date: Tue Aug 16 2022 - 02:44:22 EST
Hi, Alistair,
On Fri, Aug 12, 2022 at 1:23 PM Alistair Popple <apopple@xxxxxxxxxx> wrote:
>
> migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that
> installs migration entries directly if it can lock the migrating page.
> When removing a dirty pte the dirty bit is supposed to be carried over
> to the underlying page to prevent it being lost.
>
> Currently migrate_vma_*() can only be used for private anonymous
> mappings. That means loss of the dirty bit usually doesn't result in
> data loss because these pages are typically not file-backed. However
> pages may be backed by swap storage which can result in data loss if an
> attempt is made to migrate a dirty page that doesn't yet have the
> PageDirty flag set.
>
> In this case migration will fail due to unexpected references but the
> dirty pte bit will be lost. If the page is subsequently reclaimed data
> won't be written back to swap storage as it is considered uptodate,
> resulting in data loss if the page is subsequently accessed.
>
> Prevent this by copying the dirty bit to the page when removing the pte
> to match what try_to_migrate_one() does.
>
> Signed-off-by: Alistair Popple <apopple@xxxxxxxxxx>
> Reported-by: Peter Xu <peterx@xxxxxxxxxx>
> ---
> mm/migrate_device.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 27fb37d..d38f8a6 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -7,6 +7,7 @@
> #include <linux/export.h>
> #include <linux/memremap.h>
> #include <linux/migrate.h>
> +#include <linux/mm.h>
> #include <linux/mm_inline.h>
> #include <linux/mmu_notifier.h>
> #include <linux/oom.h>
> @@ -211,6 +212,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>
> migrate->cpages++;
>
> + /* Set the dirty flag on the folio now the pte is gone. */
> + if (pte_dirty(pte))
> + folio_mark_dirty(page_folio(page));
> +
I think that this isn't sufficient to fix all issues. Firstly, "pte"
is assigned at the begin of the loop, before the PTE is cleared via
ptep_clear_flush() or ptep_get_and_clear(). That is, the pte isn't
changed atomically. Between "pte" assignment and PTE clear, the PTE
may become dirty. That is, we need to update pte when we clear the
PTE.
And I don't know why we use ptep_get_and_clear() to clear PTE if
(!anon_exclusive). Why don't we need to flush the TLB?
Best Regards,
Huang, Ying
> /* Setup special migration page table entry */
> if (mpfn & MIGRATE_PFN_WRITE)
> entry = make_writable_migration_entry(
>
> base-commit: ffcf9c5700e49c0aee42dcba9a12ba21338e8136
> --
> git-series 0.9.1
>