Re: [PATCH mmotm] mm/munlock: mlock_vma_folio() check against VM_SPECIAL

From: Matthew Wilcox
Date: Thu Mar 03 2022 - 09:24:00 EST


On Wed, Mar 02, 2022 at 05:35:30PM -0800, Hugh Dickins wrote:
> Although mmap_region() and mlock_fixup() take care that VM_LOCKED
> is never left set on a VM_SPECIAL vma, there is an interval while
> file->f_op->mmap() is using vm_insert_page(s), when VM_LOCKED may
> still be set while VM_SPECIAL bits are added: so mlock_vma_folio()
> should ignore VM_LOCKED while any VM_SPECIAL bits are set.
>
> This showed up as a "Bad page" still mlocked, when vfree()ing pages
> which had been vm_inserted by remap_vmalloc_range_partial(): while
> release_pages() and __page_cache_release(), and so put_page(), catch
> pages still mlocked when freeing (and clear_page_mlock() caught them
> when unmapping), the vfree() path is unprepared for them: that could be
> fixed, but these pages should not have been mlocked in the first place.
>
> I assume that an mlockall(MCL_FUTURE) had been done in the past; or
> maybe the user got to specify MAP_LOCKED on a vmalloc'ing driver mmap.
>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> ---
> Diffed against top of next-20220301 or mmotm 2022-02-28-14-45.
> This patch really belongs as a fix to the mm/munlock series in
> Matthew's tree, so he might like to take it in there (but the patch
> here is the foliated version, so easiest to place it after foliation).

It looks like it fixes "mm/munlock: mlock_pte_range() when mlocking or
munlocking", so I'll fold it into that patch?

> mm/internal.h | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -421,8 +421,15 @@ extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
>  static inline void mlock_vma_folio(struct folio *folio,
>  			struct vm_area_struct *vma, bool compound)
>  {
> -	/* VM_IO check prevents migration from double-counting during mlock */
> -	if (unlikely((vma->vm_flags & (VM_LOCKED|VM_IO)) == VM_LOCKED) &&
> +	/*
> +	 * The VM_SPECIAL check here serves two purposes.
> +	 * 1) VM_IO check prevents migration from double-counting during mlock.
> +	 * 2) Although mmap_region() and mlock_fixup() take care that VM_LOCKED
> +	 *    is never left set on a VM_SPECIAL vma, there is an interval while
> +	 *    file->f_op->mmap() is using vm_insert_page(s), when VM_LOCKED may
> +	 *    still be set while VM_SPECIAL bits are added: so ignore it then.
> +	 */
> +	if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED) &&
>  	    (compound || !folio_test_large(folio)))
>  		mlock_folio(folio);
>  }
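
For anyone wanting to see the window concretely, here is a minimal sketch of
a vmalloc'ing driver ->mmap of the kind the report implies.  The driver, its
struct and field names are made up purely for illustration (the report only
says the pages were vm_inserted by remap_vmalloc_range_partial()); the flag
behaviour described in the comments is the stock vm_insert_page() /
remap_vmalloc_range() behaviour.

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>

/* Hypothetical device: exposes a vmalloc'ed buffer to userspace via mmap. */
struct exampledev {
	void *buf;	/* vmalloc'ed, page-aligned, at least vma-sized */
};

/*
 * Hypothetical ->mmap.  On entry, mmap_region() may already have set
 * VM_LOCKED on the vma (an mlockall(MCL_FUTURE) in the past, or MAP_LOCKED);
 * it only strips VM_LOCKED from VM_SPECIAL vmas after this callback returns.
 */
static int exampledev_mmap(struct file *file, struct vm_area_struct *vma)
{
	struct exampledev *dev = file->private_data;

	/*
	 * remap_vmalloc_range() inserts the vmalloc pages one by one with
	 * vm_insert_page(), which marks the vma VM_MIXEDMAP before the first
	 * insertion, and it sets VM_DONTEXPAND | VM_DONTDUMP once all pages
	 * are in.  VM_MIXEDMAP and VM_DONTEXPAND are VM_SPECIAL bits, but
	 * VM_LOCKED is still set throughout this interval, so the rmap calls
	 * made for each inserted page reach mlock_vma_folio().  The old
	 * (VM_LOCKED|VM_IO) test passed here and mlocked the vmalloc pages,
	 * which vfree() later reported as "Bad page"; with the VM_SPECIAL
	 * test, VM_MIXEDMAP makes mlock_vma_folio() skip them.
	 */
	return remap_vmalloc_range(vma, dev->buf, 0);
}

With the patch above, the mlock is skipped as soon as vm_insert_page() has
marked the vma VM_MIXEDMAP, which matches the "so ignore it then" in the new
comment.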