Re: [v6 08/15] mm: zero struct pages during initialization

From: Michal Hocko
Date: Fri Aug 11 2017 - 08:50:55 EST


On Mon 07-08-17 16:38:42, Pavel Tatashin wrote:
> Add struct page zeroing as a part of initialization of other fields in
> __init_single_page().

I believe this deserves much more detailed explanation why this is safe.
What actually prevents any pfn walker from seeing an uninitialized
struct page? Please make your assumptions explicit in the commit log so
that we can check them independently.

Also this is done with some purpose which is the perfmance, right? You
have mentioned that in the cover letter but if somebody is going to read
through git logs this wouldn't be obvious from the specific commit.
So add that information here as well. Especially numbers will be
interesting.

As a sidenote, this will need some more followups for memory hotplug
after my recent changes which are not merged yet but I will take care of
that.

> Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> Reviewed-by: Steven Sistare <steven.sistare@xxxxxxxxxx>
> Reviewed-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
> Reviewed-by: Bob Picco <bob.picco@xxxxxxxxxx>

After the relevant information is added feel free add
Acked-by: Michal Hocko <mhocko@xxxxxxxx>

> ---
> include/linux/mm.h | 9 +++++++++
> mm/page_alloc.c | 1 +
> 2 files changed, 10 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 46b9ac5e8569..183ac5e733db 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -93,6 +93,15 @@ extern int mmap_rnd_compat_bits __read_mostly;
> #define mm_forbids_zeropage(X) (0)
> #endif
>
> +/*
> + * On some architectures it is expensive to call memset() for small sizes.
> + * Those architectures should provide their own implementation of "struct page"
> + * zeroing by defining this macro in <asm/pgtable.h>.
> + */
> +#ifndef mm_zero_struct_page
> +#define mm_zero_struct_page(pp) ((void)memset((pp), 0, sizeof(struct page)))
> +#endif
> +
> /*
> * Default maximum number of active map areas, this limits the number of vmas
> * per mm struct. Users can overwrite this number by sysctl but there is a
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 983de0a8047b..4d32c1fa4c6c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1168,6 +1168,7 @@ static void free_one_page(struct zone *zone,
> static void __meminit __init_single_page(struct page *page, unsigned long pfn,
> unsigned long zone, int nid)
> {
> + mm_zero_struct_page(page);
> set_page_links(page, zone, nid, pfn);
> init_page_count(page);
> page_mapcount_reset(page);
> --
> 2.14.0

--
Michal Hocko
SUSE Labs