Re: [PATCH v4 10/45] libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE

From: Alexander Potapenko
Date: Tue Jan 10 2023 - 03:56:35 EST


> > >
> > > >
> > > > -- >8 --
> > > > >From 693563817dea3fd8f293f9b69ec78066ab1d96d2 Mon Sep 17 00:00:00 2001
> > > > From: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > > Date: Thu, 5 Jan 2023 13:27:34 -0800
> > > > Subject: [PATCH] nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE
> > > >
> > > > Commit 6e9f05dc66f9 ("libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE")
> > > >
> > > > ...updated MAX_STRUCT_PAGE_SIZE to account for sizeof(struct page)
> > > > potentially doubling in the case of CONFIG_KMSAN=y. Unfortunately this
> > > > doubles the amount of capacity stolen from user addressable capacity for
> > > > everyone, regardless of whether they are using the debug option. Revert
> > > > that change, mandate that MAX_STRUCT_PAGE_SIZE never exceed 64, but
> > > > allow for debug scenarios to proceed with creating debug sized page maps
> > > > with a new 'libnvdimm.page_struct_override' module parameter.
> > > >
> > > > Note that this only applies to cases where the page map is permanent,
> > > > i.e. stored in a reservation of the pmem itself ("--map=dev" in "ndctl
> > > > create-namespace" terms). For the "--map=mem" case, since the allocation
> > > > is ephemeral for the lifespan of the namespace, there are no explicit
> > > > restriction. However, the implicit restriction, of having enough
> > > > available "System RAM" to store the page map for the typically large
> > > > pmem, still applies.
> > > >
> > > > Fixes: 6e9f05dc66f9 ("libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE")
> > > > Cc: <stable@xxxxxxxxxxxxxxx>
> > > > Cc: Alexander Potapenko <glider@xxxxxxxxxx>
> > > > Cc: Marco Elver <elver@xxxxxxxxxx>
> > > > Reported-by: Jeff Moyer <jmoyer@xxxxxxxxxx>
> > > > ---
> > > > drivers/nvdimm/nd.h | 2 +-
> > > > drivers/nvdimm/pfn_devs.c | 45 ++++++++++++++++++++++++++-------------
> > > > 2 files changed, 31 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
> > > > index 85ca5b4da3cf..ec5219680092 100644
> > > > --- a/drivers/nvdimm/nd.h
> > > > +++ b/drivers/nvdimm/nd.h
> > > > @@ -652,7 +652,7 @@ void devm_namespace_disable(struct device *dev,
> > > > struct nd_namespace_common *ndns);
> > > > #if IS_ENABLED(CONFIG_ND_CLAIM)
> > > > /* max struct page size independent of kernel config */
> > > > -#define MAX_STRUCT_PAGE_SIZE 128
> > > > +#define MAX_STRUCT_PAGE_SIZE 64
> > > > int nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap);
> > > > #else
> > > > static inline int nvdimm_setup_pfn(struct nd_pfn *nd_pfn,
> > > > diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> > > > index 61af072ac98f..978d63559c0e 100644
> > > > --- a/drivers/nvdimm/pfn_devs.c
> > > > +++ b/drivers/nvdimm/pfn_devs.c
> > > > @@ -13,6 +13,11 @@
> > > > #include "pfn.h"
> > > > #include "nd.h"
> > > >
> > > > +static bool page_struct_override;
> > > > +module_param(page_struct_override, bool, 0644);
> > > > +MODULE_PARM_DESC(page_struct_override,
> > > > + "Force namespace creation in the presence of mm-debug.");
> > >
> > > I can't figure out from this description what this is for so perhaps it
> > > should be either removed and made dynamic (if you know you want to debug
> > > the mm core, why not turn it on then?) or made more obvious what is
> > > happening?
> >
> > I'll kill it and update the KMSAN Documentation that KMSAN has
> > interactions with the NVDIMM subsystem that may cause some namespaces to
> > fail to enable. That Documentation needs to be a part of this patch
> > regardless as that would be the default behavior of this module
> > parameter.
> >
> > Unfortunately, it can not be dynamically enabled because the size of
> > 'struct page' is unfortunately recorded in the metadata of the device.
> > Recall this is for supporting platform configurations where the capacity
> > of the persistent memory exceeds or consumes too much of System RAM.
> > Consider 4TB of PMEM consumes 64GB of space just for 'struct page'. So,
> > NVDIMM subsystem has a mode to store that page array in a reservation on
> > the PMEM device itself.
>
> Sorry, I might be missing something, but why cannot we have
>
> #ifdef CONFIG_KMSAN
> #define MAX_STRUCT_PAGE_SIZE 128
By the way, KMSAN only adds 16 bytes to struct page - would it help to
reduce MAX_STRUCT_PAGE_SIZE to 80 bytes?
> #else
> #define MAX_STRUCT_PAGE_SIZE 64
> #endif
>