Re: [PATCH RFC 10/10] nvdimm/e820: add multiple namespaces support

From: Dan Williams
Date: Tue Feb 04 2020 - 16:43:51 EST


On Tue, Feb 4, 2020 at 10:20 AM Barret Rhoden <brho@xxxxxxxxxx> wrote:
>
> Hi -
>
> On 2/4/20 11:44 AM, Dan Williams wrote:
> > On Tue, Feb 4, 2020 at 7:30 AM Barret Rhoden <brho@xxxxxxxxxx> wrote:
> >>
> >> Hi -
> >>
> >> On 1/10/20 2:03 PM, Joao Martins wrote:
> >>> User can define regions with 'memmap=size!offset' which in turn
> >>> creates PMEM legacy devices. But because it is a label-less
> >>> NVDIMM device we only have one namespace for the whole device.
> >>>
> >>> Add support for multiple namespaces by adding ndctl control
> >>> support, and exposing a minimal set of features:
> >>> (ND_CMD_GET_CONFIG_SIZE, ND_CMD_GET_CONFIG_DATA,
> >>> ND_CMD_SET_CONFIG_DATA) alongside NDD_ALIASING because we can
> >>> store labels.
> >>
> >> FWIW, I like this a lot. If we move away from using memmap in favor of
> >> efi_fake_mem, ideally we'd have the same support for full-fledged
> >> pmem/dax regions and namespaces that this patch brings.
> >
> > No, efi_fake_mem only supports creating dax-regions. What's the use
> > case that can't be satisfied by just specifying multiple memmap=
> > ranges?
>
> I'd like to be able to create and destroy dax regions on the fly. In
> particular, I want to run guest VMs using the dax files for guest
> memory, but I don't know at boot time how many VMs I'll have, or what
> their sizes are. Ideally, I'd have separate files for each VM, instead
> of a single /dev/dax.
>
> I currently do this with fs-dax with one big memmap region (ext4 on
> /dev/pmem0), and I use the file system to handle the
> creation/destruction/resizing and metadata management. But since fs-dax
> won't work with device pass-through, I started looking at dev-dax, with
> the expectation that I'd need some software to manage the memory (i.e.
> allocation). That led me to ndctl, which seems to need namespace labels
> to have the level of control I was looking for.

Ah, got it, you only ended up wanting namespace labels because
there was no other way to carve up device-dax. That's changing as part
of the efi_fake_mem= enabling and I have a patch set in the works to
allow discontiguous sub-divisions of a device-dax range. Note that
this branch rebases frequently:

https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=libnvdimm-pending


>
> Thanks,
>
> Barret
>