Re: [PATCH v3 29/30] luo: allow preserving memfd
From: Pasha Tatashin
Date: Wed Aug 13 2025 - 09:54:27 EST
On Wed, Aug 13, 2025 at 12:29 PM Pratyush Yadav <pratyush@xxxxxxxxxx> wrote:
>
> Hi Vipin,
>
> Thanks for the review.
>
> On Tue, Aug 12 2025, Vipin Sharma wrote:
>
> > On 2025-08-07 01:44:35, Pasha Tatashin wrote:
> >> From: Pratyush Yadav <ptyadav@xxxxxxxxx>
> >> +static void memfd_luo_unpreserve_folios(const struct memfd_luo_preserved_folio *pfolios,
> >> + unsigned int nr_folios)
> >> +{
> >> + unsigned int i;
> >> +
> >> + for (i = 0; i < nr_folios; i++) {
> >> + const struct memfd_luo_preserved_folio *pfolio = &pfolios[i];
> >> + struct folio *folio;
> >> +
> >> + if (!pfolio->foliodesc)
> >> + continue;
> >> +
> >> + folio = pfn_folio(PRESERVED_FOLIO_PFN(pfolio->foliodesc));
> >> +
> >> + kho_unpreserve_folio(folio);
> >
> > This one is missing WARN_ON_ONCE() similar to the one in
> > memfd_luo_preserve_folios().
>
> Right, will add.
>
> >
> >> + unpin_folio(folio);
>
> Looking at this code caught my eye. This can also be called from LUO's
> finish callback if no one claimed the memfd after live update. In that
> case, unpin_folio() is going to underflow the pincount or refcount on
> the folio since after the kexec, the folio is no longer pinned. We
> should only be doing folio_put().
>
> I think this function should take a argument to specify which of these
> cases it is dealing with.
>
> >> + }
> >> +}
> >> +
> >> +static void *memfd_luo_create_fdt(unsigned long size)
> >> +{
> >> + unsigned int order = get_order(size);
> >> + struct folio *fdt_folio;
> >> + int err = 0;
> >> + void *fdt;
> >> +
> >> + if (order > MAX_PAGE_ORDER)
> >> + return NULL;
> >> +
> >> + fdt_folio = folio_alloc(GFP_KERNEL, order);
> >
> > __GFP_ZERO should also be used here. Otherwise this can lead to
> > unintentional passing of old kernel memory.
>
> fdt_create() zeroes out the buffer so this should not be a problem.
You are right, fdt_create() zeroes the whole buffer, however, I wonder
if it could be `optimized` to only clear only the header part of FDT,
not the rest and this could potentially lead us to send an FDT buffer
that contains both a valid FDT and the trailing bits contain data from
old kernel.
Pasha