Re: [PATCH v3 01/11] pagemap: Introduce ->memory_failure()

From: Dan Williams
Date: Mon Mar 08 2021 - 00:26:29 EST


On Sun, Mar 7, 2021 at 7:38 PM ruansy.fnst@xxxxxxxxxxx
<ruansy.fnst@xxxxxxxxxxx> wrote:
>
> > On Mon, Feb 8, 2021 at 2:55 AM Shiyang Ruan <ruansy.fnst@xxxxxxxxxxxxxx> wrote:
> > >
> > > When a memory failure occurs, we call this function, which is implemented
> > > by each kind of device. For the fsdax case, the pmem device driver
> > > implements it. The pmem driver finds the block device that contains the
> > > error page and tries to get the filesystem on that block device. It then
> > > calls the filesystem handler to deal with the error. The filesystem will
> > > try to recover the corrupted data if possible.
> > >
> > > Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxxxxx>
> > > ---
> > > include/linux/memremap.h | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> > > index 79c49e7f5c30..0bcf2b1e20bd 100644
> > > --- a/include/linux/memremap.h
> > > +++ b/include/linux/memremap.h
> > > @@ -87,6 +87,14 @@ struct dev_pagemap_ops {
> > > * the page back to a CPU accessible page.
> > > */
> > > vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
> > > +
> > > + /*
> > > + * Handle a memory failure that happens on one page. Notify the processes
> > > + * that are using this page, and try to recover the data on this page
> > > + * if necessary.
> > > + */
> > > + int (*memory_failure)(struct dev_pagemap *pgmap, unsigned long pfn,
> > > + int flags);
> > > };
> >
> > After the conversation with Dave I don't see the point of this. If
> > there is a memory_failure() on a page, why not just call
> > memory_failure()? That already knows how to find the inode and the
> > filesystem can be notified from there.
>
> We want memory_failure() to support reflinked files. In this case, we are not
> able to track multiple files from a page (the broken page) because
> page->mapping and page->index can only track one file. Thus, I introduce this
> ->memory_failure() op, implemented in the pmem driver, to call
> ->corrupted_range() level by level up the stack, and finally find the files
> that are using (mmapping) this page.
>

I know the motivation, but this implementation seems backwards. It's
already the case that memory_failure() looks up the address_space
associated with a mapping. From there I would expect a new 'struct
address_space_operations' op to let the fs handle the case when there
are multiple address_spaces associated with a given file.
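
A rough userspace sketch of that direction follows. It is not kernel code
and nothing in it is proposed API: the ->memory_failure() member on 'struct
address_space_operations', its signature, the toy_fs reverse map, and
mock_memory_failure() are all hypothetical stand-ins. It only illustrates
the call direction, where the generic handler starts from the mapping it
already has and lets the fs fan out to every file sharing the page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct address_space;

/* Stand-in for the kernel op table; only the hypothetical hook is modelled. */
struct address_space_operations {
	/*
	 * Hypothetical op (name and signature are assumptions): the fs
	 * handles a failed pfn and returns the number of files notified.
	 */
	int (*memory_failure)(struct address_space *mapping,
			      unsigned long pfn, int flags);
};

struct address_space {
	const struct address_space_operations *a_ops;
	void *fs_private;	/* toy back-pointer to fs state */
};

struct page {
	struct address_space *mapping;	/* can name only one owner */
	unsigned long index;
};

/* Toy reflink-aware fs: a reverse map from one pfn to its sharing inodes. */
#define TOY_MAX_SHARERS 4
struct toy_fs {
	unsigned long shared_pfn;
	int nr_sharers;
	int inode_nums[TOY_MAX_SHARERS];
};

static int toy_fs_memory_failure(struct address_space *mapping,
				 unsigned long pfn, int flags)
{
	struct toy_fs *fs = mapping->fs_private;
	int i, notified = 0;

	(void)flags;
	assert(fs->nr_sharers <= TOY_MAX_SHARERS);
	if (pfn != fs->shared_pfn)
		return 0;	/* pfn is not tracked by this toy fs */

	/* The fs, not the driver, walks every file that shares the page. */
	for (i = 0; i < fs->nr_sharers; i++) {
		printf("notify inode %d about pfn %lu\n",
		       fs->inode_nums[i], pfn);
		notified++;
	}
	return notified;
}

static const struct address_space_operations toy_aops = {
	.memory_failure = toy_fs_memory_failure,
};

/* Mock of the generic handler: start from page->mapping, defer to the fs. */
static int mock_memory_failure(struct page *page, unsigned long pfn, int flags)
{
	struct address_space *mapping = page->mapping;

	if (mapping && mapping->a_ops && mapping->a_ops->memory_failure)
		return mapping->a_ops->memory_failure(mapping, pfn, flags);
	return -1;	/* no fs hook: not handled in this sketch */
}

/* Helper: one page whose pfn 42 is shared by two reflinked inodes. */
static int toy_report(unsigned long pfn)
{
	struct toy_fs fs = {
		.shared_pfn = 42,
		.nr_sharers = 2,
		.inode_nums = { 100, 101 },
	};
	struct address_space mapping = { .a_ops = &toy_aops, .fs_private = &fs };
	struct page page = { .mapping = &mapping, .index = 0 };

	return mock_memory_failure(&page, pfn, 0);
}
```

In this toy setup, toy_report(42) reaches both sharing inodes through the
single mapping recorded in the page, without the driver having to locate
the block device or the filesystem first.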