Re: [PATCH] doc: document scope NOFS, NOIO APIs

From: Dave Chinner
Date: Sun May 27 2018 - 19:49:08 EST


On Fri, May 25, 2018 at 10:16:24AM +0200, Michal Hocko wrote:
> On Fri 25-05-18 08:17:15, Dave Chinner wrote:
> > On Thu, May 24, 2018 at 01:43:41PM +0200, Michal Hocko wrote:
> [...]
> > > +FS/IO code then simply calls the appropriate save function right at the
> > > +layer where a lock taken from the reclaim context (e.g. shrinker) and
> > > +the corresponding restore function when the lock is released. All that
> > > +ideally along with an explanation what is the reclaim context for easier
> > > +maintenance.
> >
> > This paragraph doesn't make much sense to me. I think you're trying
> > to say that we should call the appropriate save function "before
> > locks are taken that a reclaim context (e.g a shrinker) might
> > require access to."
> >
> > I think it's also worth making a note about recursive/nested
> > save/restore stacking, because it's not clear from this description
> > that this is allowed and will work as long as inner save/restore
> > calls are fully nested inside outer save/restore contexts.
>
> Any better?
>
> -FS/IO code then simply calls the appropriate save function right at the
> -layer where a lock taken from the reclaim context (e.g. shrinker) and
> -the corresponding restore function when the lock is released. All that
> -ideally along with an explanation what is the reclaim context for easier
> -maintenance.
> +FS/IO code then simply calls the appropriate save function before any
> +lock shared with the reclaim context is taken. The corresponding
> +restore function when the lock is released. All that ideally along with
> +an explanation what is the reclaim context for easier maintenance.
> +
> +Please note that the proper pairing of save/restore function allows nesting
> +so memalloc_noio_save is safe to be called from an existing NOIO or NOFS scope.

It's better, but the talk of this being necessary for locking makes
me cringe. XFS doesn't do it for locking reasons - it does it
largely for preventing transaction context nesting, which has all
sorts of problems that cause hangs (e.g. log space reservations
can't be filled) that aren't directly locking related.

i.e we should be talking about using these functions around contexts
where recursion back into the filesystem through reclaim is
problematic, not that "holding locks" is problematic. Locks can be
used as an example of a problematic context, but locks are not the
only recursion issue that require GFP_NOFS allocation contexts to
avoid.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx