Re: [PATCH RFC 01/10] rcu: Make CONFIG_SRCU unconditionally enabled

From: Dave Chinner
Date: Wed Aug 08 2018 - 17:31:32 EST


On Wed, Aug 08, 2018 at 12:27:34PM +0200, Michal Hocko wrote:
> [CC Josh - the whole series is
> http://lkml.kernel.org/r/153365347929.19074.12509495712735843805.stgit@xxxxxxxxxxxxxxxxxxxxx]
>
> On Wed 08-08-18 13:17:44, Kirill Tkhai wrote:
> > On 08.08.2018 10:20, Michal Hocko wrote:
> > > On Tue 07-08-18 18:37:36, Kirill Tkhai wrote:
> > >> This patch kills all CONFIG_SRCU defines and
> > >> the code under !CONFIG_SRCU.
> > >
> > > The last time somebody tried to do this there was a pushback due to
> > > kernel tinyfication. So this should really give some numbers about the
> > > code size increase. Also why can't we make this depend on MMU. Is
> > > anybody else than the reclaim asking for unconditional SRCU usage?
> >
> > I don't know one. The size numbers (sparc64) are:
> >
> > $ size image.srcu.disabled
> > text data bss dec hex filename
> > 5117546 8030506 1968104 15116156 e6a77c image.srcu.disabled
> > $ size image.srcu.enabled
> > text data bss dec hex filename
> > 5126175 8064346 1968104 15158625 e74d61 image.srcu.enabled
> > The difference is: 15158625-15116156 = 42469 ~41Kb
> >
> > Please, see the measurement details to my answer to Stephen.
> >
> > > Btw. I totaly agree with Steven. This is a very poor changelog. It is
> > > trivial to see what the patch does but it is far from clear why it is
> > > doing that and why we cannot go other ways.
> > We possibly can go another way, and there is comment to [2/10] about this.
> > Percpu rwsem may be used instead, the only thing, it is worse, is it will
> > make shrink_slab() wait unregistering shrinkers, while srcu-based
> > implementation does not require this.
>
> Well, if unregisterring doesn't do anything subtle - e.g. an allocation
> or take locks which depend on allocation - and we can guarantee that
> then blocking shrink_slab shouldn't be a big deal.

unregister_shrinker() already blocks shrink_slab - taking a rwsem in
write mode blocks all readers - so using a per-cpu rwsem doesn't
introduce anything new or unexpected. I'd like to see numbers of the
different methods before anything else.

IMO, the big deal is that the split unregister mechanism seems to
imply superblock shrinkers can be called during sb teardown or
/after/ the filesystem has been torn down in memory (i.e. after
->put_super() is called). That's a change of behaviour, but it's
left to the filesystem to detect and handle that condition. That's
exceedingly subtle and looks like a recipe for disaster to me. I
note that XFS hasn't been updated to detect and avoid this landmine.

And, FWIW, filesystems with multiple shrinkers (e.g. XFS as 3 per
mount) will take the SCRU penalty multiple times during unmount, and
potentially be exposed to multiple different "use during/after
teardown" race conditions.

> It is subtle though.
> Maybe subtle enough to make unconditional SRCU worth it. This all should
> be in the changelog though.

IMO, we've had enough recent bugs to deal with from shrinkers being
called before the filesystem is set up and from trying to handle
allocation errors during setup. Do we really want to make shrinker
shutdown just as prone to mismanagement and subtle, hard to hit
bugs? I don't think we do - unmount is simply not a critical
performance path.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx