Re: [PATCH 2/5] kernel.h: Add non_block_start/end()

From: Jason Gunthorpe
Date: Thu Aug 15 2019 - 13:17:52 EST


On Thu, Aug 15, 2019 at 01:11:56PM -0400, Jerome Glisse wrote:
> On Thu, Aug 15, 2019 at 01:56:31PM -0300, Jason Gunthorpe wrote:
> > On Thu, Aug 15, 2019 at 06:00:41PM +0200, Michal Hocko wrote:
> >
> > > > AFAIK 'GFP_NOWAIT' is characterized by the lack of __GFP_FS and
> > > > __GFP_DIRECT_RECLAIM..
> > > >
> > > > This matches the existing test in __need_fs_reclaim() - so if you are
> > > > OK with GFP_NOFS, aka __GFP_IO which triggers try_to_compact_pages(),
> > > > allocations during OOM, then I think fs_reclaim already matches what
> > > > you described?
> > >
> > > No GFP_NOFS is equally bad. Please read my other email explaining what
> > > the oom_reaper actually requires. In short no blocking on direct or
> > > indirect dependecy on memory allocation that might sleep.
> >
> > It is much easier to follow with some hints on code, so the true
> > requirement is that the OOM repear not block on GFP_FS and GFP_IO
> > allocations, great, that constraint is now clear.
> >
> > > If you can express that in the existing lockdep machinery. All
> > > fine. But then consider deployments where lockdep is no-no because
> > > of the overhead.
> >
> > This is all for driver debugging. The point of lockdep is to find all
> > these paths without have to hit them as actual races, using debug
> > kernels.
> >
> > I don't think we need this kind of debugging on production kernels?
> >
> > > > The best we got was drivers tested the VA range and returned success
> > > > if they had no interest. Which is a big win to be sure, but it looks
> > > > like getting any more is not really posssible.
> > >
> > > And that is already a great win! Because many notifiers only do care
> > > about particular mappings. Please note that backing off unconditioanlly
> > > will simply cause that the oom reaper will have to back off not doing
> > > any tear down anything.
> >
> > Well, I'm working to propose that we do the VA range test under core
> > mmu notifier code that cannot block and then we simply remove the idea
> > of blockable from drivers using this new 'range notifier'.
> >
> > I think this pretty much solves the concern?
>
> I am not sure i follow what you propose here ? Like i pointed out in
> another email for GPU we do need to be able to sleep (we might get
> lucky and not need too but this is runtime thing) within notifier
> range_start callback. This has been something allow by notifier since
> it has been introduced in the kernel.

Sorry, I mean remove the idea of the blockable flag from the
drivers. Drivers will always be able to block, within the existing
limitation of fs_reclaim

Jason