Re: [RFC 0/4] OOM vs PM freezer fixes

From: Michal Hocko
Date: Tue Nov 18 2014 - 16:08:41 EST


On Fri 14-11-14 15:14:19, Tejun Heo wrote:
> On Wed, Nov 12, 2014 at 07:58:48PM +0100, Michal Hocko wrote:
> > Hi,
> > here is another take at OOM vs. PM freezer interaction fixes/cleanups.
> > The first three patches fix unlikely cases where the OOM killer races
> > with the PM freezer; these races should finally be closed completely.
> > The last patch is a simple code enhancement which is not strictly
> > needed, but it is nice to have IMO.
> >
> > Both the OOM killer and the PM freezer are quite subtle, so I hope I
> > haven't missed anything. Any feedback is highly appreciated. I am also
> > interested in feedback on the approach used. To be honest, I am not
> > really happy about spreading TIF_MEMDIE checks into the freezer (patch 1)
> > but I didn't find any other way of detecting OOM killed tasks.
>
> I really don't get why this is structured this way. Can't you just do
> the following?

Well, I liked how simple this was and that it was localized at the only
place which matters. When I was thinking about the solution you are
describing below, it was more complicated and more subtle (e.g. waiting
for an OOM victim might be tricky if it stumbles over a lock which is held
by a frozen thread which uses try_to_freeze_unsafe). Anyway, I gave it
another try and will post the two patches as a reply to this email. I hope
both the interface and the implementation are cleaner.

> 1. Freeze all freezables. Don't worry about PF_MEMDIE.
>
> 2. Disable OOM killer. This should be contained in the OOM killer
> proper. Lock out the OOM killer and disable it.
>
> 3. At this point, we know that no one will create more freezable
> threads and no new process will be OOM killed. Wait till there's
> no process w/ PF_MEMDIE set.
>
> There's no reason to lock out or disable OOM killer while the system
> is not in the quiescent state, which is a big can of worms. Bring
> down the system to the quiescent state, disable the OOM killer and
> then drain PF_MEMDIEs.
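
The three steps quoted above can be sketched as a small single-threaded
userspace model. This is only an illustration, not kernel code: every name
here (struct task, struct system_model, oom_kill, freeze_all,
oom_killer_disable, drain_victims, quiesce) is hypothetical, the memdie
field merely stands in for the OOM-victim flag, and the model assumes that
a victim can always be thawed and run to exit, which is exactly the part
that may be tricky in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct task {
	bool freezable;
	bool frozen;
	bool memdie;	/* stands in for the OOM-victim flag */
	bool exited;
};

struct system_model {
	struct task *tasks;
	size_t ntasks;
	bool oom_disabled;
};

/* Mark a task as an OOM victim; refused once the killer is disabled. */
static bool oom_kill(struct system_model *sys, size_t i)
{
	assert(i < sys->ntasks);
	if (sys->oom_disabled || sys->tasks[i].exited)
		return false;
	sys->tasks[i].memdie = true;
	return true;
}

/* Count victims that have been selected but have not exited yet. */
static size_t pending_victims(const struct system_model *sys)
{
	size_t n = 0;

	for (size_t i = 0; i < sys->ntasks; i++)
		if (sys->tasks[i].memdie && !sys->tasks[i].exited)
			n++;
	return n;
}

/* Step 1: freeze all freezables, without looking at the victim flag. */
static void freeze_all(struct system_model *sys)
{
	for (size_t i = 0; i < sys->ntasks; i++)
		if (sys->tasks[i].freezable && !sys->tasks[i].exited)
			sys->tasks[i].frozen = true;
}

/* Step 2: lock out the OOM killer; no new victims after this point. */
static void oom_killer_disable(struct system_model *sys)
{
	sys->oom_disabled = true;
}

/*
 * Step 3: drain the remaining victims. Model assumption: each victim is
 * thawed and exits immediately; a real victim may block on a lock.
 */
static size_t drain_victims(struct system_model *sys)
{
	size_t drained = 0;

	for (size_t i = 0; i < sys->ntasks; i++) {
		struct task *t = &sys->tasks[i];

		if (t->memdie && !t->exited) {
			t->frozen = false;
			t->exited = true;
			t->memdie = false;
			drained++;
		}
	}
	return drained;
}

/* The whole sequence in the order proposed: freeze, disable, drain. */
static size_t quiesce(struct system_model *sys)
{
	freeze_all(sys);
	oom_killer_disable(sys);
	return drain_victims(sys);
}
```

Calling quiesce() on a model with one pending victim returns 1, and any
later oom_kill() is refused because the killer is already disabled.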

--
Michal Hocko
SUSE Labs