Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

From: Michal Hocko
Date: Tue Jul 09 2013 - 09:10:11 EST


On Tue 09-07-13 15:08:08, Michal Hocko wrote:
> On Tue 09-07-13 15:00:17, Michal Hocko wrote:
> > On Mon 24-06-13 16:13:45, Johannes Weiner wrote:
> > > Hi guys,
> > >
> > > On Sat, Jun 22, 2013 at 10:09:58PM +0200, azurIt wrote:
> > > > >> But i'm sure of one thing - when problem occurs, nothing is able to
> > > > >> access hard drives (every process which tries it is frozen until
> > > > >> problem is resolved or server is rebooted).
> > > > >
> > > > > >It would be really interesting to see what those tasks are blocked on.
> > > >
> > > > I'm trying to get it, stay tuned :)
> > > >
> > > > Today i noticed one bug, not 100% sure it is related to 'your' patch
> > > > but i haven't seen this before. I noticed that i have lots of cgroups
> > > > which cannot be removed - if i do 'rmdir <cgroup_directory>', it
> > > > just hangs and never completes. Even more, it's not possible to
> > > > access the whole cgroup filesystem until i kill that rmdir
> > > > (anything which tries it just hangs). All unremovable cgroups have
> > > > this in 'memory.oom_control': oom_kill_disable 0 under_oom 1
> > >
> > > Somebody acquires the OOM wait reference to the memcg and marks it
> > > under oom but then does not call into mem_cgroup_oom_synchronize() to
> > > clean up. That's why under_oom is set and the rmdir waits for
> > > outstanding references.
> > >
> > > > And, yes, 'tasks' file is empty.
> > >
> > > It's not a kernel thread that does it because all kernel-context
> > > handle_mm_fault() are annotated properly, which means the task must be
> > > userspace and, since tasks is empty, have exited before synchronizing.
> >
> > Yes, well spotted. I have missed that while reviewing your patch.
> > The follow up fix looks correct.
>
> Hmm, I guess you wanted to remove !(fault & VM_FAULT_ERROR) test as well
> otherwise the else BUG() path would be unreachable and we wouldn't know
> that something fishy is going on.

No, scratch it! We need it for VM_FAULT_RETRY. Sorry about the noise.
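
To spell out the reasoning, here is a minimal userspace sketch, not the
actual patch. The flag values, the reduced VM_FAULT_ERROR mask (the kernel's
mask also covers the HWPOISON bits), and the names classify_fault() and
check_retry_handling() are assumptions made up for illustration. It only
shows that a fault code carrying VM_FAULT_RETRY alone has no error bit set,
so without the early !(fault & VM_FAULT_ERROR) return it would fall through
to the BUG() branch.

```c
#include <assert.h>

/* Illustrative values only (assumption); see include/linux/mm.h for the real ones. */
#define VM_FAULT_OOM    0x0001u
#define VM_FAULT_SIGBUS 0x0002u
#define VM_FAULT_RETRY  0x0400u
/* Simplified mask (assumption): the real one also includes the HWPOISON bits. */
#define VM_FAULT_ERROR  (VM_FAULT_OOM | VM_FAULT_SIGBUS)

enum outcome { FAULT_NOT_ERROR, FAULT_OOM, FAULT_SIGBUS, FAULT_BUG };

/*
 * Hypothetical stand-in for an arch fault-error handler.
 * keep_error_test selects whether the early-return test is present.
 */
enum outcome classify_fault(unsigned int fault, int keep_error_test)
{
	/* Non-error results such as VM_FAULT_RETRY leave here. */
	if (keep_error_test && !(fault & VM_FAULT_ERROR))
		return FAULT_NOT_ERROR;

	if (fault & VM_FAULT_OOM)
		return FAULT_OOM;
	if (fault & VM_FAULT_SIGBUS)
		return FAULT_SIGBUS;

	/* Stands in for the else BUG() path. */
	return FAULT_BUG;
}

/* Checks the two cases discussed above. */
void check_retry_handling(void)
{
	/* With the test, a retry is not treated as an error. */
	assert(classify_fault(VM_FAULT_RETRY, 1) == FAULT_NOT_ERROR);
	/* Without it, a retry would hit the BUG() stand-in. */
	assert(classify_fault(VM_FAULT_RETRY, 0) == FAULT_BUG);
}
```

In this reduced model the BUG() stand-in is indeed unreachable while the
test is in place, which is the trade-off being accepted here.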

--
Michal Hocko
SUSE Labs