Re: [RFC] oom-kill: give the dying task a higher priority

From: KOSAKI Motohiro
Date: Fri May 28 2010 - 02:38:33 EST


> * KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> [2010-05-28 13:46:53]:
>
> > > * Luis Claudio R. Goncalves <lclaudio@xxxxxxxx> [2010-05-28 00:51:47]:
> > >
> > > > @@ -382,6 +382,8 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
> > > > */
> > > > static void __oom_kill_task(struct task_struct *p, int verbose)
> > > > {
> > > > + struct sched_param param;
> > > > +
> > > > if (is_global_init(p)) {
> > > > WARN_ON(1);
> > > > printk(KERN_WARNING "tried to kill init!\n");
> > > > @@ -413,8 +415,9 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
> > > > */
> > > > p->rt.time_slice = HZ;
> > > > set_tsk_thread_flag(p, TIF_MEMDIE);
> > > > -
> > > > force_sig(SIGKILL, p);
> > > > + param.sched_priority = MAX_RT_PRIO-1;
> > > > + sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
> > > > }
> > > >
> > >
> > > I would like to understand the visible benefits of this patch. Have
> > > you seen an OOM-killed task really get bogged down? Should this task
> > > really be competing with other important tasks for run time?
> >
> > What do you mean by important? Until the OOM victim task exits completely, the system has no memory.
> > None of the important tasks can do anything.
> >
> > In almost all kernel subsystems, an automatic priority boost is a really bad idea because
> > it may break RT tasks' deterministic behavior. But OOM is one of the exceptions. The determinism
> > was already broken by memory starvation.
> >
>
> I am still not convinced, especially if we are running under mem
> cgroup. Even setting SCHED_FIFO does not help, you could have other
> things like cpusets that might restrict the CPUs you can run on, or
> any other policy and we could end up contending anyway with other
> SCHED_FIFO tasks.

Ah, you are right. I had missed mem-cgroup.
But I think memcg OOM also doesn't need the following two boosts. Can we get rid of them?

	p->rt.time_slice = HZ;
	set_tsk_thread_flag(p, TIF_MEMDIE);


I mean we perhaps need to distinguish global OOM from memcg OOM.
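
One possible reading of that distinction, sketched as a small userspace model
rather than kernel code. Everything prefixed model_/MODEL_ is hypothetical and
exists only for illustration: struct model_task stands in for the few
task_struct fields touched above, the memdie flag stands in for TIF_MEMDIE, and
the memcg_oom argument is an assumed way of telling the two cases apart. The
sketch applies the time slice, TIF_MEMDIE and SCHED_FIFO boosts only for a
global OOM and just delivers the kill for a memcg OOM.

```c
/*
 * Userspace model only: none of these types or helpers are kernel APIs.
 * It sketches gating the OOM victim boosts on the kind of OOM.
 */
#include <stdbool.h>

#define MODEL_HZ          1000  /* assumed tick rate, stands in for HZ */
#define MODEL_MAX_RT_PRIO 100   /* mirrors the kernel's MAX_RT_PRIO */

enum model_policy { MODEL_SCHED_NORMAL, MODEL_SCHED_FIFO };

struct model_task {
	enum model_policy policy;
	int rt_priority;
	unsigned int time_slice;
	bool memdie;            /* stands in for TIF_MEMDIE */
	bool killed;            /* stands in for a pending SIGKILL */
};

/* Hypothetical helper: kill the victim, boosting it only for a global OOM. */
static void model_oom_kill_task(struct model_task *p, bool memcg_oom)
{
	if (!memcg_oom) {
		/* Global OOM: the whole system is waiting for this task to exit. */
		p->time_slice = MODEL_HZ;
		p->memdie = true;
		p->policy = MODEL_SCHED_FIFO;
		p->rt_priority = MODEL_MAX_RT_PRIO - 1;
	}
	/* In both cases the victim still gets killed. */
	p->killed = true;
}
```

Whether a memcg OOM victim can really do without TIF_MEMDIE is exactly the open
question here; the model only shows where such a check would sit.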


> > That's the reason I acked it.
>
> If we could show faster recovery from OOM or anything else, I would be
> more convinced.





