Re: [RFC] oom-kill: give the dying task a higher priority

From: Minchan Kim
Date: Sun May 30 2010 - 11:09:22 EST


On Fri, May 28, 2010 at 01:48:26PM -0300, Luis Claudio R. Goncalves wrote:
> On Sat, May 29, 2010 at 12:45:49AM +0900, Minchan Kim wrote:
> | On Fri, May 28, 2010 at 12:28:42PM -0300, Luis Claudio R. Goncalves wrote:
> | > On Sat, May 29, 2010 at 12:12:49AM +0900, Minchan Kim wrote:
> ...
> | > | I think the highest RT priority isn't a good solution.
> | > | As I mentioned, some RT functions don't want to be preempted by other processes
> | > | that cause memory pressure. That would break the RT task.
> | >
> | > For the RT case, if you reached a system OOM situation, your determinism has
> | > already been hurt. If the memcg OOM happens in the same memcg your RT task
> | > is in - which will probably be the case most of the time - again, the determinism
> | > has deteriorated. For both these cases, giving the dying task SCHED_FIFO
> | > MAX_RT_PRIO-1 means a faster recovery.
> |
> | What I want to say is that determinism has no relation to OOM.
> | Why should an RT task be affected by another process's OOM?
> |
> | Of course, if the system has no memory, it is likely to slow down the RT task.
> | But that is only speculation. If a task that gets scheduled is just about to
> | exit, we don't need to raise the OOM-killed task's priority.
> |
> | But raising it to the minimum RT priority, as in your patch, is what I wanted.
> | It doesn't preempt any RT task.
> |
> | So until now, I have made noise about your patch.
> | Really, sorry for that.
> | I don't have any objection to the priority-raising part from now on.
>
> This is the third version of the patch, factoring in your input along with
> Peter's comment. Basically the same patch, but using the lowest RT priority
> to boost the dying task.
>
> Thanks again for reviewing and commenting.
> Luis
>
> oom-killer: give the dying task rt priority (v3)
>
> Give the dying task RT priority so that it can be scheduled quickly and die,
> freeing needed memory.
>
> Signed-off-by: Luis Claudio R. Gonçalves <lgoncalv@xxxxxxxxxx>
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 84bbba2..2b0204f 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -266,6 +266,8 @@ static struct task_struct *select_bad_process(unsigned long *ppoints)
> */
> static void __oom_kill_task(struct task_struct *p, int verbose)
> {
> + struct sched_param param;
> +
> if (is_global_init(p)) {
> WARN_ON(1);
> printk(KERN_WARNING "tried to kill init!\n");
> @@ -288,6 +290,8 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
> * exit() and clear out its resources quickly...
> */
> p->time_slice = HZ;
> + param.sched_priority = MAX_RT_PRIO-10;

I still can't understand your point.
Why did you set the priority to "MAX_RT_PRIO - 10"?
What Peter and I mentioned was "1", which is the lowest RT priority.
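
To make the difference concrete, here is a minimal sketch in plain userspace C
(not kernel code). It assumes MAX_RT_PRIO is 100, that valid SCHED_FIFO
priorities are 1..MAX_RT_PRIO-1, and that a larger sched_priority wins.
rt_levels_preempted() is a hypothetical helper invented for illustration. It
counts how many RT priority levels a boosted task could preempt:

```c
#include <assert.h>

/* Assumption: mirrors the kernel constant; valid RT priorities are 1..99. */
#define MAX_RT_PRIO 100

/*
 * Hypothetical helper: number of RT priority levels that a task boosted to
 * `boosted` would preempt. A SCHED_FIFO task only preempts tasks of strictly
 * lower priority; tasks at the same level are not preempted.
 */
int rt_levels_preempted(int boosted)
{
    int other;
    int count = 0;

    /* Reject values outside the valid RT priority range. */
    assert(boosted >= 1 && boosted < MAX_RT_PRIO);

    for (other = 1; other < MAX_RT_PRIO; other++) {
        if (boosted > other)
            count++;
    }
    return count;
}
```

Under those assumptions, rt_levels_preempted(1) is 0, while
rt_levels_preempted(MAX_RT_PRIO - 10) is 89, so the value in the patch sits
above most of the RT range instead of below it.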

> + sched_setscheduler(p, SCHED_FIFO, &param);

Why did you replace sched_setscheduler_nocheck with sched_setscheduler?
It means you can't boost the priority if the current context doesn't have permission.
Is that your intention?
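
The concern can be sketched as a toy model, again in userspace C and not the
kernel's implementation. The struct, its fields, and model_setscheduler() are
hypothetical. The rule shown (a caller without CAP_SYS_NICE is refused when the
requested RT priority exceeds the target's RLIMIT_RTPRIO) is a simplification
of the permission check and ignores the ownership test:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the context making the call. */
struct sched_ctx {
    bool caller_has_cap_sys_nice;  /* does the calling context hold CAP_SYS_NICE? */
    int target_rlimit_rtprio;      /* target's RLIMIT_RTPRIO; 0 means no RT allowed */
};

/*
 * Toy model, not kernel code:
 *   check == true  models the checked path (sched_setscheduler)
 *   check == false models the unchecked path (sched_setscheduler_nocheck)
 * Returns 0 on success or a negative errno value.
 */
int model_setscheduler(const struct sched_ctx *ctx, int rt_prio, bool check)
{
    assert(ctx != NULL);

    /* Valid RT priorities are assumed to be 1..99. */
    if (rt_prio < 1 || rt_prio > 99)
        return -EINVAL;

    /* Only the checked path looks at the caller's privileges. */
    if (check && !ctx->caller_has_cap_sys_nice &&
        rt_prio > ctx->target_rlimit_rtprio)
        return -EPERM;

    return 0;
}
```

In this model an unprivileged context that triggers the OOM kill gets -EPERM on
the checked path even for priority 1, while the nocheck path succeeds.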

> set_tsk_thread_flag(p, TIF_MEMDIE);
>
> force_sig(SIGKILL, p);
> --
> [ Luis Claudio R. Goncalves Bass - Gospel - RT ]
> [ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
>
--
Kind regards,
Minchan Kim