mm->oom_disable_count is broken

From: Oleg Nesterov
Date: Sat Jul 30 2011 - 11:25:32 EST


On 07/30, Oleg Nesterov wrote:
>
> So I'd suggest this hack^Wpatch as a quick fix. I still think this
> code is not right, but the patch tries to keep the current logic.

And this reminds me. mm->oom_disable_count looks absolutely broken.
IIRC, I already complained but nobody replied.

What does this counter mean?

/* How many tasks sharing this mm are OOM_DISABLE */
atomic_t oom_disable_count;

tasks? processes or threads?

Lets look at oom_adjust_write(),

if (task->signal->oom_adj == OOM_DISABLE)
atomic_inc(&task->mm->oom_disable_count);

OK, so it is per-process. No matter how many threads this process
has, mm->oom_disable_count becomes 1 (ignoring CLONE_VM).


However, exit_mm() does:

if (tsk->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
atomic_dec(&mm->oom_disable_count);

but this is per-thread! it becomes zero and then negative after
pthread_exit().


copy_process()->copy_mm() seems to think it is per-thread too. But,
bad_fork_cleanup_mm:

if (p->mm) {
task_lock(p);
if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
atomic_dec(&p->mm->oom_disable_count);
task_unlock(p);
mmput(p->mm);
}

Why do we take task_lock() ? OK, oom_score_adj_write() does task_lock()
too, but this can't help in the multithreaded case? Why copy_mm() checks
->oom_score_adj lockless?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/