Re: [patch v2] mm, oom: normalize oom scores to oom_score_adj scale only for userspace

From: David Rientjes
Date: Thu May 24 2012 - 02:02:09 EST


On Wed, 23 May 2012, Andrew Morton wrote:

> > @@ -367,12 +354,13 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
> > }
> >
> > points = oom_badness(p, memcg, nodemask, totalpages);
> > - if (points > *ppoints) {
> > + if (points > chosen_points) {
> > chosen = p;
> > - *ppoints = points;
> > + chosen_points = points;
> > }
> > } while_each_thread(g, p);
> >
> > + *ppoints = chosen_points * 1000 / totalpages;
> > return chosen;
> > }
> >
>
> It's still not obvious that we always avoid the divide-by-zero here.
> If there's some weird way of convincing constrained_alloc() to look at
> an empty nodemask, or a nodemask which covers only empty nodes then
> blam.
>
> Now, it's probably the case that this is a can't-happen but that
> guarantee would be pretty convoluted and fragile?
>

It can only happen for a memcg with a zero limit. I tried to prevent that
in a different patch by not allowing tasks to be attached to memcgs with
such a limit, but you didn't like that :)

So I fixed it in this patch with this:

@@ -572,7 +560,7 @@ void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
}

check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL);
- limit = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT;
+ limit = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1;
read_lock(&tasklist_lock);
p = select_bad_process(&points, limit, memcg, NULL, false);
if (p && PTR_ERR(p) != -1UL)
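
For illustration only, here is a minimal userspace sketch of the two pieces
involved: clamping the page count to at least 1, and the normalization that
divides by it. This is not the kernel code. PAGE_SHIFT is assumed to be 12,
and limit_to_pages() and normalize_points() are hypothetical helper names
that do not exist in the tree. The hunk above uses the GNU "a ?: b"
shorthand; since >> binds tighter than ?:, it parses as
(limit >> PAGE_SHIFT) ?: 1. The sketch spells that out in standard C.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Hypothetical helper: convert a byte limit to a page count that is
 * never zero, so it is always safe to use as a divisor.
 */
static unsigned long limit_to_pages(unsigned long long limit_bytes)
{
	unsigned long pages = (unsigned long)(limit_bytes >> PAGE_SHIFT);

	/* Portable spelling of the GNU extension "pages ?: 1". */
	return pages ? pages : 1;
}

/*
 * Hypothetical helper mirroring the normalization in the quoted hunk:
 *	*ppoints = chosen_points * 1000 / totalpages;
 * The caller must guarantee totalpages != 0.
 */
static unsigned int normalize_points(unsigned long chosen_points,
				     unsigned long totalpages)
{
	assert(totalpages != 0);
	return (unsigned int)(chosen_points * 1000 / totalpages);
}
```

With a zero-byte limit, limit_to_pages() returns 1 instead of 0, so
normalize_points(points, limit_to_pages(0)) divides by 1 and cannot fault.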

Cpusets do not allow threads to be attached without a set of mems, nor do
they allow the final mem in a cpuset to be removed while tasks are still
attached. The page allocator certainly wouldn't be calling the oom killer
for a set of zones that span no pages.

Any suggestion on where to put the check for !totalpages so it's easier to
understand?