Re: Misleading OOM messages

From: David Rientjes
Date: Thu May 14 2009 - 17:12:43 EST


On Thu, 14 May 2009, Dave Hansen wrote:

> On Thu, 2009-05-14 at 13:49 -0700, David Rientjes wrote:
> > I think switching all the oom killer messages to be "no available memory"
> > as it is in the MPOL_BIND case would be the best alternative. We
> > currently use "out of memory" even for cpusets, for example, when it
> > happens because it cannot accommodate any more hardwall allocations while
> > there may be an abundance of memory elsewhere that it cannot access. I
> > also think "no available memory" makes more sense than "out of memory"
> > when describing situations where we're at or below the minimum watermarks
> > for all allowable zones. Either that or "no allowable memory".
>
> How about something like this to start? We can "mv mm/oom_kill.c
> mm/nam_kill.c" later. ;)
>

I don't think it's that easy; I think we need to indicate what the set of
available memory is. For instance, we need to show the nodes that
MPOL_BIND is allowed to access, what the hard limit for the memory cgroup
is, what current->mems_allowed is for cpuset ooms, why we can't allocate
in ZONE_DMA or ZONE_DMA32 because of lowmem_reserve_ratio, etc.
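
As a rough illustration of the idea, here is a minimal userspace sketch (not
kernel code, and not taken from any patch in this thread) of a message that
names the constraint and the set of memory it applies to. The enum, struct,
and function names are all hypothetical, and the message wording is only an
assumption about what such output might look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cause codes, one per scenario listed above. */
enum oom_cause {
	OOM_CAUSE_NONE,		/* all allowable zones at or below min watermarks */
	OOM_CAUSE_MPOL_BIND,	/* nodes allowed by MPOL_BIND are exhausted */
	OOM_CAUSE_CPUSET,	/* current->mems_allowed is exhausted */
	OOM_CAUSE_MEMCG,	/* memory cgroup hard limit reached */
	OOM_CAUSE_LOWMEM,	/* lower zones held back by lowmem_reserve_ratio */
};

/* Hypothetical report: the cause plus whatever describes the allowed set. */
struct oom_report {
	enum oom_cause cause;
	const char *nodes;	/* node list as text, e.g. "0-1" (illustrative) */
	unsigned long limit_kb;	/* hard limit, used only for OOM_CAUSE_MEMCG */
};

/* Format a one-line description into buf; returns snprintf()'s result. */
int oom_describe(const struct oom_report *r, char *buf, size_t len)
{
	assert(r != NULL && buf != NULL);

	switch (r->cause) {
	case OOM_CAUSE_MPOL_BIND:
		return snprintf(buf, len,
				"no available memory: MPOL_BIND nodes %s",
				r->nodes);
	case OOM_CAUSE_CPUSET:
		return snprintf(buf, len,
				"no available memory: cpuset mems_allowed %s",
				r->nodes);
	case OOM_CAUSE_MEMCG:
		return snprintf(buf, len,
				"no available memory: memory cgroup limit %lukB",
				r->limit_kb);
	case OOM_CAUSE_LOWMEM:
		return snprintf(buf, len,
				"no available memory: lower zones held by lowmem_reserve_ratio");
	case OOM_CAUSE_NONE:
	default:
		return snprintf(buf, len,
				"no available memory: all allowable zones at or below min watermarks");
	}
}
```

The point of the sketch is only that each cause carries its own description
of the allowed set, rather than sharing one generic string.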

oom_kill_allocating_task, for instance, can be hit in any of the above
scenarios, but it doesn't show the cause of the oom like the others do, even
with your patch. It simply shows that
/proc/sys/vm/oom_kill_allocating_task was set to avoid the expensive
tasklist scan.