Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

From: Edward Chron
Date: Wed Aug 21 2019 - 19:12:33 EST


On Wed, Aug 21, 2019 at 12:47 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Wed 21-08-19 00:19:37, David Rientjes wrote:
> > On Wed, 21 Aug 2019, Michal Hocko wrote:
> >
> > > > vm.oom_dump_tasks is pretty useful, however, so it's curious why you
> > > > haven't left it enabled :/
> > >
> > > Because it generates a lot of output potentially. Think of a workload
> > > with too many tasks which is not uncommon.
> >
> > Probably better to always print all the info for the victim so we don't
> > need to duplicate everything between dump_tasks() and dump_oom_summary().
>
> I believe that the motivation was to have a one line summary that is already
> parsed by log consumers. And that is in __oom_kill_process one.
>

Yes the motivation was one line summary that the OOM Killed Process
message supplies along
with the fact it is error priority as I mentioned. It is a very
desirable place to put summarized
information.

> Also I do not think this patch improves things much for two reasons
> at leasts a) it doesn't really give you the whole list of killed tasks
> (this might be the whole memcg) and b) we already do have most important
> information in __oom_kill_process. If something is missing there I do
> not see a strong reason we cannot add it there. Like in this case.
>

This is a good point.

Additionally (which you know, but mentioning for reference) the OOM
output used to look like this:

Nov 14 15:23:48 oldserver kernel: [337631.991218] Out of memory: Kill
process 19961 (python) score 17 or sacrifice child
Nov 14 15:23:48 oldserver kernel: [337631.991237] Killed process 31357
(sh) total-vm:5400kB, anon-rss:252kB, file-rss:4kB, shmem-rss:0kB

It now looks like this with 5.3.0-rc5 (minus the oom_score_adj):

Jul 22 10:42:40 newserver kernel:
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-10383.slice/user@xxxxxxxxxxxxx,task=oomprocs,pid=3035,uid=10383
Jul 22 10:42:40 newserver kernel: Out of memory: Killed process 3035
(oomprocs) total-vm:1056800kB, anon-rss:8kB, file-rss:4kB,
shmem-rss:0kB
Jul 22 10:42:40 newserver kernel: oom_reaper: reaped process 3035
(oomprocs), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

The old output did explain that a oom_score of 17 must have either
tied for highest or was the highest.
This did document why OOM selected the process it did, even if ends up
killing the related sh process.

With the newer format that added constraint message, it does provide
uid which can be helpful and
the oom_reaper showing that the memory was reclaimed is certainly reassuring.

My understanding now is that printing the oom_score is discouraged.
This seems unfortunate. The oom_score_adj can be adjusted
appropriately if oom_score is known.
So It would be useful to have both.

But at least if oom_score_adj is printed you can confirm the value at
the time of the OOM event.

Thank-you,
-Edward Chron
Arista Networks

> > Edward, how about this?
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -420,11 +420,17 @@ static int dump_task(struct task_struct *p, void *arg)
> > * State information includes task's pid, uid, tgid, vm size, rss,
> > * pgtables_bytes, swapents, oom_score_adj value, and name.
> > */
> > -static void dump_tasks(struct oom_control *oc)
> > +static void dump_tasks(struct oom_control *oc, struct task_struct *victim)
> > {
> > pr_info("Tasks state (memory values in pages):\n");
> > pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
> >
> > + /* If vm.oom_dump_tasks is disabled, only show the victim */
> > + if (!sysctl_oom_dump_tasks) {
> > + dump_task(victim, oc);
> > + return;
> > + }
> > +
> > if (is_memcg_oom(oc))
> > mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
> > else {
> > @@ -465,8 +471,8 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
> > if (is_dump_unreclaim_slabs())
> > dump_unreclaimable_slab();
> > }
> > - if (sysctl_oom_dump_tasks)
> > - dump_tasks(oc);
> > + if (p || sysctl_oom_dump_tasks)
> > + dump_tasks(oc, p);
> > if (p)
> > dump_oom_summary(oc, p);
> > }
>
> --
> Michal Hocko
> SUSE Labs