Re: [PATCH v15 1/2] Reorganize the oom report in dump_header

From: Michal Hocko
Date: Thu Nov 22 2018 - 08:38:11 EST


On Wed 21-11-18 19:29:58, ufo19890607@xxxxxxxxx wrote:
> From: yuzhoujian <yuzhoujian@xxxxxxxxxxxxxxx>
>
> The OOM report contains several sections. The first one is the allocation
> context that has triggered the OOM. Then we have cpuset context
> followed by the stack trace of the OOM path. The third one is the OOM
> memory information, followed by the current memory state of all system
> tasks. Finally, we show oom eligible tasks and the information
> about the chosen oom victim.
>
> One thing that makes parsing more awkward than necessary is that we do
> not have a single and easily parsable line about the oom context. This
> patch reorganizes the oom report into:
> 1) who invoked oom and what was the allocation request
> [ 515.902945] tuned invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
>
> 2) OOM stack trace
> [ 515.904273] CPU: 24 PID: 1809 Comm: tuned Not tainted 4.20.0-rc3+ #3
> [ 515.905518] Hardware name: Inspur SA5212M4/YZMB-00370-107, BIOS 4.1.10 11/14/2016
> [ 515.906821] Call Trace:
> [ 515.908062] dump_stack+0x5a/0x73
> [ 515.909311] dump_header+0x55/0x28c
> [ 515.914260] oom_kill_process+0x2d8/0x300
> [ 515.916708] out_of_memory+0x145/0x4a0
> [ 515.917932] __alloc_pages_slowpath+0x7d2/0xa16
> [ 515.919157] __alloc_pages_nodemask+0x277/0x290
> [ 515.920367] filemap_fault+0x3d0/0x6c0
> [ 515.921529] ? filemap_map_pages+0x2b8/0x420
> [ 515.922709] ext4_filemap_fault+0x2c/0x40 [ext4]
> [ 515.923884] __do_fault+0x20/0x80
> [ 515.925032] __handle_mm_fault+0xbc0/0xe80
> [ 515.926195] handle_mm_fault+0xfa/0x210
> [ 515.927357] __do_page_fault+0x233/0x4c0
> [ 515.928506] do_page_fault+0x32/0x140
> [ 515.929646] ? page_fault+0x8/0x30
> [ 515.930770] page_fault+0x1e/0x30
>
> 3) OOM memory information
> [ 515.958093] Mem-Info:
> [ 515.959647] active_anon:26501758 inactive_anon:1179809 isolated_anon:0
>  active_file:4402672 inactive_file:483963 isolated_file:1344
>  unevictable:0 dirty:4886753 writeback:0 unstable:0
>  slab_reclaimable:148442 slab_unreclaimable:18741
>  mapped:1347 shmem:1347 pagetables:58669 bounce:0
>  free:88663 free_pcp:0 free_cma:0
> ...
>
> 4) current memory state of all system tasks
> [ 516.079544] [ 744] 0 744 9211 1345 114688 82 0 systemd-journal
> [ 516.082034] [ 787] 0 787 31764 0 143360 92 0 lvmetad
> [ 516.084465] [ 792] 0 792 10930 1 110592 208 -1000 systemd-udevd
> [ 516.086865] [ 1199] 0 1199 13866 0 131072 112 -1000 auditd
> [ 516.089190] [ 1222] 0 1222 31990 1 110592 157 0 smartd
> [ 516.091477] [ 1225] 0 1225 4864 85 81920 43 0 irqbalance
> [ 516.093712] [ 1226] 0 1226 52612 0 258048 426 0 abrtd
> [ 516.112128] [ 1280] 0 1280 109774 55 299008 400 0 NetworkManager
> [ 516.113998] [ 1295] 0 1295 28817 37 69632 24 0 ksmtuned
> [ 516.144596] [ 10718] 0 10718 2622484 1721372 15998976 267219 0 panic
> [ 516.145792] [ 10719] 0 10719 2622484 1164767 9818112 53576 0 panic
> [ 516.146977] [ 10720] 0 10720 2622484 1174361 9904128 53709 0 panic
> [ 516.148163] [ 10721] 0 10721 2622484 1209070 10194944 54824 0 panic
> [ 516.149329] [ 10722] 0 10722 2622484 1745799 14774272 91138 0 panic
>
> 5) oom context (constraints and the chosen victim).
> oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,task=panic,pid=10737,uid=0
>
> An admin can now get the full oom context from a single line, which
> makes parsing much easier.
>
> Signed-off-by: yuzhoujian <yuzhoujian@xxxxxxxxxxxxxxx>

Looks good, finally
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
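
To illustrate the parsing benefit claimed in the changelog, here is a
minimal userspace sketch that splits the one-line oom context from
section 5 into key/value pairs. It is not part of the patch: the helper
name parse_oom_kill is hypothetical, and the field names are taken only
from the sample line quoted above. It assumes that a comma-separated item
without '=' continues the previous value (as a node list such as
mems_allowed=0,2 would), and that the task name contains no ",key="
sequence of its own.

```python
def parse_oom_kill(line):
    """Split a one-line 'oom-kill:' report into a dict of strings.

    Hypothetical helper for illustration; returns None when the line
    carries no 'oom-kill:' marker.
    """
    marker = "oom-kill:"
    start = line.find(marker)
    if start < 0:
        return None

    fields = {}
    last_key = None
    for item in line[start + len(marker):].strip().split(","):
        key, sep, value = item.partition("=")
        if sep:
            # Regular key=value pair.
            fields[key] = value
            last_key = key
        elif last_key is not None:
            # Assumption: a bare item continues a comma-separated list
            # value, e.g. mems_allowed=0,2.
            fields[last_key] += "," + item
    return fields


sample = ("oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,"
          "mems_allowed=0-1,task=panic,pid=10737,uid=0")
info = parse_oom_kill(sample)
print(info["task"], info["pid"], info["constraint"])
```

All values are kept as strings; a caller that needs the pid or uid as
integers would convert them explicitly.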
--
Michal Hocko
SUSE Labs