[PATCH 1/1] mm: vmstat: Add OOM kill count in vmstat counter

From: Pintu Kumar
Date: Thu Oct 01 2015 - 07:03:27 EST


This patch adds two counters to /proc/vmstat: the number of times the
kernel entered the OOM path and the number of OOM kills.
They are helpful during sluggishness, aging or long-duration tests.
Currently, an OOM event is visible only in the kernel ring buffer.
During long-duration tests, however, dmesg and /var/log/messages* may
be overwritten.
So, just like other counters, the OOM counts can be maintained in
/proc/vmstat.
They remain visible even if all kernel logging is disabled.

A snapshot of the result of an overnight test is shown below:
$ cat /proc/vmstat
oom_stall 610
oom_kill_count 1763

Here, oom_stall shows that the kernel entered the OOM path 610 times,
while oom_kill_count shows that 1763 OOM kills were issued.
An OOM is bad for any system, so these counters can help developers
tune memory requirements, at least during initial bring-up.

Signed-off-by: Pintu Kumar <pintu.k@xxxxxxxxxxx>
---
include/linux/vm_event_item.h | 2 ++
mm/oom_kill.c | 2 ++
mm/page_alloc.c | 2 +-
mm/vmstat.c | 2 ++
4 files changed, 7 insertions(+), 1 deletion(-)
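
As an illustration of how the counters described in the changelog might be
consumed from userspace, here is a minimal sketch that looks up one counter
in /proc/vmstat-style text ("name value" per line). The helper names
vmstat_lookup() and vmstat_read() are hypothetical and not part of this
patch; the counter names are the ones the patch adds.

```c
/* Sketch only: userspace lookup of counters in /proc/vmstat-style text. */
#include <stdio.h>
#include <string.h>

/*
 * Find "name value" in a vmstat-style buffer (hypothetical helper).
 * Returns 0 and stores the value on success, -1 if the name is absent
 * or its value cannot be parsed.
 */
static int vmstat_lookup(const char *text, const char *name,
			 unsigned long long *value)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Match the whole counter name, followed by a space. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%llu", value) == 1 ? 0 : -1;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Read a file such as /proc/vmstat and look up one counter
 * (hypothetical helper). Returns -1 if the file cannot be opened.
 */
static int vmstat_read(const char *path, const char *name,
		       unsigned long long *value)
{
	static char buf[65536];	/* assumed large enough for /proc/vmstat */
	size_t n;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[n] = '\0';
	return vmstat_lookup(buf, name, value);
}
```

A test harness could call vmstat_read("/proc/vmstat", "oom_kill_count", &v)
before and after a run and compare the two values. On a kernel without this
patch the lookup simply returns -1.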

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 2b1cef8..ade0851 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -57,6 +57,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
#ifdef CONFIG_HUGETLB_PAGE
HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
#endif
+ OOM_STALL,
+ OOM_KILL_COUNT,
UNEVICTABLE_PGCULLED, /* culled to noreclaim list */
UNEVICTABLE_PGSCANNED, /* scanned for reclaimability */
UNEVICTABLE_PGRESCUED, /* rescued from noreclaim list */
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 03b612b..e79caed 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -570,6 +570,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
* space under its control.
*/
do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
+ count_vm_event(OOM_KILL_COUNT);
mark_oom_victim(victim);
pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",
task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
@@ -600,6 +601,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
task_pid_nr(p), p->comm);
task_unlock(p);
do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
+ count_vm_event(OOM_KILL_COUNT);
}
rcu_read_unlock();

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9bcfd70..1d82210 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2761,7 +2761,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
schedule_timeout_uninterruptible(1);
return NULL;
}
-
+ count_vm_event(OOM_STALL);
/*
* Go through the zonelist yet one more time, keep very high watermark
* here, this is only to catch a parallel oom killing, we must fail if
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1fd0886..f054265 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -808,6 +808,8 @@ const char * const vmstat_text[] = {
"htlb_buddy_alloc_success",
"htlb_buddy_alloc_fail",
#endif
+ "oom_stall",
+ "oom_kill_count",
"unevictable_pgs_culled",
"unevictable_pgs_scanned",
"unevictable_pgs_rescued",
--
1.7.9.5
