Re: general protection fault in oom_unkillable_task

From: Shakeel Butt
Date: Mon Jun 17 2019 - 09:28:14 EST


On Sun, Jun 16, 2019 at 8:14 AM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2019/06/16 16:37, Tetsuo Handa wrote:
> > On 2019/06/16 6:33, Tetsuo Handa wrote:
> >> On 2019/06/16 3:50, Shakeel Butt wrote:
> >>>> While dump_tasks() traverses only each thread group, mem_cgroup_scan_tasks()
> >>>> traverses each thread.
> >>>
> >>> I think mem_cgroup_scan_tasks() traversing threads is not intentional
> >>> and css_task_iter_start in it should use CSS_TASK_ITER_PROCS as the
> >>> oom killer only cares about the processes or more specifically
> >>> mm_struct (though two different thread groups can have same mm_struct
> >>> but that is fine).
> >>
> >> We can't use CSS_TASK_ITER_PROCS from mem_cgroup_scan_tasks(). I've tried
> >> CSS_TASK_ITER_PROCS in an attempt to evaluate only one thread from each
> >> thread group, but I found that CSS_TASK_ITER_PROCS causes skipping whole
> >> threads in a thread group (and trivially allowing "Out of memory and no
> >> killable processes...\n" flood) if thread group leader has already exited.
> >
> > Seems that CSS_TASK_ITER_PROCS from mem_cgroup_scan_tasks() is now working.
>
>
> I found a reproducer and the commit.
>
> ----------------------------------------
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sched.h>
> #include <sys/mman.h>
> #include <asm/unistd.h>
>
> static const unsigned long size = 1048576 * 200;
> static int thread(void *unused)
> {
> 	int fd = open("/dev/zero", O_RDONLY);
> 	char *buf = mmap(NULL, size, PROT_WRITE | PROT_READ,
> 			 MAP_ANONYMOUS | MAP_SHARED, EOF, 0);
> 	sleep(1);
> 	read(fd, buf, size);
> 	return syscall(__NR_exit, 0);
> }
> int main(int argc, char *argv[])
> {
> 	FILE *fp;
> 	mkdir("/sys/fs/cgroup/memory/test1", 0755);
> 	fp = fopen("/sys/fs/cgroup/memory/test1/memory.limit_in_bytes", "w");
> 	fprintf(fp, "%lu\n", size);
> 	fclose(fp);
> 	fp = fopen("/sys/fs/cgroup/memory/test1/tasks", "w");
> 	fprintf(fp, "%u\n", getpid());
> 	fclose(fp);
> 	clone(thread, malloc(8192) + 4096, CLONE_SIGHAND | CLONE_THREAD | CLONE_VM, NULL);
> 	return syscall(__NR_exit, 0);
> }
> ----------------------------------------
>
> Here is a patch to use CSS_TASK_ITER_PROCS.
>
> From 415e52cf55bc4ad931e4f005421b827f0b02693d Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Mon, 17 Jun 2019 00:09:38 +0900
> Subject: [PATCH] mm: memcontrol: Use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks().
>
> Since commit c03cd7738a83b137 ("cgroup: Include dying leaders with live
> threads in PROCS iterations") corrected how CSS_TASK_ITER_PROCS works,
> mem_cgroup_scan_tasks() can use CSS_TASK_ITER_PROCS in order to check
> only one thread from each thread group.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>

Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>

Why not include the reproducer in the commit message?

> ---
> mm/memcontrol.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ba9138a..b09ff45 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1163,7 +1163,7 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
>  		struct css_task_iter it;
>  		struct task_struct *task;
>
> -		css_task_iter_start(&iter->css, 0, &it);
> +		css_task_iter_start(&iter->css, CSS_TASK_ITER_PROCS, &it);
>  		while (!ret && (task = css_task_iter_next(&it)))
>  			ret = fn(task, arg);
>  		css_task_iter_end(&it);
> --
> 1.8.3.1
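
For anyone following along, here is a rough user-space sketch of what the
one-line change means for the callback: with a flags value of 0 the scan
visits every thread, while a PROCS-style flag visits each thread group once.
This is only a model under stated assumptions. Every name in it (model_task,
MODEL_ITER_PROCS, model_scan_tasks, model_count, model_visits) is hypothetical
and not a kernel API, and it assumes each group's leader is present in the
array, so it does not model the dying-leader case that c03cd7738a83b137
addressed.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_task {
	int pid;	/* per-thread ID */
	int tgid;	/* thread group ID; equals pid for the group leader */
};

#define MODEL_ITER_PROCS 0x1	/* stand-in for CSS_TASK_ITER_PROCS */

typedef int (*model_fn)(const struct model_task *task, void *arg);

/*
 * Call fn on each task and stop at the first non-zero return, mirroring
 * the shape of the loop in mem_cgroup_scan_tasks(). With MODEL_ITER_PROCS
 * set, only group leaders (pid == tgid) are visited, so each thread group
 * whose leader is in the array is seen once.
 */
int model_scan_tasks(const struct model_task *tasks, size_t n,
		     unsigned int flags, model_fn fn, void *arg)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n && !ret; i++) {
		if ((flags & MODEL_ITER_PROCS) && tasks[i].pid != tasks[i].tgid)
			continue;
		ret = fn(&tasks[i], arg);
	}
	return ret;
}

/* Callback that counts how many tasks were visited. */
int model_count(const struct model_task *task, void *arg)
{
	(void)task;
	(*(int *)arg)++;
	return 0;
}

/* Convenience wrapper: number of callback invocations for the given flags. */
int model_visits(const struct model_task *tasks, size_t n, unsigned int flags)
{
	int count = 0;

	model_scan_tasks(tasks, n, flags, model_count, &count);
	return count;
}
```

With two thread groups of three and two threads, model_visits() returns 5
with flags 0 and 2 with MODEL_ITER_PROCS, which is the "only one thread from
each thread group" behaviour the commit message describes.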