Re: [PATCH] mm, oom: allow oom reaper to race with exit_mmap

From: Michal Hocko
Date: Tue Jul 25 2017 - 10:26:25 EST


On Tue 25-07-17 17:17:23, Kirill A. Shutemov wrote:
[...]
> Below are numbers for the same test case, but from bigger machine (48
> threads, 64GiB of RAM).
>
> v4.13-rc2:
>
> Performance counter stats for './a.sh 100000' (5 runs):
>
>     159857.233790  task-clock:u (msec)        # 1.000 CPUs utilized            ( +- 3.21% )
>                 0  context-switches:u         # 0.000 K/sec
>                 0  cpu-migrations:u           # 0.000 K/sec
>         8,761,843  page-faults:u              # 0.055 M/sec                    ( +- 0.64% )
>    38,725,763,026  cycles:u                   # 0.242 GHz                      ( +- 0.18% )
>   272,691,643,016  stalled-cycles-frontend:u  # 704.16% frontend cycles idle   ( +- 3.16% )
>    22,221,416,575  instructions:u             # 0.57 insn per cycle
>                                               # 12.27 stalled cycles per insn  ( +- 0.00% )
>     5,306,829,649  branches:u                 # 33.197 M/sec                   ( +- 0.00% )
>       240,783,599  branch-misses:u            # 4.54% of all branches          ( +- 0.15% )
>
> 159.808721098 seconds time elapsed ( +- 3.15% )
>
> v4.13-rc2 + the patch:
>
> Performance counter stats for './a.sh 100000' (5 runs):
>
>     167628.094556  task-clock:u (msec)        # 1.007 CPUs utilized            ( +- 1.63% )
>                 0  context-switches:u         # 0.000 K/sec
>                 0  cpu-migrations:u           # 0.000 K/sec
>         8,838,314  page-faults:u              # 0.053 M/sec                    ( +- 0.26% )
>    38,862,240,137  cycles:u                   # 0.232 GHz                      ( +- 0.10% )
>   282,105,057,553  stalled-cycles-frontend:u  # 725.91% frontend cycles idle   ( +- 1.64% )
>    22,219,273,623  instructions:u             # 0.57 insn per cycle
>                                               # 12.70 stalled cycles per insn  ( +- 0.00% )
>     5,306,165,194  branches:u                 # 31.654 M/sec                   ( +- 0.00% )
>       240,473,075  branch-misses:u            # 4.53% of all branches          ( +- 0.07% )
>
> 166.497005412 seconds time elapsed ( +- 1.61% )
>
> IMO, there is something to think about. ~4% slowdown is not insignificant.
> I expect effect to be bigger for larger machines.

Thanks for retesting, Kirill. Are those numbers stable across runs? E.g.
the run without the patch has ~3% variance while the run with the patch
has a smaller one. This looks suspicious to me. There shouldn't be any
lock contention (except for the oom killer) so the lock shouldn't make
any difference wrt. variability.
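
To put the ~4% figure next to the reported variance, here is a rough
back-of-the-envelope sketch in Python. It only uses the elapsed times and
"+-" percentages quoted above. Treating each "+-" as a one-sigma
uncertainty on the mean, and combining the two in quadrature, are
simplifying assumptions for illustration and not something established in
this thread.

```python
import math

# Elapsed time (seconds) and relative "+-" copied from the quoted perf stat output.
base_mean, base_rel = 159.808721098, 0.0315    # v4.13-rc2
patch_mean, patch_rel = 166.497005412, 0.0161  # v4.13-rc2 + the patch

# Relative slowdown of the patched kernel, in percent.
slowdown_pct = (patch_mean / base_mean - 1.0) * 100.0

# Assumption: each "+-" is a one-sigma uncertainty on the corresponding mean.
base_sigma = base_mean * base_rel
patch_sigma = patch_mean * patch_rel

# Assumption: the two measurements are independent, so their
# uncertainties combine in quadrature.
combined_sigma = math.hypot(base_sigma, patch_sigma)
gap_in_sigmas = (patch_mean - base_mean) / combined_sigma

print(f"slowdown: {slowdown_pct:.2f}%")
print(f"gap between the means: {gap_in_sigmas:.2f} combined sigma")
```

Under those assumptions the slowdown is a bit over 4%, while the gap
between the two means is only a little more than one combined sigma, so
repeating the measurement seems worthwhile before drawing conclusions.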

Also I was about to post a more targeted test. Could you try that one
as well, please?

--
Michal Hocko
SUSE Labs