Re: [memcg, kmem] 58056f7750: hackbench.throughput 10.3% improvement

From: Oliver Sang
Date: Thu Nov 25 2021 - 22:20:15 EST


Hi Michal Hocko,

On Wed, Nov 24, 2021 at 06:01:12PM +0100, Michal Hocko wrote:
> On Wed 24-11-21 16:34:35, kernel test robot wrote:
> >
> >
> > Greeting,
> >
> > FYI, we noticed a 10.3% improvement of hackbench.throughput due to commit:
> >
> >
> > commit: 58056f77502f3567b760c9a8fc8d2e9081515b2d ("memcg, kmem: further deprecate kmem.limit_in_bytes")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> I am really surprised to see an improvement from this patch. I do not
> expect your benchmarking would be using kmem limit. The above patch
> hasn't really removed the page counter out of the picture so there
> shouldn't be any real reason for performance improvement. I strongly
> suspect this is just some benchmark artifact or unreliable evaluation.

Fengwei Yin helped further analyze this improvement.

The patch changed the behavior of the function obj_cgroup_charge_pages, as
shown by the following line in the perf call stack:

5.63 ± 11% -5.6 0.00 perf-profile.calltrace.cycles-pp.page_counter_try_charge.obj_cgroup_charge_pages.obj_cgroup_charge.kmem_cache_alloc_node.__alloc_skb

So Fengwei prepared a patch that reverts the changes to
obj_cgroup_charge_pages made in 58056f7750 (attached as mod.patch).

With this patch applied, the performance is similar to 16f6bf266c and the
improvement disappears.

=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase/ucode:
gcc-9/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/hackbench/0x700001e

commit:
16f6bf266c ("mm/list_lru.c: prefer struct_size over open coded arithmetic")
58056f7750 ("memcg, kmem: further deprecate kmem.limit_in_bytes")
ae12af515d ('58056f7750' minus 'changes in obj_cgroup_charge_pages', attached mod.patch)


16f6bf266c94017c 58056f77502f3567b760c9a8fc8 ae12af515da0d557c25f86e89b0
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    124966            +8.8%     136017 ± 2%       -0.1%     124791 ± 2%   hackbench.throughput
...
      5.41 ± 12%      -5.4        0.00            +0.3        5.73 ± 13%  perf-profile.calltrace.cycles-pp.page_counter_try_charge.obj_cgroup_charge_pages.obj_cgroup_charge.kmem_cache_alloc_node.__alloc_skb

Detailed comparison data is attached as 16f6b-58056-ae12a.

In brief, the results confirm what we suspected: the original patch removed the code
- !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
and that removal improved the hackbench throughput. Thanks.
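
For readers not familiar with the page counter, below is a rough user-space
sketch of the kind of work a hierarchical try-charge does on every call. It is
not the kernel source: struct counter and counter_try_charge are hypothetical
names, and the model assumes only that a charge is applied with an atomic
update at each level up to the root and undone if any level exceeds its limit.
If that assumption holds, dropping the call from the skb allocation path saves
one such walk per charge, which would be consistent with the cycles that
disappeared from the profile above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a hierarchical page counter. */
struct counter {
	atomic_long usage;       /* pages currently charged at this level */
	long max;                /* limit for this level */
	struct counter *parent;  /* NULL at the root */
};

/*
 * Charge nr_pages to c and to every ancestor. If any level would exceed
 * its limit, undo everything charged so far, report the failing level
 * through *fail, and return false.
 */
static bool counter_try_charge(struct counter *c, long nr_pages,
			       struct counter **fail)
{
	struct counter *it, *undo;

	assert(nr_pages > 0);

	for (it = c; it; it = it->parent) {
		/* One atomic read-modify-write per level on a shared counter. */
		long updated = atomic_fetch_add(&it->usage, nr_pages) + nr_pages;

		if (updated > it->max) {
			/* Undo the charge on the level that overflowed. */
			atomic_fetch_sub(&it->usage, nr_pages);
			*fail = it;
			goto rollback;
		}
	}
	return true;

rollback:
	/* Undo the charges on the levels below the failing one. */
	for (undo = c; undo != it; undo = undo->parent)
		atomic_fetch_sub(&undo->usage, nr_pages);
	return false;
}
```

In this model every successful charge touches the usage field of each level
from the leaf to the root, so the cost per charge grows with the depth of the
hierarchy and with the number of CPUs updating the same counters.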


> --
> Michal Hocko
> SUSE Labs