Re: [PATCH mm-unstable] lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default

From: Hyeonggon Yoo
Date: Wed Jan 25 2023 - 01:42:20 EST


From: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
To: Michal Hocko <mhocko@xxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>,
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>,
Christoph Lameter <cl@xxxxxxxxx>, Pekka Enberg <penberg@xxxxxxxxxx>,
David Rientjes <rientjes@xxxxxxxxxx>,
Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>,
Roman Gushchin <roman.gushchin@xxxxxxxxx>,
Ingo Molnar <mingo@xxxxxxxxxx>,
Johannes Weiner <hannes@xxxxxxxxxxx>,
Shakeel Butt <shakeelb@xxxxxxxxxx>,
Muchun Song <muchun.song@xxxxxxxxx>,
Matthew Wilcox <willy@xxxxxxxxxxxxx>, linux-mm@xxxxxxxxx,
linux-kernel@xxxxxxxxxxxxxxx, Peter Zijlstra <peterz@xxxxxxxxxxxxx>,
Juri Lelli <juri.lelli@xxxxxxxxxx>,
Vincent Guittot <vincent.guittot@xxxxxxxxxx>,
Dietmar Eggemann <dietmar.eggemann@xxxxxxx>,
Steven Rostedt <rostedt@xxxxxxxxxxx>,
Ben Segall <bsegall@xxxxxxxxxx>,
Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>,
Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>,
Valentin Schneider <vschneid@xxxxxxxxxx>,
Dennis Zhou <dennis@xxxxxxxxxx>, Tejun Heo <tj@xxxxxxxxxx>
Subject: Re: [PATCH mm-unstable] lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default
In-Reply-To: <Y85MNmZDc5czMRUJ@xxxxxxxxxxxxxx>

On Mon, Jan 23, 2023 at 09:58:30AM +0100, Michal Hocko wrote:
> On Sat 21-01-23 20:54:15, Hyeonggon Yoo wrote:
> > On Sat, Jan 21, 2023 at 12:29:44PM +0100, Vlastimil Babka wrote:
> > > On 1/21/23 04:39, Hyeonggon Yoo wrote:
> > > > In workloads where this_cpu operations are frequently performed,
> > > > enabling DEBUG_PREEMPT may result in a significant increase in
> > > > runtime overhead due to frequent invocation of the
> > > > __this_cpu_preempt_check() function.
> > > >
> > > > This can be demonstrated through benchmarks such as hackbench, where this
> > > > configuration results in a 10% reduction in performance, primarily due to
> > > > the added overhead in the memcg charging path.
> > > >
> > > > Therefore, do not enable DEBUG_PREEMPT by default, and make users aware
> > > > of its potential impact on performance in some workloads.
> > > >
> > > > hackbench-process-sockets
> > > > debug_preempt no_debug_preempt
> > > > Amean 1 0.4743 ( 0.00%) 0.4295 * 9.45%*
> > > > Amean 4 1.4191 ( 0.00%) 1.2650 * 10.86%*
> > > > Amean 7 2.2677 ( 0.00%) 2.0094 * 11.39%*
> > > > Amean 12 3.6821 ( 0.00%) 3.2115 * 12.78%*
> > > > Amean 21 6.6752 ( 0.00%) 5.7956 * 13.18%*
> > > > Amean 30 9.6646 ( 0.00%) 8.5197 * 11.85%*
> > > > Amean 48 15.3363 ( 0.00%) 13.5559 * 11.61%*
> > > > Amean 79 24.8603 ( 0.00%) 22.0597 * 11.27%*
> > > > Amean 96 30.1240 ( 0.00%) 26.8073 * 11.01%*

Hello Michal, thanks for looking at this.

> Do you happen to have any perf data collected during those runs? I
> would be interested in the memcg side of things. Maybe we can do
> something better there.

Yes, below is the performance data I've collected.

6.1.8-debug-preempt-dirty
=========================
Overhead Command Shared Object Symbol
+ 9.14% hackbench [kernel.vmlinux] [k] check_preemption_disabled
+ 7.33% hackbench [kernel.vmlinux] [k] copy_user_enhanced_fast_string
+ 7.32% hackbench [kernel.vmlinux] [k] mod_objcg_state
3.55% hackbench [kernel.vmlinux] [k] refill_obj_stock
3.39% hackbench [kernel.vmlinux] [k] debug_smp_processor_id
2.97% hackbench [kernel.vmlinux] [k] memset_erms
2.55% hackbench [kernel.vmlinux] [k] __check_object_size
+ 2.36% hackbench [kernel.vmlinux] [k] native_queued_spin_lock_slowpath
1.76% hackbench [kernel.vmlinux] [k] unix_stream_read_generic
1.64% hackbench [kernel.vmlinux] [k] __slab_free
1.58% hackbench [kernel.vmlinux] [k] unix_stream_sendmsg
1.46% hackbench [kernel.vmlinux] [k] memcg_slab_post_alloc_hook
1.35% hackbench [kernel.vmlinux] [k] vfs_write
1.33% hackbench [kernel.vmlinux] [k] vfs_read
1.28% hackbench [kernel.vmlinux] [k] __alloc_skb
1.18% hackbench [kernel.vmlinux] [k] sock_read_iter
1.16% hackbench [kernel.vmlinux] [k] obj_cgroup_charge
1.16% hackbench [kernel.vmlinux] [k] entry_SYSCALL_64
1.14% hackbench [kernel.vmlinux] [k] sock_write_iter
1.12% hackbench [kernel.vmlinux] [k] skb_release_data
1.08% hackbench [kernel.vmlinux] [k] sock_wfree
1.07% hackbench [kernel.vmlinux] [k] cache_from_obj
0.96% hackbench [kernel.vmlinux] [k] unix_destruct_scm
0.95% hackbench [kernel.vmlinux] [k] kmem_cache_free
0.94% hackbench [kernel.vmlinux] [k] __kmem_cache_alloc_node
0.92% hackbench [kernel.vmlinux] [k] kmem_cache_alloc_node
0.89% hackbench [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.84% hackbench [kernel.vmlinux] [k] __x86_indirect_thunk_array
0.84% hackbench libc.so.6 [.] write
0.81% hackbench [kernel.vmlinux] [k] exit_to_user_mode_prepare
0.76% hackbench libc.so.6 [.] read
0.75% hackbench [kernel.vmlinux] [k] syscall_trace_enter.constprop.0
0.75% hackbench [kernel.vmlinux] [k] preempt_count_add
0.74% hackbench [kernel.vmlinux] [k] cmpxchg_double_slab.constprop.0.isra.0
0.69% hackbench [kernel.vmlinux] [k] get_partial_node
0.69% hackbench [kernel.vmlinux] [k] __virt_addr_valid
0.69% hackbench [kernel.vmlinux] [k] __rcu_read_unlock
0.65% hackbench [kernel.vmlinux] [k] get_obj_cgroup_from_current
0.63% hackbench [kernel.vmlinux] [k] __kmem_cache_free
0.62% hackbench [kernel.vmlinux] [k] entry_SYSRETQ_unsafe_stack
0.60% hackbench [kernel.vmlinux] [k] __rcu_read_lock
0.59% hackbench [kernel.vmlinux] [k] syscall_exit_to_user_mode_prepare
0.54% hackbench [kernel.vmlinux] [k] __unfreeze_partials
0.53% hackbench [kernel.vmlinux] [k] check_stack_object
0.52% hackbench [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
0.51% hackbench [kernel.vmlinux] [k] security_file_permission
0.50% hackbench [kernel.vmlinux] [k] __x64_sys_write
0.49% hackbench [kernel.vmlinux] [k] bpf_lsm_file_permission
0.48% hackbench [kernel.vmlinux] [k] ___slab_alloc
0.46% hackbench [kernel.vmlinux] [k] __check_heap_object

The corresponding flamegraph is attached as flamegraph-6.1.8-debug-preempt-dirty.svg.

6.1.8 (no debug preempt)
========================
Overhead Command Shared Object Symbol
+ 10.96% hackbench [kernel.vmlinux] [k] mod_objcg_state
+ 8.16% hackbench [kernel.vmlinux] [k] copy_user_enhanced_fast_string
3.29% hackbench [kernel.vmlinux] [k] memset_erms
3.07% hackbench [kernel.vmlinux] [k] __slab_free
2.89% hackbench [kernel.vmlinux] [k] refill_obj_stock
2.82% hackbench [kernel.vmlinux] [k] __check_object_size
+ 2.72% hackbench [kernel.vmlinux] [k] native_queued_spin_lock_slowpath
1.96% hackbench [kernel.vmlinux] [k] __x86_indirect_thunk_rax
1.88% hackbench [kernel.vmlinux] [k] memcg_slab_post_alloc_hook
1.69% hackbench [kernel.vmlinux] [k] __rcu_read_unlock
1.54% hackbench [kernel.vmlinux] [k] __alloc_skb
1.53% hackbench [kernel.vmlinux] [k] unix_stream_sendmsg
1.46% hackbench [kernel.vmlinux] [k] kmem_cache_free
1.44% hackbench [kernel.vmlinux] [k] vfs_write
1.43% hackbench [kernel.vmlinux] [k] vfs_read
1.33% hackbench [kernel.vmlinux] [k] unix_stream_read_generic
1.31% hackbench [kernel.vmlinux] [k] sock_write_iter
1.27% hackbench [kernel.vmlinux] [k] kmalloc_slab
1.22% hackbench [kernel.vmlinux] [k] __rcu_read_lock
1.20% hackbench [kernel.vmlinux] [k] sock_read_iter
1.18% hackbench [kernel.vmlinux] [k] __entry_text_start
1.15% hackbench [kernel.vmlinux] [k] kmem_cache_alloc_node
1.12% hackbench [kernel.vmlinux] [k] unix_stream_recvmsg
1.10% hackbench [kernel.vmlinux] [k] obj_cgroup_charge
0.98% hackbench [kernel.vmlinux] [k] __kmem_cache_alloc_node
0.97% hackbench libc.so.6 [.] write
0.91% hackbench [kernel.vmlinux] [k] exit_to_user_mode_prepare
0.88% hackbench [kernel.vmlinux] [k] __kmem_cache_free
0.87% hackbench [kernel.vmlinux] [k] syscall_trace_enter.constprop.0
0.86% hackbench [kernel.vmlinux] [k] __kmalloc_node_track_caller
0.84% hackbench libc.so.6 [.] read
0.81% hackbench [kernel.vmlinux] [k] __lock_text_start
0.80% hackbench [kernel.vmlinux] [k] cache_from_obj
0.74% hackbench [kernel.vmlinux] [k] get_obj_cgroup_from_current
0.73% hackbench [kernel.vmlinux] [k] entry_SYSRETQ_unsafe_stack
0.72% hackbench [kernel.vmlinux] [k] unix_destruct_scm
0.70% hackbench [kernel.vmlinux] [k] get_partial_node
0.69% hackbench [kernel.vmlinux] [k] syscall_exit_to_user_mode_prepare
0.65% hackbench [kernel.vmlinux] [k] kfree
0.63% hackbench [kernel.vmlinux] [k] __unfreeze_partials
0.60% hackbench [kernel.vmlinux] [k] cmpxchg_double_slab.constprop.0.isra.0
0.58% hackbench [kernel.vmlinux] [k] skb_release_data
0.56% hackbench [kernel.vmlinux] [k] __virt_addr_valid
0.56% hackbench [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
0.56% hackbench [kernel.vmlinux] [k] __check_heap_object
0.55% hackbench [kernel.vmlinux] [k] sock_wfree
0.54% hackbench [kernel.vmlinux] [k] __audit_syscall_entry
0.53% hackbench [kernel.vmlinux] [k] ___slab_alloc
0.53% hackbench [kernel.vmlinux] [k] check_stack_object
0.52% hackbench [kernel.vmlinux] [k] bpf_lsm_file_permission

The corresponding flamegraph is attached as flamegraph-6.1.8.svg.
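
For context, the gap between the two profiles is essentially the debug hook
that CONFIG_DEBUG_PREEMPT places in front of the this_cpu operations used on
the memcg charging path (mod_objcg_state(), refill_obj_stock()). Below is a
minimal userspace sketch of that pattern; the names and the preempt-count
handling are simplified stand-ins for illustration, not the actual kernel
implementation:

/*
 * Minimal userspace model of the extra work CONFIG_DEBUG_PREEMPT adds to
 * a __this_cpu_* update such as the counter bumps in mod_objcg_state().
 * The names and the preempt-count handling here are simplified stand-ins;
 * the real hooks live in include/linux/percpu-defs.h and
 * lib/smp_processor_id.c.  Build with: cc -DDEBUG_PREEMPT sketch.c
 */
#include <stdio.h>

static int preempt_count;	/* stand-in for the per-task preempt count */
static long percpu_counter;	/* stand-in for a per-cpu stats counter */

/* Debug hook: warn if the caller could be migrated mid-update. */
static void check_preemption_disabled(const char *what)
{
	if (preempt_count == 0)
		fprintf(stderr, "BUG: using %s in preemptible code\n", what);
}

static void this_cpu_add_model(long val)
{
#ifdef DEBUG_PREEMPT
	/*
	 * This extra call on every hot-path update is what shows up as
	 * check_preemption_disabled() in the first profile above.
	 */
	check_preemption_disabled("__this_cpu_add()");
#endif
	percpu_counter += val;	/* the actual per-cpu update */
}

int main(void)
{
	preempt_count = 1;		/* pretend preemption is disabled */
	this_cpu_add_model(64);		/* e.g. charging one slab object */
	printf("counter = %ld\n", percpu_counter);
	return 0;
}

Built with -DDEBUG_PREEMPT, the extra call is taken on every update, which is
what shows up as check_preemption_disabled() and debug_smp_processor_id() in
the first profile; without it the update is just the plain add, and
mod_objcg_state() moves to the top of the profile.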

If you need more information, feel free to ask.

--
Thanks,
Hyeonggon

> --
> Michal Hocko
> SUSE Labs

Attachment: flamegraph-6.1.8-debug-preempt-dirty.svg
Description: image/svg

Attachment: flamegraph-6.1.8.svg
Description: image/svg