Re: [PATCH] procfs: add syscall statistics

From: Greg KH
Date: Fri May 27 2022 - 08:21:42 EST


On Fri, May 27, 2022 at 07:09:59PM +0800, Zhang Yuchen wrote:
> Add /proc/syscalls to display per-cpu syscall counts.
>
> We need a less resource-intensive way to count syscalls per CPU when
> tracking down system problems.

Why?

How is this less resource intensive than perf?

> There is a similar utility syscount in the BCC project, but syscount
> has a high performance cost.

What is that cost?

> The following is a comparison on the same machine, using UnixBench
> System Call Overhead:
>
> ┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
> ┃ Change        ┃ Unixbench Score ┃ Loss   ┃
> ┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
> │ no change     │ 1072.6          │ ---    │
> │ syscall count │ 982.5           │ 8.40%  │
> │ bpf syscount  │ 614.2           │ 42.74% │
> └───────────────┴─────────────────┴────────┘

Again, what about perf?

> The UnixBench System Call test uses sys_gettid, and that system call only
> reads a single variable, so the relative performance penalty looks large.
> When tested with fork instead, the scores were almost the same.
>
> So the conclusion is that it does not have a significant impact on system
> call performance.

8% is huge for a system-wide decrease in performance. Who would ever
use this?

> This function depends on CONFIG_FTRACE_SYSCALLS because the system call
> number is stored in syscall_metadata.
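
For anyone following along: the diffstat touches the arch syscall_wrapper.h
headers and adds fs/proc/syscall.c, so presumably each syscall bumps a
per-cpu counter indexed by the syscall number carried in syscall_metadata,
and a seq_file dumps the whole table. A rough sketch of that shape is below;
the names (syscall_counts, count_syscall(), proc_syscalls_init()) are
illustrative only and not taken from the patch:

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/cpumask.h>
#include <linux/percpu.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <asm/unistd.h>		/* NR_syscalls */

/* one counter per syscall number, per CPU */
static DEFINE_PER_CPU(unsigned long [NR_syscalls], syscall_counts);

/* called from the syscall entry path with the decoded syscall nr */
static inline void count_syscall(int nr)
{
	if (nr >= 0 && nr < NR_syscalls)
		this_cpu_inc(syscall_counts[nr]);
}

static int syscalls_show(struct seq_file *m, void *v)
{
	char hdr[16];
	int nr, cpu;

	/* header row: one column per possible CPU */
	seq_puts(m, "    ");
	for_each_possible_cpu(cpu) {
		snprintf(hdr, sizeof(hdr), "CPU%u", cpu);
		seq_printf(m, " %10s", hdr);
	}
	seq_putc(m, '\n');

	/*
	 * One row per syscall number; the real file also appends the
	 * syscall name, presumably looked up via syscall_metadata.
	 */
	for (nr = 0; nr < NR_syscalls; nr++) {
		seq_printf(m, "%3d:", nr);
		for_each_possible_cpu(cpu)
			seq_printf(m, " %10lu",
				   per_cpu(syscall_counts, cpu)[nr]);
		seq_putc(m, '\n');
	}
	return 0;
}

static int __init proc_syscalls_init(void)
{
	proc_create_single("syscalls", 0444, NULL, syscalls_show);
	return 0;
}
fs_initcall(proc_syscalls_init);

Whatever the exact hook is, a read walks on the order of
NR_syscalls * nr_cpu_ids counters, which is also why line width on
large machines matters (see below).
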
>
> Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@xxxxxxxxxxxxx>
> ---
> Documentation/filesystems/proc.rst | 28 +++++++++
> arch/arm64/include/asm/syscall_wrapper.h | 2 +-
> arch/s390/include/asm/syscall_wrapper.h | 4 +-
> arch/x86/include/asm/syscall_wrapper.h | 2 +-
> fs/proc/Kconfig | 7 +++
> fs/proc/Makefile | 1 +
> fs/proc/syscall.c | 79 ++++++++++++++++++++++++
> include/linux/syscalls.h | 51 +++++++++++++--
> 8 files changed, 165 insertions(+), 9 deletions(-)
> create mode 100644 fs/proc/syscall.c
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 1bc91fb8c321..80394a98a192 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -686,6 +686,7 @@ files are there, and which are missing.
>   fs          File system parameters, currently nfs/exports (2.4)
>   ide         Directory containing info about the IDE subsystem
>   interrupts  Interrupt usage
> + syscalls    Syscall count for each cpu
>   iomem       Memory map (2.4)
>   ioports     I/O port usage
>   irq         Masks for irq to cpu affinity (2.4)(smp?)
> @@ -1225,6 +1226,33 @@ Provides counts of softirq handlers serviced since boot time, for each CPU.
>      HRTIMER:          0          0          0          0
>          RCU:       1678       1769       2178       2250
>
> +syscalls
> +~~~~~~~~
> +
> +Provides counts of each system call since boot time, for each CPU.
> +
> +::
> +
> + > cat /proc/syscalls
> +           CPU0      CPU1      CPU2      CPU3
> +   0:      3743      3099      3770      3242  sys_read
> +   1:       222       559       822       522  sys_write
> +   2:         0         0         0         0  sys_open
> +   3:      6481     18754     12077      7349  sys_close
> +   4:     11362     11120     11343     10665  sys_newstat
> +   5:      5224     13880      8578      5971  sys_newfstat
> +   6:      1228      1269      1459      1508  sys_newlstat
> +   7:        90        43        64        67  sys_poll
> +   8:      1635      1000      2071      1161  sys_lseek
> +   .... middle lines omitted ....
> + 441:         0         0         0         0  sys_epoll_pwait2
> + 442:         0         0         0         0  sys_mount_setattr
> + 443:         0         0         0         0  sys_quotactl_fd
> + 447:         0         0         0         0  sys_memfd_secret
> + 448:         0         0         0         0  sys_process_mrelease
> + 449:         0         0         0         0  sys_futex_waitv
> + 450:         0         0         0         0  sys_set_mempolicy_home_node

So for systems with large numbers of CPUs, these are huge lines? Have
you tested this on large systems? If so, how big?

thanks,

greg k-h