Re: [RFC next v2 0/5] ucount: add rlimit cache for ucount
From: Andrew Morton
Date: Fri May 09 2025 - 16:29:55 EST
On Fri, 9 May 2025 07:20:49 +0000 Chen Ridong <chenridong@xxxxxxxxxxxxxxx> wrote:
> We ran the will-it-scale test case signal1 [1], and the results show
> that the signal-sending system call does not scale linearly.
> To investigate further, we ran a series of tests with varying numbers
> of Docker containers and monitored the throughput of each individual
> container. The results are as follows:
>
> | Dockers    | 1      | 4      | 8      | 16     | 32     | 64     |
> | Throughput | 380068 | 353204 | 308948 | 306453 | 180659 | 129152 |
>
> The data shows a clear trend: as the number of containers increases,
> the throughput per container declines. Analysis has identified the
> root cause of this degradation. The ucounts code tracks rlimit usage,
> which involves a large number of atomic operations. When these atomic
> operations act on the same variable, they cause a large number of
> cache misses or remote accesses, which reduces performance.
Did you consider simply turning that atomic_t counter into a
percpu_counter?
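
To illustrate the idea, here is a minimal userspace sketch, not kernel
code and not taken from the patch series. It shards one counter across
cache-line-aligned slots so that each updater touches its own line
instead of all of them hitting one shared atomic. The names
sharded_counter, sharded_add, sharded_sum, NR_SLOTS and CACHE_LINE are
made up for this example. The 64-byte line size is an assumption. In
the kernel the corresponding calls would be percpu_counter_add() and
percpu_counter_sum().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS   8    /* hypothetical stand-in for the number of CPUs */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One slot per updater, aligned so that no two slots share a cache line. */
struct slot {
	_Alignas(CACHE_LINE) atomic_long count;
};

struct sharded_counter {
	struct slot slots[NR_SLOTS];
};

static void sharded_init(struct sharded_counter *c)
{
	for (size_t i = 0; i < NR_SLOTS; i++)
		atomic_init(&c->slots[i].count, 0);
}

/* Fast path: only the caller's own slot (and cache line) is written. */
static void sharded_add(struct sharded_counter *c, size_t slot, long delta)
{
	assert(slot < NR_SLOTS);
	atomic_fetch_add_explicit(&c->slots[slot].count, delta,
				  memory_order_relaxed);
}

/*
 * Slow path: walk every slot. The result is exact only if no updates
 * are in flight while the loop runs.
 */
static long sharded_sum(struct sharded_counter *c)
{
	long sum = 0;

	for (size_t i = 0; i < NR_SLOTS; i++)
		sum += atomic_load_explicit(&c->slots[i].count,
					    memory_order_relaxed);
	return sum;
}
```

The tradeoff is that updates become cheap while reading the total
becomes a walk over all slots, and a cheap read is only approximate.
That matters when the value has to be compared against a limit on
every charge, as it does for rlimits.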