futex performance regression from "futex: Allow automatic allocation of process wide futex hash"

From: Chris Mason
Date: Tue Jun 03 2025 - 15:03:05 EST

Hi everyone,

While testing Peter's latest scheduler patches against current Linus
git, I found a pretty big performance regression with schbench:

https://github.com/masoncl/schbench

The command line I was using:

schbench -L -m 4 -M auto -t 256 -n 0 -r 60 -s 0

Bisecting the problem I landed on commit:

commit 7c4f75a21f636486d2969d9b6680403ea8483539 (HEAD -> update)
Author: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Date: Wed Apr 16 18:29:13 2025 +0200

futex: Allow automatic allocation of process wide futex hash

Allocate a private futex hash with 16 slots if a task forks its first
thread.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20250416162921.513656-14bigeasy@xxxxxxxxxxxxx

schbench uses one futex per thread, and the command line ends up
allocating 1024 threads, so the default of 16 buckets used by this
commit is just too small. Using 2048 buckets makes the problem go away.

On my big Turin system, this commit reduces RPS by 36%. But even a
VM on a Skylake machine sees a 29% difference.

schbench is a microbenchmark, so take all of this with a grain of salt,
but I think our defaults are probably too low.

-chris
