> On 8/11/25 02:15, Uladzislau Rezki wrote:
>>> It can easily be identified as a bottleneck by multi-CPU stress-testing
>>> programs that involve frequent process creation and destruction, similar
>>> to the operation of a heavily loaded multi-process Apache web server.
>>> Hot/cold path?
>>>
>>> kernel_pte_work.list is a global shared variable; it would make the
>>> producer, pte_free_kernel(), and the consumer, kernel_pte_work_func(),
>>> operate in serialized timing. In a large system, I don't think you would
>>> design this deliberately :)
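A minimal sketch of the pattern being questioned above: one global llist fed by every CPU and drained by a single work item. Only the names kernel_pte_work.list, pte_free_kernel() and kernel_pte_work_func() come from the thread; the llist embedding, the static initializers and the freeing details below are assumptions, not the actual patch.

#include <linux/llist.h>
#include <linux/mm.h>
#include <linux/workqueue.h>

static void kernel_pte_work_func(struct work_struct *work);

/* One list head and one work item shared by every CPU in the system. */
static struct {
        struct llist_head list;
        struct work_struct work;
} kernel_pte_work = {
        .list = LLIST_HEAD_INIT(kernel_pte_work.list),
        .work = __WORK_INITIALIZER(kernel_pte_work.work, kernel_pte_work_func),
};

/* Consumer: detach everything queued so far and free it page by page. */
static void kernel_pte_work_func(struct work_struct *work)
{
        struct llist_node *node = llist_del_all(&kernel_pte_work.list);

        while (node) {
                struct llist_node *next = node->next;

                /* The llist node lives inside the page being freed. */
                free_page((unsigned long)node);
                node = next;
        }
}

/* Producer: runs on any CPU; all of them push onto the same list head. */
void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
{
        /* The table is dead, so its own memory can hold the node (assumption). */
        llist_add((struct llist_node *)pte, &kernel_pte_work.list);
        schedule_work(&kernel_pte_work.work);
}

The contention concern is that llist_add() from every CPU targets the same cacheline, and a single work item frees everything, so the producers and the consumer end up effectively serialized on that one list head.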
>> Sorry for jumping in.
>>
>> Agree, unless it is never considered a hot path or something that can
>> really be contended. It looks like you could use just a per-CPU llist to
>> drain things.
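A rough sketch of that per-CPU llist idea as I read it; the struct, the init path and the per-CPU drain work below are my assumptions, not code from the thread. Each CPU pushes onto its own list head, so producers never share a cacheline, and each CPU's work item drains only its local list.

#include <linux/init.h>
#include <linux/llist.h>
#include <linux/mm.h>
#include <linux/percpu.h>
#include <linux/workqueue.h>

struct pte_free_queue {
        struct llist_head list;
        struct work_struct work;
};

static DEFINE_PER_CPU(struct pte_free_queue, pte_free_queues);

/* Drain only this CPU's list; other CPUs have their own work items. */
static void pte_free_drain_func(struct work_struct *work)
{
        struct pte_free_queue *q = container_of(work, struct pte_free_queue, work);
        struct llist_node *node = llist_del_all(&q->list);

        while (node) {
                struct llist_node *next = node->next;

                free_page((unsigned long)node);
                node = next;
        }
}

void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
{
        /* Stay on one CPU while touching its queue. */
        struct pte_free_queue *q = get_cpu_ptr(&pte_free_queues);

        llist_add((struct llist_node *)pte, &q->list);
        schedule_work(&q->work);
        put_cpu_ptr(&pte_free_queues);
}

static int __init pte_free_queues_init(void)
{
        int cpu;

        for_each_possible_cpu(cpu) {
                struct pte_free_queue *q = per_cpu_ptr(&pte_free_queues, cpu);

                init_llist_head(&q->list);
                INIT_WORK(&q->work, pte_free_drain_func);
        }
        return 0;
}
early_initcall(pte_free_queues_init);

The trade-off is extra moving parts: per-CPU initialization, and pages that can sit on an idle CPU's list until its work item runs. That added machinery is what the reply below pushes back on.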
> Remember, the code that has to run just before all this sent an IPI to
> every single CPU on the system to have them do a (on x86 at least)
> pretty expensive TLB flush.
>
> If this is a hot path, we have bigger problems on our hands: the full
> TLB flush on every CPU.
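For context on that cost: flushing kernel TLB entries on x86 means interrupting every CPU and having each one flush locally. As an illustration of the mechanism, simplified from how flush_tlb_all() works in arch/x86/mm/tlb.c (details vary by kernel version, and the exact flush used in this path may differ):

#include <linux/smp.h>
#include <asm/tlbflush.h>

/* Runs on each CPU, in IPI context, and flushes that CPU's own TLB. */
static void do_flush_tlb_all(void *info)
{
        __flush_tlb_all();
}

void flush_tlb_all(void)
{
        /* wait == 1: the caller blocks until every CPU has completed the flush. */
        on_each_cpu(do_flush_tlb_all, NULL, 1);
}

Next to an IPI broadcast like that, a single llist_add() plus schedule_work() per freed table is a small cost, which is the argument being made above.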
Perhaps not "we": an IPI-driven TLB flush does not seem to be a mechanism
shared by every architecture.

At least, please don't add a bottleneck. How complex would it be to avoid one?
> So, sure, there are a million ways to make this deferred freeing more
> scalable. But the code that's here is dirt simple and self contained. If
> someone has some ideas for something that's simpler and more scalable,
> then I'm totally open to it.
>
> But this is _not_ the place to add complexity to get scalability.