Re: [DISCUSSION] proposed mctl() API
From: Matthew Wilcox
Date: Tue Jun 10 2025 - 11:50:38 EST
On Tue, Jun 10, 2025 at 04:30:43PM +0100, Usama Arif wrote:
> If we have 2 workloads on the same server, e.g. one is a database where THPs
> just don't do well, but the other is AI where THPs do really well, how
> will the kernel monitor that the database workload is performing worse
> and the AI one isn't?
It can monitor the allocation/access patterns and see who's getting
the benefit. The two workloads are in competition for memory, and
we can tell which pages are hot and which cold.
And I don't believe it's a binary anyway. I bet there are some
allocations where the database benefits from having THPs (I mean, I know
a database which invented the entire hugetlbfs subsystem so it could
use PMD entries and avoid one layer of TLB misses!)
> I added the THP shrinker to hopefully try and do this automatically, and it does
> really help. But unfortunately it is not a complete solution.
> There are severely memory-bound workloads where even a tiny increase
> in memory usage will lead to an OOM. And if you colocate the container that's
> running that workload with one that would benefit from THPs, we unfortunately
> can't just rely on the system doing the right thing.
Then maybe THPs aren't for you. If your workloads are this sensitive,
perhaps you should be using a mechanism which gives you complete control
like hugetlbfs.
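
To make "complete control" concrete, here is a minimal userspace sketch, not
anything proposed in this thread. It assumes Linux with glibc, a 2MB huge page
size chosen by the caller, and a hugetlb pool reserved by the administrator
(e.g. via /proc/sys/vm/nr_hugepages). The helper names map_hugetlb() and
map_small_no_thp() are hypothetical. The point is that the hugetlb request
either succeeds at mmap() time or fails outright, rather than leaving the
decision to the kernel.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes backed by explicitly reserved
 * huge pages. `len` should be a multiple of the huge page size.
 * Returns NULL if the request cannot be satisfied, e.g. the pool is
 * empty or hugetlb support is unavailable.
 */
static void *map_hugetlb(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper for the memory-sensitive workload: ordinary
 * anonymous memory with THP opted out for this range.
 */
static void *map_small_no_thp(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;

    /* Best effort: this fails with EINVAL on kernels built without THP. */
    (void)madvise(p, len, MADV_NOHUGEPAGE);
    return p;
}
```

A caller would try map_hugetlb(2UL << 20) for the regions it knows benefit from
PMD mappings and use map_small_no_thp() for everything else, so the memory
cost of huge pages is bounded by the size of the reserved pool.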