Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled

From: David Hildenbrand
Date: Wed Jun 25 2025 - 07:09:52 EST


On 25.06.25 13:03, Usama Arif wrote:


On 25/06/2025 08:34, David Hildenbrand wrote:

We would all prefer a less messy world of THP tunables.  I certainly
find plenty to dislike there too; and wish that a less assertive name
than "never" had been chosen originally for the default off position.

But please don't break the accepted and documented behaviour of
MADV_COLLAPSE now.

Again see above, I absolutely disagree this is documented _clearly_. And
that's the underlying issue here.
I feel like if you polled 100 system administrators (assuming they knew
about THP) as to how you globally disable THP, probably all 100 would say
you do it via:

# echo never > /sys/kernel/mm/transparent_hugepage/enabled
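
For reference, reading that file back lists every accepted value with the active
one in square brackets, e.g. "always madvise [never]". A minimal Python sketch of
extracting the active value follows; the helper names selected_value and
read_thp_enabled are made up for illustration, and the path is the one from the
command above.

```python
import re


def selected_value(contents: str) -> str:
    """Return the bracketed (active) token from a sysfs multi-choice file."""
    match = re.search(r"\[([^\]\s]+)\]", contents)
    if match is None:
        raise ValueError(f"no bracketed value in {contents!r}")
    return match.group(1)


def read_thp_enabled(path: str = "/sys/kernel/mm/transparent_hugepage/enabled") -> str:
    """Read the global THP setting; needs a Linux system exposing this file."""
    with open(path) as f:
        return selected_value(f.read())
```

selected_value("always madvise [never]") yields "never", which is the state the
echo command above leaves the file in.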


Yes. One big problem is that the documentation was not updated to change the meaning of
"entirely disabled" to "entirely disabled automatically (page faults, khugepaged)".

So shouldn't 'never break userspace' be based on practical reality rather
than a theorised interpretation of documents that sadly are not clear
enough?

I think the problem is that there might indeed be more users out there relying on
"never+MADV_COLLAPSE" to place THPs than on "never+MADV_COLLAPSE" to not place THPs.

What is the harm when not placing THPs? Performance degradation for some apps?


I think a bigger issue than performance degradation is someone upgrading the kernel, no longer
seeing MADV_COLLAPSE work as it has since the beginning, and not knowing that it's due
to a kernel change.

I feel transparent_hugepage/enabled is too messed up, and it's difficult to fix it without
breaking it for someone? I still find it weird that we can set transparent_hugepage/enabled
to never and transparent_hugepage/hugepages-2048kB/enabled to madvise and still get hugepages.
(And we actually use this configuration in production for our ARM servers).
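
To illustrate the override described above, here is a simplified Python model of
how a per-size setting combines with the global one. It assumes the per-size file
also accepts "inherit" (meaning: defer to the global file), as the kernel's
transhuge documentation describes. The function names are made up for
illustration, and the model only covers page-fault/khugepaged eligibility; it
ignores MADV_COLLAPSE and other checks such as prctl or VM_NOHUGEPAGE.

```python
GLOBAL_VALUES = {"always", "madvise", "never"}
PER_SIZE_VALUES = GLOBAL_VALUES | {"inherit"}


def effective_policy(global_setting: str, per_size_setting: str) -> str:
    """Resolve the policy for one THP size (simplified model, not kernel code)."""
    if global_setting not in GLOBAL_VALUES:
        raise ValueError(f"unknown global setting: {global_setting!r}")
    if per_size_setting not in PER_SIZE_VALUES:
        raise ValueError(f"unknown per-size setting: {per_size_setting!r}")
    # Only "inherit" defers to the global file; anything else overrides it.
    if per_size_setting == "inherit":
        return global_setting
    return per_size_setting


def thp_eligible(global_setting: str, per_size_setting: str, madvised: bool) -> bool:
    """True if this size may be used for a region, given MADV_HUGEPAGE state."""
    policy = effective_policy(global_setting, per_size_setting)
    return policy == "always" or (policy == "madvise" and madvised)
```

With the global file at never and hugepages-2048kB/enabled at madvise,
thp_eligible("never", "madvise", True) is True, which is the configuration
mentioned above.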

Introducing deny for both the global and the page size settings will, I feel, overcomplicate it
because of the issue in the previous paragraph: the page size setting overrides the global setting.
So even if transparent_hugepage/enabled is deny, we might still get a THP if the page size setting
is not. Someone would need to set every file to deny, which is the same as setting every file to never.

So I just wanted to throw another bad idea into the mix: what if we introduce another sysfs file
(I hate introducing sysfs :)), something like /sys/kernel/mm/thp_allowed (or some other name),
which defaults to 1?

Let's rather not :)

--
Cheers,

David / dhildenb