Re: [PATCH v2 0/2] fix MADV_COLLAPSE issue if THP settings are disabled
From: Usama Arif
Date: Fri Jun 13 2025 - 10:39:45 EST
On 13/06/2025 15:29, Lorenzo Stoakes wrote:
> On Fri, Jun 13, 2025 at 03:23:19PM +0100, Usama Arif wrote:
>>
>>
>> On 05/06/2025 09:00, Baolin Wang wrote:
>>> As we discussed in the previous thread [1], the MADV_COLLAPSE will ignore
>>> the system-wide anon/shmem THP sysfs settings, which means that even though
>>> we have disabled the anon/shmem THP configuration, MADV_COLLAPSE will still
>>> attempt to collapse into a anon/shmem THP. This violates the rule we have
>>> agreed upon: never means never. This patch set will address this issue.
>>
>> Hi Baolin,
>>
>> I know never means never, but I also thought that the per-size toggles had
>> priority over the system ones. This was discussed in [1] as well.
>>
>> My understanding with these patches is that if we have:
>>
>> [root@vm4 vmuser]# cat /sys/kernel/mm/transparent_hugepage/enabled
>> always madvise [never]
>> [root@vm4 vmuser]# cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>> always inherit [madvise] never
>>
>> Than without these patches we get a hugepage when we do MADV_HUGEPAGE, but with
>> these we won't get a hugepage anymore eventhough hugepages-2048kB/enabled is set
>> to madvise?
>
> This isn't correct, madvise at a specific pagesize will still be permitted for
> MADV_COLLAPSE.
>
> In current contender for this patch:
>
> /* Strictly mask requested anonymous orders according to sysfs settings. */
> static inline unsigned long __thp_mask_anon_orders(unsigned long vm_flags,
> unsigned long tva_flags, unsigned long orders)
> {
> const unsigned long always = READ_ONCE(huge_anon_orders_always);
> const unsigned long madvise = READ_ONCE(huge_anon_orders_madvise);
> const unsigned long inherit = READ_ONCE(huge_anon_orders_inherit);;
> const unsigned long never = ~(always | madvise | inherit);
>
> Note that madvise is considered here.
>
Ah ok, Thanks for clearing that! I was reviewing the original patch in [1] but I
see this version in the replies.
I wish this function was simpler :) or maybe its me that takes so much time
to figure out if the order will be set or not by the end of the function.
[1] https://lore.kernel.org/all/8eefb0809c598fadaa4a022634fba5689a4f3257.1749109709.git.baolin.wang@xxxxxxxxxxxxxxxxx/
Thanks!
Usama