Re: [PATCH] mm: limit THP alignment – performance gain observed in AI inference workloads
From: Dev Jain
Date: Fri Jun 27 2025 - 23:50:42 EST
On 27/06/25 9:00 pm, Lorenzo Stoakes wrote:
> +cc Vlastimil
> On Fri, Jun 27, 2025 at 04:09:16PM +0530, siddhartha@xxxxxxxx wrote:
>> Hi all,
>>
>> I wanted to share validation data from a Hugging Face-based AI
>> inferencing workload, which was significantly impacted by the THP
>> alignment logic introduced in commit efa7df3e3bb5.
>>
>> Using transformer models with dynamic input lengths on Intel Xeon
>> (Cooper Lake), we observed up to a 3200% throughput improvement after
>> applying the patch from Oct 2024:
>>
>>   mm: limit THP alignment of anonymous mappings to PMD-aligned sizes
> All congratulations are owed to Vlastimil Babka for doing this, cc'd :)
>
> I gather he enjoys novelty beer mugs as tokens of thanks ;)
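
For context, my reading of that patch is that it simply gates the THP
alignment on the mapping length. Below is a minimal userspace model of
the post-patch policy, assuming a 2 MiB PMD as on x86-64 - this is an
illustration of the condition, not the kernel code itself:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PMD_SIZE (2UL << 20)	/* assumed 2 MiB PMD, as on x86-64 */

/*
 * Model of the post-patch policy: THP-align an anonymous mapping only
 * when the caller passed no address hint and the requested length is
 * an exact multiple of PMD_SIZE.
 */
static bool wants_thp_alignment(uintptr_t hint, size_t len)
{
	return hint == 0 && (len % PMD_SIZE) == 0;
}

int main(void)
{
	size_t sizes[] = { 2UL << 20, (2UL << 20) + 4096, 64UL << 20 };

	for (int i = 0; i < 3; i++)
		printf("len=%zu -> PMD-align: %s\n", sizes[i],
		       wants_thp_alignment(0, sizes[i]) ? "yes" : "no");
	return 0;
}

In this model only the first and third sizes get aligned; the
2 MiB + 4 KiB case falls back to the regular unaligned path.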
I was wondering how the change can get us such a big optimization - the
alignment gains us at most one extra PMD-sized THP mapping per VMA. Is
there something else I am missing?

I ask because, when I was reading the code, I was wondering whether a
similar change could be made for mTHPs.
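
To make the placement visible, something along these lines could be
used to watch where the kernel puts anonymous mappings whose length is
just over a PMD. This is a hypothetical probe: it assumes the default
top-down mmap layout, and the cross-mapping pointer arithmetic is for
illustration only:

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define PMD_SIZE (2UL << 20)	/* assumed 2 MiB PMD */

int main(void)
{
	size_t len = PMD_SIZE + 4096;	/* deliberately not a PMD multiple */
	char *prev = NULL;

	for (int i = 0; i < 4; i++) {
		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
		/*
		 * With a top-down layout, each new mapping should sit
		 * just below the previous one unless it was rounded up
		 * to a PMD boundary, which leaves a gap.
		 */
		printf("region %d at %p, PMD-aligned: %s, gap: %ld bytes\n",
		       i, (void *)p,
		       ((uintptr_t)p % PMD_SIZE) ? "no" : "yes",
		       prev ? (long)(prev - (p + len)) : 0L);
		prev = p;
	}
	return 0;
}

With the pre-patch behavior each of these mappings would be rounded up
to a 2 MiB boundary, leaving gaps between them; with the Oct 2024 patch
they should pack back to back.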
>> Metrics:
>> - Model: BERT-base
>> - Inference engine: Transformers + ONNX Runtime
>> - Kernel: 6.6 vs patched 6.6.8
>> - Batch size: 8-32; input length: 64-512 tokens
>> - Metric: inference throughput (samples/sec)
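
As a sanity check when reproducing numbers like these, per-process THP
coverage can be sampled from the AnonHugePages field of
/proc/<pid>/smaps_rollup (available since Linux 4.14). A minimal
sketch, reading the calling process's own rollup:

#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/self/smaps_rollup", "r");
	char line[256];

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* Print only the aggregated THP line, e.g.
	 * "AnonHugePages:   1048576 kB" */
	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, "AnonHugePages:", 14) == 0)
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}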
>> Thanks for the fix -- this change had real impact on a
>> production-relevant workload.
>>
>> Best Regards,
>> Siddhartha Sharma
>> ISV @ Kenip
>> Solution Link: https://www.intel.com/content/www/us/en/partner/showcase/offering/a5bHo00000045YUIAY/deadlock-clearance.html