Re: [PATCH] mm: limit THP alignment – performance gain observed in AI inference workloads

From: Lorenzo Stoakes
Date: Tue Jul 01 2025 - 02:50:30 EST


On Tue, Jul 01, 2025 at 12:00:21PM +0530, Dev Jain wrote:
>
> On 01/07/25 11:23 am, Lorenzo Stoakes wrote:
> > On Tue, Jul 01, 2025 at 11:15:25AM +0530, Dev Jain wrote:
> > > Sorry I am not following, don't know in detail about the VMA merge stuff.
> > > Are you saying that after the patch, the VMAs will eventually get merged?
> > > Is it possible in the kernel to get a merge in the "future"; as I understand
> > > it only happens at mmap() time?
> > >
> > > Suppose before the patch, you have two consecutive VMAs between (PMD, 2*PMD) size.
> > > If they are able to get merged after the patch, why won't they be merged before the patch,
> > > since the VMA characteristics are the same?
> > >
> > >
> > Rik's patch aligned each to 2 MiB boundary. So you'd get gaps:
> >
> >
> > 0             2MB           4MB           6MB           8MB           10MB
> > |-------------.------|      |-------------.------|      |-------------.------|
> > |             .      |      |             .      |      |             .      |
> > |             .      |      |             .      |      |             .      |
> > |-------------.------|      |-------------.------|      |-------------.------|
> >   huge mapped  4k m'd
>
> The effort to draw this is appreciated!
>
> I understood the alignment, what I am asking is this:
>
> In __get_unmapped_area(), we will return a THP-aligned addr from
> thp_get_unmapped_area_vmflags(). Now for the diagram you have
> drawn, suppose that before the patch, we first mmap() the
> 8MB-start chunk. Then we mmap the 4MB start chunk.
> We go to __mmap_region(), and we see that the 8MB-start chunk
> has mergeable characteristics, so we merge. So the gap goes away?

No, because there's a gap; we only merge immediately adjacent VMAs. And obviously
gaps mean page tables wouldn't be adjacent either...

get_unmapped_area() would otherwise have given adjacent mappings. Vlasta's
patch means that in this case we no longer bother trying to align these, because
their _length_ isn't PMD aligned.
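
To make the gap arithmetic concrete, here is a toy userspace model of the two
placement policies. It is only a sketch under loud assumptions: it is not
kernel code, it hands out addresses bottom-up from 0 (the real allocator is
usually top-down), it assumes a 2 MiB PMD, and the names toy_place() and
toy_total_gap() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2 MiB PMD size, as on x86-64 with 4 KiB base pages. */
#define TOY_PMD_SIZE (UINT64_C(2) << 20)

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical toy policy: place a mapping of len bytes at or after *next_free.
 *
 * only_aligned_len == false: align the start of every mapping of at least
 *                            PMD size (models the earlier behaviour).
 * only_aligned_len == true:  align only when len itself is a multiple of
 *                            PMD size (models the limited behaviour).
 */
static uint64_t toy_place(uint64_t *next_free, uint64_t len, bool only_aligned_len)
{
	bool want_align = len >= TOY_PMD_SIZE;

	if (only_aligned_len && (len % TOY_PMD_SIZE) != 0)
		want_align = false;

	uint64_t start = want_align ? align_up(*next_free, TOY_PMD_SIZE) : *next_free;

	*next_free = start + len;
	return start;
}

/* Sum of the holes left between n back-to-back mappings of len bytes each. */
static uint64_t toy_total_gap(unsigned int n, uint64_t len, bool only_aligned_len)
{
	uint64_t next_free = 0, prev_end = 0, gap = 0;

	for (unsigned int i = 0; i < n; i++) {
		uint64_t start = toy_place(&next_free, len, only_aligned_len);

		gap += start - prev_end;	/* non-zero gap => VMAs cannot merge */
		prev_end = start + len;
	}
	return gap;
}
```

In this model, three 3 MiB mappings land at 0, 4 MiB and 8 MiB when every
mapping is aligned, leaving a 1 MiB hole after each of the first two, so none
of them are adjacent and none can merge. With alignment limited to PMD-aligned
lengths, the same three mappings are placed back to back with no holes.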