Re: [PATCH rc 1/4] iommufd/selftest: Fix iommufd_dirty_tracking with large hugepage sizes
From: Jason Gunthorpe
Date: Tue Jun 17 2025 - 08:01:51 EST
On Mon, Jun 16, 2025 at 07:02:08PM -0700, Nicolin Chen wrote:
> On Mon, Jun 16, 2025 at 01:25:01PM -0300, Jason Gunthorpe wrote:
> > On Sun, Jun 15, 2025 at 10:02:03PM -0700, Nicolin Chen wrote:
> > > FIXTURE_TEARDOWN(iommufd_dirty_tracking)
> > > {
> > > - munmap(self->buffer, variant->buffer_size);
> > > - munmap(self->bitmap, DIV_ROUND_UP(self->bitmap_size, BITS_PER_BYTE));
> > > + unsigned long size = variant->buffer_size;
> > > +
> > > + if (variant->hugepages)
> > > + size = __ALIGN_KERNEL(variant->buffer_size, HUGEPAGE_SIZE);
> > > + munmap(self->buffer, size);
> > > + free(self->buffer);
> > > + free(self->bitmap);
> > > teardown_iommufd(self->fd, _metadata);
> >
> > munmap followed by free isn't right..
>
> You are right. I re-checked with Copilot. It says the same thing.
> I think the whole posix_memalign() + mmap() confuses me..
>
> Yet, shouldn't the bitmap be paired with free(), since it's
> allocated by a posix_memalign() call?
>
> > This code is using the glibc allocator to get a bunch of pages mmap'd
> > to an aligned location then replacing the pages with MAP_SHARED and
> > maybe MAP_HUGETLB versions.
>
> And I studied some use cases from Copilot. It says that, to use
> the combination of posix_memalign+mmap, we should do:
> if (posix_memalign(&aligned_ptr, pagesize, pagesize) != 0)
> return -1;
> munmap(aligned_ptr, pagesize);
> mapped = mmap(aligned_ptr, pagesize, PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
> munmap(mapped, pagesize);
> // No free() after munmap().
> ---breakdown---
> Before `posix_memalign()`:
> [ heap memory unused ]
>
> After `posix_memalign()`:
> [ posix_memalign() memory ] ← managed by malloc/free
> ↑ aligned_ptr
>
> After `munmap(aligned_ptr)`:
> [ unmapped memory ] ← allocator no longer owns it
Incorrect. The allocator has no idea about the munmap and munmap
doesn't disturb any of the allocator tracking structures.
> After `mmap(aligned_ptr, ..., MAP_FIXED)`:
> [ anonymous mmap region ] ← fully remapped, under your control
> ↑ mapped
> ---end---
No, this is wrong.
> It points out that the heap bookkeeping will be silently clobbered
> without the munmap() in-between (like we are doing):
Nope, doesn't work like that.
> ---breakdown---
> After `posix_memalign()`:
> [ posix_memalign() memory ] ← malloc thinks it owns this
>
> Then `mmap(aligned_ptr, ..., MAP_FIXED)`:
> [ anonymous mmap region ] ← malloc still thinks it owns this (!)
> ↑ mapped
> ---end---
Yes, this is correct and what we are doing here. The allocator always
owns it and we are just replacing the memory with a different mmap.
> It also gives a simpler solution for memory that is not huge page
> backed but huge page aligned (our !variant->hugepage case):
> ---code---
> void *ptr;
> size_t alignment = 2 * 1024 * 1024; // or whatever HUGEPAGE_SIZE was
> size_t size = variant->buffer_size;
>
> // Step 1: Use posix_memalign to get an aligned pointer
> if (posix_memalign(&ptr, alignment, size) != 0) {
> perror("posix_memalign");
> return -1;
> }
Also no, the main point of this is to inject MAP_SHARED, which
posix_memalign cannot do.
> Also, for the huge page case, there is no need for posix_memalign():
> "Hugepages are not part of the standard heap, so allocator functions
> like posix_memalign() or malloc() don't help and can even get in the
> way."
> Instead, it suggests a cleaner version without posix_memalign():
> ---code---
> void *addr = mmap(NULL, variant->buffer_size, PROT_READ | PROT_WRITE,
> MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB | MAP_POPULATE,
> -1, 0);
> if (addr == MAP_FAILED) { perror("mmap"); return -1; }
> ---end---
Yes, we could do this only for MAP_HUGETLB, but it doesn't help the
normal case with MAP_SHARED.
So I would leave it alone and use the version I showed.
Jason