Re: NUMA API for Linux

From: Rajesh Venkatasubramanian
Date: Thu Apr 08 2004 - 14:22:05 EST




On Thu, 8 Apr 2004, Hugh Dickins wrote:

> On Thu, 8 Apr 2004, Martin J. Bligh wrote:
> > > On Wed, 7 Apr 2004, Andrew Morton wrote:
> > >>
> > >> Your patch takes the CONFIG_NUMA vma from 64 bytes to 68. It would be nice
> > >> to pull those 4 bytes back somehow.
> > >
> > > How significant is this vma size issue?
> > >
> > > anon_vma objrmap will add 20 bytes to each vma (on 32-bit arches):
> > > 8 for prio_tree, 12 for anon_vma linkage in vma,
> > > sometimes another 12 for the anon_vma head itself.
> >
> > Ewwww. Isn't some of that shared most of the time though?
>
> The anon_vma head may well be shared with other vmas of the fork group.
> But the anon_vma linkage is a list_head and a pointer within the vma.
>
> prio_tree is already using a union as much as it can (and a pointer
> where a list_head would simplify the code); Rajesh was thinking of
> reusing vm_private_data for one pointer, but I've gone and used it
> for nonlinear swapout.

I guess using vm_private_data for nonlinear is not a problem because
we use list i_mmap_nonlinear for nonlinear vmas.

As you have found out vm_private_data is only used if vm_file != NULL
or VM_RESERVED or VM_DONTEXPAND is set. I think we can revert back to
i_mmap{_shared} list for such special cases and use prio_tree for
others. I maybe missing something. Please teach me.

If anonmm is merged then I plan to seriously consider removing that
8 extra bytes for prio_tree. If anon_vma is merged, then I can easily
point my finger at 12 more bytes added by anon_vma and be happy :)

I still think removing the 8 extra bytes used by prio_tree from
vm_area_struct is possible.

> > > anonmm objrmap adds just the 8 bytes for prio_tree,
> > > remaining overhead 28 bytes per mm.
> >
> > 28 bytes per *mm* is nothing, and I still think the prio_tree is
> > completely unneccesary. Nobody has ever demonstrated a real benchmark
> > that needs it, as far as I recall.
>
> I'm sure an Ingobench will shortly follow that observation.

Yeap. If Andrew didn't write his rmap-test.c and Ingo didn't write his
test-mmap3.c, I wouldn't even have considered developing prio_tree.

Thanks,
Rajesh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/