Re: RFC: COW for hugepages

From: David Gibson
Date: Thu Apr 08 2004 - 09:23:54 EST


On Thu, Apr 08, 2004 at 01:10:51PM +0200, Zoltan Menyhart wrote:
> David Gibson wrote:
>
> > > Why not just add a flag for a VMA telling if you want / do not want to
> > > copy it on "fork()" ? E.g.:
> > >
> > > dup_mmap():
> > >
> > > for (mpnt = current->mm->mmap ; mpnt ; mpnt = mpnt->vm_next) {
> > >
> > > if (mpnt->vm_flags & VM_HUGETLB_DONT_COPY)
> > > <do nothing>
> > > }
> > >
> >
> > Um.. why would that be useful?
>
> I think there are 2 major cases:
>
> - A big hugepage-using program creates threads to take advantage
> of all of the CPUs => use clone2(...CLONE_VM...), it works today.
> - Another big hugepage-using program calling a little shell function
> with system() => just skip the VMA of the huge page area in
> do_fork():
> copy_process():
> copy_mm():
> dup_mmap()
> The child will have no copy of the huge page area. No problem, it will
> exec() soon, and the stack, the usual data, the malloc()'ed data, etc.
> are not in the huge page area => exec() will work correctly.

> I do not think we need a COW of the huge pages.

Both of these two cases work already. The fork() will copy the huge
PTEs, but that's no big deal since all the actual pages are shared.
COW isn't supposed to help either case - it's for cases where we
really do need MAP_PRIVATE semantics. That's particularly important
for things we might do in the future where large pages are allocated
automatically according to certain heuristics. In this case we
certainly can't make the pages silently have different semantics to
normal anonymous memory.
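
To make "MAP_PRIVATE semantics" concrete, here is a minimal userspace
sketch of the behaviour normal anonymous memory already has across
fork(): a write by the child must not become visible to the parent.
It uses an ordinary page-sized mapping, not hugepages, and the helper
name child_write_is_private() is made up for illustration.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): returns 1 if a write made
 * by a forked child to a MAP_PRIVATE anonymous mapping stays invisible
 * to the parent, 0 if the parent sees it, and -1 on a setup error.
 */
int child_write_is_private(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    buf[0] = 'P';               /* parent's value before fork() */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        buf[0] = 'C';           /* child's write goes to its own copy */
        _exit(buf[0] == 'C' ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(buf, len);
        return -1;
    }

    /* With private semantics the parent still reads its own 'P'. */
    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
             buf[0] == 'P';
    munmap(buf, len);
    return ok ? 1 : 0;
}
```

A caller could check it with assert(child_write_is_private() == 1).
If large pages were ever substituted automatically behind a mapping
like this one, the same check would still have to pass, which is the
case COW is meant to cover.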

--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson