Re: [PATCH v2] mm: per-thread vma caching

From: Davidlohr Bueso
Date: Tue Feb 25 2014 - 14:30:16 EST


On Tue, 2014-02-25 at 11:04 -0800, Davidlohr Bueso wrote:
> On Tue, 2014-02-25 at 10:37 -0800, Linus Torvalds wrote:
> > On Tue, Feb 25, 2014 at 10:16 AM, Davidlohr Bueso <davidlohr@xxxxxx> wrote:
> > > index a17621c..14396bf 100644
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -363,7 +363,12 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
> > >
> > > mm->locked_vm = 0;
> > > mm->mmap = NULL;
> > > - mm->mmap_cache = NULL;
> > > + mm->vmacache_seqnum = oldmm->vmacache_seqnum + 1;
> > > +
> > > + /* deal with overflows */
> > > + if (unlikely(mm->vmacache_seqnum == 0))
> > > + vmacache_invalidate_all();
> >
> > Correct me if I'm wrong, but this can not possibly be correct.
> >
> > vmacache_invalidate_all() walks over all the threads of the current
> > process, but "mm" here is the mm of the *new* process that is getting
> > created, and is unrelated in all ways to the threads of the old
> > process.
>
> vmacache_invalidate_all() is actually a misleading name since we really
> aren't invalidating but just clearing the cache. I'll rename it.
> Anyways...
>
> > So it walks completely the wrong list of threads.
>
> But we still need to deal with the rest of the tasks in the system, so
> anytime there's an overflow we need to nullify all cached vmas, not just
> current's. Am I missing something special about fork?
>
> > In fact, the sequence number of the old vm and the sequence number of
> > the new vm cannot in any way be related.
> >
> > As far as I can tell, the only sane thing to do at fork/clone() time is to:
> >
> > - clear all the cache entries (of the new 'struct task_struct'! - so
> > not in dup_mmap, but make sure it's zeroed when allocating!)
>
> Right, but that's done upon the first lookup, when vmacache_valid() is
> false.
>
> > - set vmcache_seqnum to 0 in dup_mmap (since any sequence number is
> > fine when it got invalidated, and 0 is best for "avoid overflow").
>
> Assuming you're referring to curr->vmacache_seqnum (since mm's is already
> set)... isn't it irrelevant, since we set it anyway when the first lookup
> fails?

Never mind, I see you're referring to the mm seqnum. That sounds like an
interesting alternative to the CONFIG_MMU workaround. I will look into
it.
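
For concreteness, the fork-time suggestion quoted above can be modelled
roughly as follows. This is only a userspace sketch, not kernel code and
not the patch itself: the struct layouts, VMACACHE_SIZE, the slot hash,
model_fork() and mm_invalidate() are all assumptions made up for
illustration. It shows the new mm starting at sequence number 0 and the
child's inherited cache slots being zeroed, so pointers copied from the
parent can never be mistaken for valid entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VMACACHE_SIZE 4	/* assumed size, for illustration only */

struct vma { unsigned long start, end; };

struct mm {
	uint32_t vmacache_seqnum;	/* bumped whenever the mappings change */
};

struct task {
	struct mm *mm;
	uint32_t vmacache_seqnum;	/* seqnum the cache was filled under */
	struct vma *vmacache[VMACACHE_SIZE];
};

/* The cache is valid only if it was filled under the mm's current seqnum. */
static int vmacache_valid(struct task *t)
{
	if (t->vmacache_seqnum != t->mm->vmacache_seqnum) {
		/* Stale: resync and drop every cached pointer, lazily. */
		t->vmacache_seqnum = t->mm->vmacache_seqnum;
		memset(t->vmacache, 0, sizeof(t->vmacache));
		return 0;
	}
	return 1;
}

/* Return the cached vma covering addr, or NULL on a miss. */
static struct vma *vmacache_find(struct task *t, unsigned long addr)
{
	size_t i;

	if (!vmacache_valid(t))
		return NULL;
	for (i = 0; i < VMACACHE_SIZE; i++) {
		struct vma *v = t->vmacache[i];

		if (v && v->start <= addr && addr < v->end)
			return v;
	}
	return NULL;
}

/* Remember v for later lookups; the slot hash is an assumption. */
static void vmacache_update(struct task *t, unsigned long addr, struct vma *v)
{
	vmacache_valid(t);	/* resync first if the cache was stale */
	t->vmacache[(addr >> 12) % VMACACHE_SIZE] = v;
}

/* Any change to the address space invalidates every thread's cache. */
static void mm_invalidate(struct mm *mm)
{
	mm->vmacache_seqnum++;
}

/*
 * Hypothetical fork model: the child task starts as a copy of the parent,
 * so it inherits the parent's cached pointers. The new mm starts at
 * seqnum 0 and the child's slots are zeroed explicitly.
 */
static void model_fork(const struct task *parent, struct task *child,
		       struct mm *new_mm)
{
	assert(parent && child && new_mm);
	*child = *parent;		/* inherits stale cache pointers */
	new_mm->vmacache_seqnum = 0;	/* unrelated to the parent's seqnum */
	child->mm = new_mm;
	child->vmacache_seqnum = 0;
	memset(child->vmacache, 0, sizeof(child->vmacache));
}
```

In this model, dropping the memset() in model_fork() would let a child
whose parent also sits at seqnum 0 treat the parent's vma pointers as
valid, which is why the zeroing has to happen on the new task rather
than by walking the parent's threads.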

Thanks,
Davidlohr
