Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list

From: Chris Wilson
Date: Thu Apr 14 2016 - 09:49:42 EST


On Thu, Apr 14, 2016 at 03:13:26PM +0200, Roman Peniaev wrote:
> Hi, Chris.
>
> Is it made on purpose not to drop VM_LAZY_FREE flag in
> __purge_vmap_area_lazy()? With your patch va->flags
> will have two bits set: VM_LAZY_FREE | VM_LAZY_FREEING.
> Seems it is not that bad, because all other code paths
> do not care, but still the change is not clear.

Oh, that was just a bad deletion.

> Also, did you consider to avoid taking static purge_lock
> in __purge_vmap_area_lazy() ? Because, with your change
> it seems that you can avoid taking this lock at all.
> Just be careful when you observe llist as empty, i.e.
> nr == 0.

I admit I only briefly looked at the lock. I will be honest and say I
do not fully understand the requirements of the sync/force_flush
parameters.

purge_fragmented_blocks() manages per-cpu lists, so that looks safe
under its own rcu_read_lock.

Yes, it looks feasible to remove the purge_lock if we can relax sync.

> > @@ -706,6 +703,8 @@ static void purge_vmap_area_lazy(void)
> > static void free_vmap_area_noflush(struct vmap_area *va)
> > {
> > va->flags |= VM_LAZY_FREE;
> > + llist_add(&va->purge_list, &vmap_purge_list);
> > +
> > atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
>
> it seems to me that this a very long-standing problem: when you mark
> va->flags as VM_LAZY_FREE, va can be immediately freed from another CPU.
> If so, the line:
>
> atomic_add((va->va_end - va->va_start)....
>
> does use-after-free access.
>
> So I would also fix it with careful line reordering with barrier:
> (probably barrier is excess here, because llist_add implies cmpxchg,
> but I simply want to be explicit here, showing that marking va as
> VM_LAZY_FREE and adding it to the list should be at the end)
>
> - va->flags |= VM_LAZY_FREE;
> atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
> + smp_mb__after_atomic();
> + va->flags |= VM_LAZY_FREE;
> + llist_add(&va->purge_list, &vmap_purge_list);
>
> What do you think?

Yup, it is racy. We can drop the modification of LAZY_FREE/LAZY_FREEING
to ease one headache, since those bits are not inspected anywhere afaict.
Would not using atomic_add_return() be even clearer with respect to
ordering:

nr_lazy = atomic_add_return((va->va_end - va->va_start) >> PAGE_SHIFT,
&vmap_lazy_nr);
llist_add(&va->purge_list, &vmap_purge_list);

if (unlikely(nr_lazy > lazy_max_pages()))
try_purge_vmap_area_lazy();

Since it doesn't matter that much if we make an extra call to
try_purge_vmap_area_lazy() when we are on the boundary.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre