Re: [PATCH V2 1/4] vmalloc: introduce vfree_atomic()

From: Nick Piggin
Date: Tue Nov 18 2008 - 07:06:19 EST


On Tuesday 18 November 2008 22:38, Lai Jiangshan wrote:
> Nick Piggin wrote:
> > On Tuesday 18 November 2008 19:51, Lai Jiangshan wrote:
> >> fdtable and sysipc use vfree() in RCU callback. this patch
> >> introduce vfree_atomic() for them.
> >
> > AFAIKS, vfree is usable from atomic context? What am I missing?
>
> Hi, Nick Piggin,
>
> Sorry for misled you.
>
> fdtable and sysipc use vfree() in RCU callback.(_but defer it by
> schedule_work()_) current vfree() is not usable from atomic context, so
> this patches are worthy.

Hi,

I just wonder why vfree is not usable from atomic context? It is well
known that it can't be used in interrupt context, but just atomic
should work?


> even if vfree() is usable from atomic context soon,
> [PATCH 3/4] [PATCH 4/4] are still worthy now. Because these two patches are
> independent from vfree().(just needs to be changed one or two lines
> when vfree() is usable from atomic context)
>
> I suggest we can use vfree_atomic() before vfree() is available
> for atomic context. Because fdtable and sysipc need a grace way for
> using RCU and vmalloc/vfree. (actually, fdtable and sysipc have implemented
> they own "vfree_atomic()", but it's very ugly)

It's probably not a bad idea to consolidate these into one place.
Using vfree directly is not a trivial change, it could cause
regressions.


> Thanx, Lai.
>
> > Actually, one could argue that we don't want to perform such
> > costly operations in the atomic context, however with lazy
> > unmapping, vfree is very cheap now (amortized, at least).
>
> I'm looking forward to vfree() is available for atomic context.

Something like this. Actually, this is something we quite possibly
should be doing anyway, so that the expensive flush path can be
deferred to a less critical context.
Index: linux-2.6/mm/vmalloc.c
===================================================================
--- linux-2.6.orig/mm/vmalloc.c
+++ linux-2.6/mm/vmalloc.c
@@ -531,6 +531,17 @@ static void purge_vmap_area_lazy(void)
__purge_vmap_area_lazy(&start, &end, 0, 0);
}

+static void deferred_purge(struct work_struct *work)
+{
+ purge_vmap_area_lazy();
+}
+
+static struct work_struct purge_work;
+static void kick_purge_vmap_area_lazy(void)
+{
+ schedule_work(&purge_work);
+}
+
/*
* Free and unmap a vmap area
*/
@@ -539,7 +550,7 @@ static void free_unmap_vmap_area(struct
va->flags |= VM_LAZY_FREE;
atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
if (unlikely(atomic_read(&vmap_lazy_nr) > lazy_max_pages()))
- purge_vmap_area_lazy();
+ kick_purge_vmap_area_lazy();
}

static struct vmap_area *find_vmap_area(unsigned long addr)
@@ -938,6 +949,7 @@ void __init vmalloc_init(void)
{
int i;

+ INIT_WORK(&purge_work, deferred_purge);
for_each_possible_cpu(i) {
struct vmap_block_queue *vbq;