Re: Regression from 2.6.36

From: Américo Wang
Date: Thu Apr 07 2011 - 07:21:30 EST


On Thu, Apr 7, 2011 at 6:19 PM, Jiri Slaby <jslaby@xxxxxxx> wrote:
> Cced few people.
>
> Also the series which introduced this were discussed at:
> http://lkml.org/lkml/2010/5/3/53
>

I guess this is because lots of fdts are allocated by kmalloc(),
not vmalloc(), and we kfree() them in the RCU callback.

How about deferring all of the freeing to the workqueue? I think
this may hurt performance, though.
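
To make the idea concrete outside the kernel, here is a minimal userspace
sketch of the same pattern: the "callback" side only links the object onto
a list, and a separate "worker" side does all the freeing. The names
(struct table, struct defer_list, defer_free(), drain()) are made up for
illustration and are not kernel APIs; there is no locking and no per-CPU
data, so this only shows the shape of the deferral, not the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fdtable: only what the sketch needs. */
struct table {
	int *slots;          /* heap array, playing the role of fdt->fd */
	struct table *next;  /* link on the deferred list, like fdt->next */
};

/* Hypothetical stand-in for struct fdtable_defer (no lock, single thread). */
struct defer_list {
	struct table *next;  /* head of the tables waiting to be freed */
	int work_pending;    /* models "schedule_work() has been called" */
};

/* Allocate a table with n zeroed slots; returns NULL on failure. */
struct table *table_alloc(size_t n)
{
	struct table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->slots = calloc(n, sizeof(*t->slots));
	if (!t->slots) {
		free(t);
		return NULL;
	}
	t->next = NULL;
	return t;
}

/* "Callback" side: never frees anything, only queues the table. */
void defer_free(struct defer_list *d, struct table *t)
{
	assert(d && t);
	t->next = d->next;
	d->next = t;
	d->work_pending = 1;
}

/* "Worker" side: detach the whole list, then free each table.
 * Returns the number of tables freed. */
int drain(struct defer_list *d)
{
	struct table *t = d->next;
	int freed = 0;

	d->next = NULL;
	d->work_pending = 0;
	while (t) {
		struct table *next = t->next;

		free(t->slots);
		free(t);
		t = next;
		freed++;
	}
	return freed;
}
```

The cost mentioned above shows up here too: every table takes the extra
hop through the list and waits for the worker, even when it could have
been freed on the spot.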

Anyway, something like the patch below. Does it make sense?

Not-yet-signed-off-by: WANG Cong <xiyou.wangcong@xxxxxxxxx>

---
diff --git a/fs/file.c b/fs/file.c
index 0be3447..34dc355 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -96,20 +96,14 @@ void free_fdtable_rcu(struct rcu_head *rcu)
 				container_of(fdt, struct files_struct, fdtab));
 		return;
 	}
-	if (!is_vmalloc_addr(fdt->fd) && !is_vmalloc_addr(fdt->open_fds)) {
-		kfree(fdt->fd);
-		kfree(fdt->open_fds);
-		kfree(fdt);
-	} else {
-		fddef = &get_cpu_var(fdtable_defer_list);
-		spin_lock(&fddef->lock);
-		fdt->next = fddef->next;
-		fddef->next = fdt;
-		/* vmallocs are handled from the workqueue context */
-		schedule_work(&fddef->wq);
-		spin_unlock(&fddef->lock);
-		put_cpu_var(fdtable_defer_list);
-	}
+
+	fddef = &get_cpu_var(fdtable_defer_list);
+	spin_lock(&fddef->lock);
+	fdt->next = fddef->next;
+	fddef->next = fdt;
+	schedule_work(&fddef->wq);
+	spin_unlock(&fddef->lock);
+	put_cpu_var(fdtable_defer_list);
 }