Re: 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849

From: Christoph Lameter
Date: Tue Sep 20 2005 - 00:06:33 EST


On Mon, 19 Sep 2005, Andrew Morton wrote:

> list_for_each(walk, &cache_chain) {
> kmem_cache_t *searchp;
> struct list_head* p;
> int tofree;
> struct slab *slabp;
>
> searchp = list_entry(walk, kmem_cache_t, next);
>
> if (searchp->flags & SLAB_NO_REAP)
> goto next;
>
> check_irq_on();
>
> l3 = searchp->nodelists[numa_node_id()];
> if (l3->alien)
> drain_alien_cache(searchp, l3);
> ->preempt here
> spin_lock_irq(&l3->list_lock);
>
> drain_array_locked(searchp, ac_data(searchp), 0,
> numa_node_id());
> ->oops, wrong node.

This is called from keventd which exists per processor. Hmmm... This looks
as if it can change processors after all but the slab allocator depends on
it running on the right processor. So does the page allocator. sigh. What
is the point of having per processor workqueues if they do not stay on
the assigned processor?

The fast fix for this case is to get the node number once and then use it
consistently. But we really need to audit the slab and page allocator for
additional cases like this or disable preempt and check for the right
processor in cache_reap().

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c 2005-09-19 14:10:33.489800899 -0700
+++ linux-2.6/mm/slab.c 2005-09-19 14:10:44.555105862 -0700
@@ -3262,6 +3262,7 @@
{
struct list_head *walk;
struct kmem_list3 *l3;
+ int node = numa_node_id();

if (down_trylock(&cache_chain_sem)) {
/* Give up. Setup the next iteration. */
@@ -3282,13 +3283,13 @@

check_irq_on();

- l3 = searchp->nodelists[numa_node_id()];
+ l3 = searchp->nodelists[node];
if (l3->alien)
drain_alien_cache(searchp, l3);
spin_lock_irq(&l3->list_lock);

drain_array_locked(searchp, ac_data(searchp), 0,
- numa_node_id());
+ node);

if (time_after(l3->next_reap, jiffies))
goto next_unlock;
@@ -3297,7 +3298,7 @@

if (l3->shared)
drain_array_locked(searchp, l3->shared, 0,
- numa_node_id());
+ node);

if (l3->free_touched) {
l3->free_touched = 0;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/