Re: 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849

From: Andrew Morton
Date: Tue Sep 20 2005 - 00:17:48 EST


Christoph Lameter <clameter@xxxxxxxxxxxx> wrote:
>
> On Mon, 19 Sep 2005, Andrew Morton wrote:
>
> >	list_for_each(walk, &cache_chain) {
> >		kmem_cache_t *searchp;
> >		struct list_head* p;
> >		int tofree;
> >		struct slab *slabp;
> >
> >		searchp = list_entry(walk, kmem_cache_t, next);
> >
> >		if (searchp->flags & SLAB_NO_REAP)
> >			goto next;
> >
> >		check_irq_on();
> >
> >		l3 = searchp->nodelists[numa_node_id()];
> >		if (l3->alien)
> >			drain_alien_cache(searchp, l3);
> > ->preempt here
> >		spin_lock_irq(&l3->list_lock);
> >
> >		drain_array_locked(searchp, ac_data(searchp), 0,
> >				numa_node_id());
> > ->oops, wrong node.
>
> This is called from keventd, which exists per processor. Hmmm... This
> looks as if it can change processors after all

Well no, it would be a big bug if a keventd thread were to change CPUs.

It's OK to rely upon the pinnedness of keventd, I guess - a comment would be
nice.
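
For reference, the reason keventd threads can't migrate: each per-CPU
workqueue thread is bound to its CPU before it ever runs.  A condensed
sketch of the 2.6-era __create_workqueue() loop in kernel/workqueue.c
(variable names approximate):

	for_each_online_cpu(cpu) {
		/* create the "events/N"-style thread for this CPU */
		p = kthread_create(worker_thread, cwq, "%s/%d", wq->name, cpu);
		if (!IS_ERR(p)) {
			kthread_bind(p, cpu);	/* pinned: migrating would be a bug */
			wake_up_process(p);
		}
	}

kthread_bind() restricts the thread's allowed-CPU mask to that single CPU,
which is exactly the pinnedness being relied upon here.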

> but the slab allocator depends on
> it running on the right processor. So does the page allocator. sigh. What
> is the point of having per processor workqueues if they do not stay on
> the assigned processor?

They do. I don't believe that preemption is the source of this BUG.
(Petr, does CONFIG_PREEMPT=n fix it?)

> The fast fix for this case is to get the node number once and then use it
> consistently.

If one is writing preempt-safe code, then one should disable preemption
before copying the current CPU number into a local variable.
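
A minimal sketch of that fix against the excerpt above (illustrative only,
not a merged patch): take the node id once, under preempt-off, and use the
cached value throughout, so a migration can no longer mix per-node data:

	int node;

	preempt_disable();
	node = numa_node_id();		/* stable: preemption is off */
	l3 = searchp->nodelists[node];
	if (l3->alien)
		drain_alien_cache(searchp, l3);
	spin_lock_irq(&l3->list_lock);	/* irqs off also keeps us here */
	drain_array_locked(searchp, ac_data(searchp), 0, node);
	/* ... rest of the reap work for this cache ... */
	spin_unlock_irq(&l3->list_lock);
	preempt_enable();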

> But we really need to audit the slab and page allocator for
> additional cases like this or disable preempt and check for the right
> processor in cache_reap().

numa_node_id() must use smp_processor_id(), not raw_smp_processor_id().
Then all the runtime squawks need to be audited and fixed, or switched to
(new) raw_numa_node_id() if it is verified that a CPU/node switch at any
time is OK.
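
Concretely, something along these lines in the topology headers (a sketch;
raw_numa_node_id() is the proposed new name and doesn't exist yet):

	/*
	 * numa_node_id() now inherits the CONFIG_DEBUG_PREEMPT check in
	 * smp_processor_id() and squawks when called preemptibly; callers
	 * verified to tolerate a CPU/node switch use the raw variant.
	 */
	#define numa_node_id()		cpu_to_node(smp_processor_id())
	#define raw_numa_node_id()	cpu_to_node(raw_smp_processor_id())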

