Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-6

From: Steven Rostedt
Date: Wed Dec 08 2004 - 14:05:14 EST


On Wed, 2004-12-08 at 18:14 +0000, Rui Nuno Capela wrote:
> Steven Rostedt wrote:
> >
> > I found a race condition in slab.c, but I'm still trying to figure out
> > exactly how it's playing out. This has to do with dynamic loading and
> > unloading of caches. I have a small test case that simulates the problem
> > at http://home.stny.rr.com/rostedt/tests/sillycaches.tgz
> >
> > This was done on:
> >
> > # uname -r
> > 2.6.10-rc2-mm3-V0.7.32-9
> >

<snip>


Found the culprit!!! I did a diff of 2.6.10-rc2-mm3 to
2.6.10-rc2-mm3-V0.7.32-9 and found this in slab.c:
----------------------------
+#ifndef CONFIG_PREEMPT_RT
+/*
+ * Executes in an IRQ context:
+ */
 static void do_drain(void *arg)
 {
 	kmem_cache_t *cachep = (kmem_cache_t*)arg;
 	struct array_cache *ac;
+	int cpu = smp_processor_id();
 	check_irq_off();
-	ac = ac_data(cachep);
+	ac = ac_data(cachep, cpu);
 	spin_lock(&cachep->spinlock);
 	free_block(cachep, &ac_entry(ac)[0], ac->avail);
 	spin_unlock(&cachep->spinlock);
 	ac->avail = 0;
 }
+#endif

 static void drain_cpu_caches(kmem_cache_t *cachep)
 {
+#ifndef CONFIG_PREEMPT_RT
 	smp_call_function_all_cpus(do_drain, cachep);
+#endif
 	check_irq_on();

--------------------------------
(I have CONFIG_PREEMPT_RT defined :-) so do_drain() is compiled out and
drain_cpu_caches() no longer flushes the per-CPU array caches.)

I then put in

 static void drain_cpu_caches(kmem_cache_t *cachep)
 {
 #ifndef CONFIG_PREEMPT_RT
 	smp_call_function_all_cpus(do_drain, cachep);
 #endif
 	check_irq_on();
 	spin_lock_irq(&cachep->spinlock);
+	{
+		struct array_cache *ac;
+		ac = ac_data(cachep, smp_processor_id());
+		free_block(cachep, &ac_entry(ac)[0], ac->avail);
+		ac->avail = 0;
+	}

I did this to see what would happen, and it did indeed fix the problem. At
least, the problem did not reappear after a few tests.

Obviously, this is not the right answer, and Ingo, since I don't know
exactly what you are accomplishing with the added cpu changes, I think
you are probably better at writing a patch than I.
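
To illustrate why draining only the current CPU is not enough on SMP, here
is a minimal userspace sketch. It is a toy model, not the slab.c code: the
toy_* names, NR_CPUS and AC_LIMIT are all hypothetical, and "freeing to the
slab lists" is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS  2   /* hypothetical CPU count for the toy model */
#define AC_LIMIT 8   /* hypothetical per-CPU cache capacity */

/* Toy stand-in for struct array_cache: a per-CPU stack of free objects. */
struct toy_array_cache {
	unsigned int avail;
	void *entry[AC_LIMIT];
};

/* Toy stand-in for kmem_cache_t. */
struct toy_cache {
	struct toy_array_cache cpu[NR_CPUS];
	unsigned int freed_to_slabs;	/* objects handed back to the slab lists */
};

/* Park a freed object in the given CPU's array cache. */
static void toy_free(struct toy_cache *c, int cpu, void *obj)
{
	struct toy_array_cache *ac;

	assert(cpu >= 0 && cpu < NR_CPUS);
	ac = &c->cpu[cpu];
	assert(ac->avail < AC_LIMIT);
	ac->entry[ac->avail++] = obj;
}

/* Models the test hack: flush only the CPU we happen to be running on. */
static void toy_drain_one(struct toy_cache *c, int cpu)
{
	struct toy_array_cache *ac = &c->cpu[cpu];

	c->freed_to_slabs += ac->avail;
	ac->avail = 0;
}

/* Models the intent of the non-RT code: flush every CPU's array. */
static void toy_drain_all(struct toy_cache *c)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		toy_drain_one(c, cpu);
}

/* Count objects still parked in any per-CPU array. */
static unsigned int toy_stranded(const struct toy_cache *c)
{
	unsigned int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		n += c->cpu[cpu].avail;
	return n;
}
```

In this model, if objects are parked on both CPUs and only CPU 0 is
drained, toy_stranded() still reports the objects left on CPU 1; only
toy_drain_all() brings it to zero. That would explain why the hack hides the
problem on a UP box but cannot be right on SMP.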

Which brings up another point.

In places like kmem_cache_create you have cpu = _smp_processor_id(); and
way down near the bottom, you use it. Since this is a preemptible kernel,
can't the task migrate to another CPU in the meantime? Isn't that in itself
a race condition?
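
The concern can be sketched with a toy, single-threaded userspace model.
Nothing here is kernel code: toy_current_cpu, toy_try_migrate() and the pin
counter are hypothetical stand-ins for the scheduler, and the "pinned"
variant only mimics the idea behind the mainline get_cpu()/put_cpu() idiom.
Whether that idiom is appropriate in the -RT tree is a separate question.

```c
#include <assert.h>

#define NR_CPUS 2			/* hypothetical CPU count */

static int toy_current_cpu;		/* CPU the toy "task" is running on */
static int toy_pin_depth;		/* >0 means migration is blocked */
static int toy_counter[NR_CPUS];	/* some per-CPU data */

/* Models the scheduler moving the task; has no effect while pinned. */
static void toy_try_migrate(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (toy_pin_depth == 0)
		toy_current_cpu = cpu;
}

/*
 * Racy: sample the CPU id early, get "preempted", use the id late.
 * Returns 1 if the update landed on the CPU we are now running on.
 */
static int toy_update_racy(int migrate_to)
{
	int cpu = toy_current_cpu;	/* sampled early */

	toy_try_migrate(migrate_to);	/* preemption point */
	toy_counter[cpu]++;		/* used late, possibly stale */
	return cpu == toy_current_cpu;
}

/* Pinned: block migration between sampling the id and using it. */
static int toy_update_pinned(int migrate_to)
{
	int cpu, ok;

	toy_pin_depth++;
	cpu = toy_current_cpu;
	toy_try_migrate(migrate_to);	/* ignored while pinned */
	toy_counter[cpu]++;
	ok = (cpu == toy_current_cpu);
	toy_pin_depth--;
	return ok;
}
```

In the racy variant the counter for the old CPU is updated while the task
is already running on the new one; in the pinned variant the sampled id
stays valid for the whole window.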

Thanks,

-- Steve

Rui,

Try adding the following in slab.c

--- slab.c 2004-12-08 09:27:10.000000000 -0500
+++ slab.c.new 2004-12-08 13:58:40.000000000 -0500
@@ -1533,6 +1533,12 @@
 #ifndef CONFIG_PREEMPT_RT
 	smp_call_function_all_cpus(do_drain, cachep);
 #endif
+	{
+		struct array_cache *ac;
+		ac = ac_data(cachep, smp_processor_id());
+		free_block(cachep, &ac_entry(ac)[0], ac->avail);
+		ac->avail = 0;
+	}
 	check_irq_on();
 	spin_lock_irq(&cachep->spinlock);
 	if (cachep->lists.shared)


and see if this fixes your usb problems. I would say that this is not a
proper fix, especially on an SMP system, since it only drains the current
CPU's array cache. But if it fixes your problem, then you know the missing
drain is the cause.
