Re: per-cpu operation madness vs validation

From: Thomas Gleixner
Date: Wed Jul 27 2011 - 18:18:01 EST


On Wed, 27 Jul 2011, Christoph Lameter wrote:
> On Wed, 27 Jul 2011, Thomas Gleixner wrote:
>
> > > The key issue is that the -rt kernel has always had grave issues with
> > > performance when it comes to per cpu data access. Solving that by forcing
> > > the kernel to go slow is not the right approach.
> >
> > Nobody wants the kernel to go slow. All we want, and we consider that
> > also a benefit for mainline, is: proper annotation of the per cpu data
> > access, like we have for RCU and for locking.
>
> I love that stuff.
>
> > For -rt this lack of documentation and the lack of verification,
> > debuggability and traceability is a major PITA, but that's true for
> > non-rt as well, just the PITA is gradually smaller and the bugs which
> > are there today are just extremely hard to trigger.
>
> Right. Sure wish there would be better checks. Or things would not have so
> many flavors.
>
> > And Peter's idea of per_cpu_lock*() annotations will boil down to the
> > exact same thing which is there today when you compile the kernel w/o
> > lockdep enabled for per_cpu data correctness. We don't want to change
> > anything or impose any slowness, we just want a proper way to document
> > and verify that maze. That's really not too much of a request.
>
> No problem with that. The per cpu atomic ops are made to mostly stand on
> their own. However, the correctness is affected by placement in the per cpu
> sections that may start with get_cpu() or a preempt_disable().
>
> The reason that I switched from get_cpu() to preempt_disable() in some
> functions was because the this_cpu operations eliminated the need to pass
> the cpu number to a function. Thus the cpu variable is not needed anymore.
> preempt_disable() doesn't provide it and so I thought that was the proper
> function to use.

Yes, you had no other choice, but that does not make them more obvious
in terms of debuggability and verification.

We just have to understand that preempt_disable, bh_disable and
irq_disable are cpu local BKLs with very subtle semantics which are
pretty close to the original BKL horror. All these mechanisms are in
fact per cpu locks, and we need to get some annotation in place which
will help understandability and debuggability in the first place. The
side effect that it will help RT to deal with that is, of course,
desired from our side, but it is not the primary goal of that exercise.
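
To make the annotation idea concrete, here is a minimal userspace sketch
of a named per cpu lock. It is a model only and not kernel code: every
identifier in it (struct pcpu_lock, pcpu_lock(), pcpu_unlock(),
pcpu_assert_held(), PCPU_VERIFY, the model_preempt_*() helpers and the
stats example) is a hypothetical name made up for illustration, and a
single global counter stands in for one CPU's data. It assumes the
annotation wraps preempt_disable() and only adds bookkeeping when
verification is compiled in.

```c
#include <stdio.h>

/* Userspace model only: none of these names are real kernel APIs. */

#ifndef PCPU_VERIFY
#define PCPU_VERIFY 1              /* stands in for a lockdep-like debug option */
#endif

static int preempt_count;          /* models one CPU's preempt counter */
static int pcpu_violations;        /* accesses made without the named lock */

static void model_preempt_disable(void) { preempt_count++; }
static void model_preempt_enable(void)  { preempt_count--; }

/* Hypothetical named per cpu lock: documents what the section protects. */
struct pcpu_lock {
    const char *name;
    int held;
};

static void pcpu_lock(struct pcpu_lock *l)
{
    model_preempt_disable();       /* all that remains without verification */
#if PCPU_VERIFY
    l->held++;
#else
    (void)l;
#endif
}

static void pcpu_unlock(struct pcpu_lock *l)
{
#if PCPU_VERIFY
    l->held--;
#else
    (void)l;
#endif
    model_preempt_enable();
}

/* Records a violation if the data is touched outside its annotated section. */
static void pcpu_assert_held(const struct pcpu_lock *l)
{
#if PCPU_VERIFY
    if (l->held <= 0) {
        pcpu_violations++;
        fprintf(stderr, "per cpu access without lock '%s'\n", l->name);
    }
#else
    (void)l;
#endif
}

static struct pcpu_lock stats_lock = { "stats", 0 };
static long stats_counter;         /* stands in for one CPU's instance */

/* Annotated: the section names the data it protects. */
static void stats_inc_annotated(void)
{
    pcpu_lock(&stats_lock);
    pcpu_assert_held(&stats_lock);
    stats_counter++;
    pcpu_unlock(&stats_lock);
}

/* Bare preempt disable: just as safe at runtime, but the checker cannot
 * tell what the section was meant to protect, so it flags the access. */
static void stats_inc_unannotated(void)
{
    model_preempt_disable();
    pcpu_assert_held(&stats_lock);
    stats_counter++;
    model_preempt_enable();
}
```

Calling stats_inc_annotated() leaves pcpu_violations untouched, while
stats_inc_unannotated() bumps it by one even though both increment the
counter under a disabled preempt count. Building with -DPCPU_VERIFY=0
reduces pcpu_lock()/pcpu_unlock() to the bare preempt counter updates,
which is the "no added slowness" property discussed above.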

Thanks,

tglx