Re: [uclinux-dist-devel] freezer: should barriers be smp ?

From: Mike Frysinger
Date: Wed Apr 13 2011 - 18:22:53 EST


On Wed, Apr 13, 2011 at 17:53, Rafael J. Wysocki wrote:
> On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
>> > On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
>> >> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
>> >> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> >> >> when we suspend/resume Blackfin SMP systems, we notice that the
>> >> >> freezer code runs on multiple cores.  this is of course what you want
>> >> >> -- freeze processes in parallel.  however, the code only uses non-smp
>> >> >> based barriers which causes us problems ... our cores need software
>> >> >> support to keep caches in sync, so our smp barriers do just that.  but
>> >> >> the non-smp barriers do not, and so the frozen/thawed processes
>> >> >> randomly get stuck in the wrong task state.
>> >> >>
>> >> >> thinking about it, shouldnt the freezer code be using smp barriers ?
>> >> >
>> >> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
>> >> >
>> >> > Or do you mean something different?
>> >>
>> >> then what's the diff between smp_rmb() and rmb() ?
>> >>
>> >> this is what i'm proposing:
>> >> --- a/kernel/freezer.c
>> >> +++ b/kernel/freezer.c
>> >> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
>> >>  {
>> >>      if (!unlikely(current->flags & PF_NOFREEZE)) {
>> >>          current->flags |= PF_FROZEN;
>> >> -        wmb();
>> >> +        smp_wmb();
>> >>      }
>> >>      clear_freeze_flag(current);
>> >>  }
>> >> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
>> >>       * the task as frozen and next clears its TIF_FREEZE.
>> >>       */
>> >>      if (!freezing(p)) {
>> >> -        rmb();
>> >> +        smp_rmb();
>> >>          if (frozen(p))
>> >>              return false;
>> >
>> > smp_rmb() is NOP on uniprocessor.
>> >
>> > I believe the code is correct as is.
>>
>> that isnt what the code / documentation says.  unless i'm reading them
>> wrong, both seem to indicate that the proposed patch is what we
>> actually want.
>
> Not really.
>
>> include/linux/compiler-gcc.h:
>> #define barrier() __asm__ __volatile__("": : :"memory")
>>
>> include/asm-generic/system.h:
>> #define mb()    asm volatile ("": : :"memory")
>> #define rmb()   mb()
>> #define wmb()   asm volatile ("": : :"memory")
>>
>> #ifdef CONFIG_SMP
>> #define smp_mb()    mb()
>> #define smp_rmb()   rmb()
>> #define smp_wmb()   wmb()
>> #else
>> #define smp_mb()    barrier()
>> #define smp_rmb()   barrier()
>> #define smp_wmb()   barrier()
>> #endif
>
> The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> which basically means that *mb() are more restrictive than the corresponding
> smp_*mb().  More precisely, they also cover the cases in which the CPU
> reorders instructions on uniprocessor, which we definitely want to cover.
>
> IOW, your patch would break things on uniprocessor where the CPU reorders
> instructions.
>
>> Documentation/memory-barriers.txt:
>> SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
>> systems because it is assumed that a CPU will appear to be self-consistent,
>> and will order overlapping accesses correctly with respect to itself.
>
> Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> CPUs can reorder instructions in such a way that a compiler barrier is not
> sufficient to prevent breakage.
>
> The code _may_ be wrong for a different reason, though.  I need to check.

so the current code is protecting against a UP system swapping in/out
freezer threads for processes, and the barriers are to make sure that
the updated flags variable is posted by the time another swapped in
thread gets to that point.

i guess the trouble for us is that you have one CPU posting writes to
task->flags (and doing so by grabbing the task's spinlock), but the
other CPU is simply reading those flags. there are no SMP barriers in
between the read and write steps, nor is the reading CPU grabbing any
locks which would be an implicit SMP barrier. since the Blackfin SMP
port lacks hardware cache coherency, there is no way for us to know
"we've got to sync the caches before we can do this read". by using
the patch i posted above, we have that signal and so things work
correctly.
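
for illustration only, here is a rough userspace sketch of the
writer/reader pairing described above, written with C11 atomics rather
than the kernel's barrier macros.  everything in it is an assumption
made for the example: struct task_sketch, mark_frozen(),
already_frozen() and the PF_FROZEN value are hypothetical stand-ins,
and the release/acquire fences are only approximate analogues of
smp_wmb()/smp_rmb().  C11 fences also assume coherent caches, so this
shows the ordering contract, not the Blackfin cache sync itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PF_FROZEN 0x1u /* illustrative value, not the kernel's */

/* hypothetical stand-in for the two pieces of task state involved */
struct task_sketch {
	atomic_uint flags;    /* plays the role of task->flags */
	atomic_bool freezing; /* plays the role of TIF_FREEZE */
};

/* writer side, loosely modelled on frozen_process() */
static void mark_frozen(struct task_sketch *t)
{
	assert(t != NULL);
	atomic_fetch_or_explicit(&t->flags, PF_FROZEN, memory_order_relaxed);
	/* approximate analogue of smp_wmb(): publish flags before clearing */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&t->freezing, false, memory_order_relaxed);
}

/* reader side, loosely modelled on the check in freeze_task() */
static bool already_frozen(struct task_sketch *t)
{
	assert(t != NULL);
	if (!atomic_load_explicit(&t->freezing, memory_order_relaxed)) {
		/* approximate analogue of smp_rmb(): order the two loads */
		atomic_thread_fence(memory_order_acquire);
		return (atomic_load_explicit(&t->flags, memory_order_relaxed)
			& PF_FROZEN) != 0;
	}
	return false;
}
```

the point of the pairing is that a reader which sees the freeze flag
cleared is then guaranteed to see the PF_FROZEN write that preceded it;
with only one side fenced there is no such guarantee.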
-mike