Semantics of smp_mb() [was : Re: [PATCH] Fix RCU race in access of nohz_cpu_mask ]

From: Srivatsa Vaddagiri
Date: Sun Dec 11 2005 - 12:47:14 EST

Next message: Randy.Dunlap: "Re: [BUG] Early Kernel Panic with 2.6.15-rc5"
Previous message: Ed Tomlinson: "Re: [patch -mm] fix SLOB on x64"
In reply to: Oleg Nesterov: "Re: [PATCH] Fix RCU race in access of nohz_cpu_mask"
Next in thread: Andrew James Wade: "Re: Semantics of smp_mb() [was : Re: [PATCH] Fix RCU race in access of nohz_cpu_mask ]"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

[Changed the subject line to be more generic in the interest of wider audience]

We seem to be having some confusion over the exact semantics of smp_mb().

Specifically, are all stores preceding smp_mb() guaranteed to have finished
(committed to memory/corresponding cache-lines on other CPUs invalidated)
*before* successive loads are issued?

>From IA-32 manual (download.intel.com/design/Pentium4/manuals/25366817.pdf
Page 271):

"Memory mapped devices and other I/O devices on the bus are often sensitive to
the order of writes to their I/O buffers. I/O instructions can be used to (the
IN and OUT instructions) impose strong write ordering on such accesses as
follows. Prior to executing an I/O instruction, the processor waits for all
___________________________
previous instructions in the program to complete and for all buffered
_____________________________________________________________________
writes to drain to memory. Only instruction fetch and page tables walks can
_________________________
pass I/O instructions. Execution of subsequent instructions do not begin until
the processor determines that the I/O instruction has been completed."

Synchronization mechanisms in multiple-processor systems may depend upon a
strong memory-ordering model. Here, a program can use a locking instruction
such as the XCHG instruction or the LOCK prefix to insure that a
read-modify-write operation on memory is carried out atomically. Locking
operations typically operate like I/O operations in that they wait for all
_________________
previous instructions to complete and for all buffered writes to drain to
_________________________________________________________________________
memory (see Section 7.1.2, âBus Lockingâ).
______

Program synchronization can also be carried out with serializing instructions
(see Section 7.4). These instructions are typically used at critical procedure
or task boundaries to force completion of all previous instructions before a
jump to a new section of code or a context switch occurs. Like the I/O and
locking instructions, the processor waits until all previous instructions have
________________________________________________________
been completed and all buffered writes have been drained to memory before
_________________________________________________________________________
executing the serializing instruction."
_____________________________________

>From this, looks like we can interpret that IA-32 processor will wait for all
writes to drain to memory (implies cache invalidation on other CPUs?) *before*
executing the synchronizing instruction?

Question is can this be generalized for other CPUs too?

On Sat, Dec 10, 2005 at 09:55:35PM +0300, Oleg Nesterov wrote:
> Srivatsa Vaddagiri wrote:
> >
> > On Fri, Dec 09, 2005 at 10:17:38PM +0300, Oleg Nesterov wrote:
> > > > rcp->cur++; /* New GP */
> > > >
> > > > smp_mb();
> > >
> > > I think I need some education on memory barriers.
> > >
> > > Does this mb() garantees that the new value of ->cur will be visible
> > > on other cpus immediately after smp_mb() (so that rcu_pending() will
> > > notice it) ?
> >
> > AFAIK the new value of ->cur should be visible to other CPUs when smp_mb()
> > returns. Here's a definition of smp_mb() from Paul's article:
> >
> > smp_mb(): "memory barrier" that orders both loads and stores. This means loads
> > and stores preceding the memory barrier are committed to memory before any
> > loads and stores following the memory barrier.
>
> Thanks, but this definition talks about ordering, it does not say
> anything about the time when stores are really commited to memory
> (and does it mean also that cache-lines are invalidated on other
> cpus ?).
>
> > [ http://www.linuxjournal.com/article/8211 ]
>
> And thanks for this link, now I have some understanding about
> read_barrier_depends() ...
>
> Oleg.

--

Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Randy.Dunlap: "Re: [BUG] Early Kernel Panic with 2.6.15-rc5"
Previous message: Ed Tomlinson: "Re: [patch -mm] fix SLOB on x64"
In reply to: Oleg Nesterov: "Re: [PATCH] Fix RCU race in access of nohz_cpu_mask"
Next in thread: Andrew James Wade: "Re: Semantics of smp_mb() [was : Re: [PATCH] Fix RCU race in access of nohz_cpu_mask ]"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]