Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier

From: Mathieu Desnoyers
Date: Thu Jan 07 2010 - 12:37:45 EST


* Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:
> On Thu, 2010-01-07 at 08:52 -0800, Paul E. McKenney wrote:
> > On Thu, Jan 07, 2010 at 09:44:15AM +0100, Peter Zijlstra wrote:
> > > On Wed, 2010-01-06 at 22:35 -0800, Josh Triplett wrote:
> > > >
> > > > The number of threads doesn't matter nearly as much as the number of
> > > > threads typically running at a time compared to the number of
> > > > processors. Of course, we can't measure that as easily, but I don't
> > > > know that your proposed heuristic would approximate it well.
> > >
> > > Quite agreed, and not disturbing RT tasks is even more important.
> >
> > OK, so I stand un-Reviewed-by twice in one morning. ;-)
> >
> > > A simple:
> > >
> > > 	for_each_cpu(cpu, current->mm->cpu_vm_mask) {
> > > 		if (cpu_curr(cpu)->mm == current->mm)
> > > 			smp_call_function_single(cpu, func, NULL, 1);
> > > 	}
> > >
> > > seems far preferable over anything else, if you really want you can use
> > > a cpumask to copy cpu_vm_mask in and unset bits and use the mask with
> > > smp_call_function_any(), but that includes having to allocate the
> > > cpumask, which might or might not be too expensive for Mathieu.
> >
> > This would be vulnerable to the sys_membarrier() CPU seeing an old value
> > of cpu_curr(cpu)->mm, and that other task seeing the old value of the
> > pointer we are trying to RCU-destroy, right?
>
> Right, so I was thinking that a mb is executed when calling
> sys_membarrier(). If you observe a matching ->mm but the cpu has since
> scheduled, we're good since it scheduled (though we'll still send the
> IPI anyway); if we do not observe it because the task gets scheduled in
> after we do the iteration, we're still good because it scheduled.

This deals with the case where the remote thread is being scheduled out.

As I understand it, if the thread is being scheduled in exactly while we
read cpu_curr(cpu)->mm, this means that we are executing concurrently
with the scheduler code. I expect that the scheduler will issue an
smp_mb() before handing the CPU to the incoming thread, am I correct?
If so, then reading any value of cpu_curr(cpu)->mm that does not match
that of our own process means the value is either:

- corresponding to a thread belonging to another process (no IPI
  needed).
- corresponding to any thread being scheduled in/out -> no IPI needed,
  but it does not hurt to send one. In this case, it does not matter if
  the racy read even returns pure garbage, as it's really a "don't
  care".

Even if the read were to return garbage for some weird reason (a
piecewise read, maybe?), that does not hurt, because the IPI is simply
not needed at that point: we rely on the concurrently executing
scheduler to issue the smp_mb().

>
> As to needing to keep rcu_read_lock() around the iteration, for sure we
> need that to ensure the remote task_struct reference we take is valid.
>

Indeed,

Thanks,

Mathieu


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68