Re: [PATCH tip/core/rcu 4/5] sys_membarrier: Add expedited option

From: Paul E. McKenney
Date: Tue Jul 25 2017 - 13:17:14 EST


On Tue, Jul 25, 2017 at 06:59:57PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 25, 2017 at 09:49:00AM -0700, Paul E. McKenney wrote:
> > On Tue, Jul 25, 2017 at 06:33:18PM +0200, Peter Zijlstra wrote:
> > > On Mon, Jul 24, 2017 at 02:58:16PM -0700, Paul E. McKenney wrote:
> > > > The sys_membarrier() system call has proven too slow for some use
> > > > cases, which has prompted users to instead rely on TLB shootdown.
> > > > Although TLB shootdown is much faster, it has the slight disadvantage
> > > > of not working at all on arm and arm64. This commit therefore adds
> > > > an expedited option to the sys_membarrier() system call.
> > >
> > > > @@ -64,6 +65,10 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> > > >  		if (num_online_cpus() > 1)
> > > >  			synchronize_sched();
> > > >  		return 0;
> > > > +	case MEMBARRIER_CMD_SHARED_EXPEDITED:
> > > > +		if (num_online_cpus() > 1)
> > > > +			synchronize_sched_expedited();
> > > > +		return 0;
> > >
> > > So you now give unprivileged userspace the means to IPI the entire
> > > machine?
> > >
> > > So what do we do when someone goes and does:
> > >
> > > 	for (;;)
> > > 		sys_membarrier(MEMBARRIER_CMD_SHARED_EXPEDITED, 0);
> > >
> > > on us?
> >
> > The same thing that happens when they call munmap().
>
> munmap() TLB invalidate is limited to those CPUs that actually ran
> threads of their process, while this is machine wide.

Or those CPUs running threads of any process mapping the underlying file
or whatever. And in either case, this can span the whole machine. Plus
there are a number of other ways for users to do on-demand full-system
IPIs, including any number of ways to wake up large numbers of CPUs,
even from unrelated processes.

But I do plan to add another alternative that is limited to threads of
the running process. I will be carrying both versions to enable those
who have been bugging me about this to do testing.

Thanx, Paul
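
For reference, a minimal userspace sketch of how the system call under
discussion might be invoked. It is an illustration under stated
assumptions, not code from the patch: it assumes a Linux system whose
headers provide <linux/membarrier.h> and __NR_membarrier, and it uses only
the existing MEMBARRIER_CMD_QUERY and MEMBARRIER_CMD_SHARED commands,
because MEMBARRIER_CMD_SHARED_EXPEDITED is merely proposed in this thread.
The helper names membarrier(), cmd_supported() and shared_barrier() are
hypothetical.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

/* Hypothetical thin wrapper; invokes the raw system call via syscall(2). */
static int membarrier(int cmd, int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper: MEMBARRIER_CMD_QUERY returns a bitmask of the
 * supported commands, or -1 on error.  Returns 1 if cmd is in the mask.
 */
static int cmd_supported(int mask, int cmd)
{
	return mask >= 0 && (mask & cmd) != 0;
}

/*
 * Hypothetical helper: issue the (non-expedited) shared barrier if the
 * running kernel advertises it.  Returns 0 on success, -1 otherwise.
 */
static int shared_barrier(void)
{
	int mask = membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (!cmd_supported(mask, MEMBARRIER_CMD_SHARED))
		return -1;
	return membarrier(MEMBARRIER_CMD_SHARED, 0);
}
```

Querying first lets a caller fall back to some other mechanism when the
kernel lacks the command, rather than treating a failed barrier as fatal.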