Re: [PATCH RFC] percpu: add data dependency barrier in percpu accessors and operations

From: Rusty Russell
Date: Tue Jul 08 2014 - 21:13:22 EST


Tejun Heo <tj@xxxxxxxxxx> writes:
> Hello, Paul.

Rusty wakes up...

>> Good point. How about per-CPU variables that are introduced by
>> loadable modules? (I would guess that there are plenty of memory
>> barriers in the load process, given that text and data also need
>> to be visible to other CPUs.)
>
> (cc'ing Rusty, hi!)
>
> Percpu initialization happens in post_relocation() before
> module_finalize(). There seem to be enough operations which can act
> as write barrier afterwards but nothing seems explicit.
>
> I have no idea how we're guaranteeing that .data is visible to all
> cpus without barrier from reader side. Maybe we don't allow something
> like the following?
>
>   module init                        built-in code
>
>   static int mod_static_var = X;     if (builtin_ptr)
>   builtin_ptr = &mod_static_var;             WARN_ON(*builtin_ptr != X);
>
> Rusty, can you please enlighten me?

Subtle, but I think in theory (though not in practice) this can happen.

Making this the assigner's responsibility is nasty, since we reasonably
assume that .data is consistent across CPUs once code is executing
(similarly on boot).
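
For illustration only, here is a minimal userspace sketch of what
"assigner's responsibility" would look like for the example above. It
uses C11 atomics as a stand-in for the kernel's smp_store_release() and
smp_load_acquire(); the helper names module_publish() and builtin_read()
and the value 42 for X are hypothetical, and the acquire load is stronger
than the data dependency ordering the patch is about. It is a sketch
under those assumptions, not kernel code.

```c
#include <stdatomic.h>
#include <stddef.h>

#define X 42                        /* arbitrary value, assumed for the sketch */

static int mod_static_var = X;      /* stands in for the module's .data */
static _Atomic(int *) builtin_ptr;  /* static storage, so it starts out NULL */

/* "module init" side: publish the pointer with release semantics, so the
 * initialised contents of mod_static_var are ordered before the pointer
 * store becomes visible. */
static void module_publish(void)
{
	atomic_store_explicit(&builtin_ptr, &mod_static_var,
			      memory_order_release);
}

/* "built-in code" side: load the pointer with acquire semantics and only
 * then dereference it. Returns -1 if nothing has been published yet. */
static int builtin_read(void)
{
	int *p = atomic_load_explicit(&builtin_ptr, memory_order_acquire);

	return p ? *p : -1;
}
```

With this pairing a reader that sees a non-NULL builtin_ptr also sees
X behind it. The cost is that every assigner has to remember the release
store, which is the burden being objected to here.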

>> Again, it won't help for the allocator to strongly order the
>> initialization to zero if there are additional initializations of some
>> fields to non-zero values. And again, it should be a lot easier to
>> require the smp_store_release() or whatever uniformly than only in cases
>> where additional initialization occurred.
>
> This one is less murky as we can say that the cpu which allocated owns
> the zeroing; however, it still deviates from requiring the one which
> makes changes to take care of barriering for those changes, which is
> what makes me feel a bit uneasy. IOW, it's the allocator which
> cleared the memory, why should its users worry about in-flight
> operations from it? That said, this poses a lot less issues compared
> to percpu ones as passing normal pointers to other cpus w/o going
> through proper set of barriers is a special thing to do anyway.

I think that the implicit per-cpu allocations done by modules need to
be consistent once the module is running.

I'm deeply reluctant to advocate it in the other per-cpu cases though.
Once we add a barrier, it's impossible to remove: callers may subtly
rely on the behavior.

"Magic barrier sprinkles" is a bad path to start down, IMHO.

Cheers,
Rusty.