Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation

From: Linus Torvalds
Date: Fri Dec 02 2016 - 18:10:11 EST


On Fri, Dec 2, 2016 at 2:55 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>
>> Honestly, I think Intel should clean up their documentation.
>
> I'm not sure I follow. If a user program gets migrated, it might end
> up doing cross-modification when it expects self-modification. If
> that trips the program up, is that a user bug or a kernel bug?

Well, the user may not see it as a cross-modification.

Example: user compiles a program, and writes out the new binary. That
write goes to the page cache.

The user then immediately executes that program.

It's technically a "cross modification", because the build that wrote
the page cache ran on one CPU, and then it gets loaded on another.

Not, page faulting the binary does bring in a known serializing
instruction: iret.

But let's theorize that we have your "optimistic sysret" return path
because sometimes it can happen. So the "iret" isn't exactly
fundamental.

But we know we will write %cr2, which is a serializing instruction.

But that's not fundamental either, because we <i>could</i> just have a
program just load the object file into its own address space using the
dynamic linker. And if you don't unmap anything, there won't be any
TLB flushes.

Now, that is safe <i>too</I>, but by then we're not even relying on
simply the fact that the code couldn't even have been in any virtual
caches in the running environment, so it _must_ have come from the
physically indexed data cache. So no actual serializing instruction
even _needed_.

So there is no room for any cache I$ coherency issue at any point, but
note how we got to the point where we're now basically depending on
some fairly fundamental logic that is not in the Intel documentation?

THAT is what I don't like. I don't doubt for a moment that what we're
doing is entirely coherent, and we're fine. But the intel memory
ordering documentation simply doesn't cover this situation at all. The
"real" memory ordering documentation only covers the normal data
cache. And then they handwave the "self-modifying code" situation with
incomplete examples and just bullshit "you need a serializing
instruction", which clearly isn't actually the case, and is also
something that we very much don't even do.

It would be better if here was actual documentation, and we had some
nice clear "yeah, we don't need any stinking serializing instructions,
because we're already doing X".

Linus