Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration

From: Linus Torvalds
Date: Tue Nov 14 2017 - 12:31:38 EST

Next message: Yang Shi: "Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg"
Previous message: Colin King: "[PATCH] ALSA: synth: emux: remove redundant test for r <= 13"
In reply to: Mathieu Desnoyers: "Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration"
Next in thread: Andy Lutomirski: "Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Nov 14, 2017 at 9:10 AM, Mathieu Desnoyers
<mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>> (* OPTION 1 *)
>> Store modified code (as data) into code segment;
>> Jump to new code or an intermediate location;
>> Execute new code;
>
> Good point, so this is likely why I was having trouble reproducing the
> single-threaded self-modifying code incoherent case. I did have a branch
> in there.

Actually, even *without* the branch, Intel has been very good at
having precise I$ coherency. I think you can literally store to the
next instruction, and Intel CPUs from the Pentium Pro onward would
notice, take a micro-fault, and handle it correctly (the i486 and
Pentium did not have that level of coherency, but a taken branch would
flush the fetch buffer).

An in-order Atom probably has the old Pentium behavior, and you
could see it there.

But starting with the P6, and OoO execution, the "taken branch" thing
meant very little, so Intel started instead just doing the
"store-vs-instruction fetch" coherency explicitly, which causes it to
be precise.

Afaik, the only way to show incoherent I$ fairly easily is to use
virtual aliasing, and store to a different virtual address, because
the fetch buffer coherency is done by virtual address.

But even then, it's only the fetch buffer (and it's been called
different things over the years, now it's a uop loop cache), not the
L1 caches, so you get a very limited window of instructions.

And that fetch buffer is also where any cross-cpu incoherency would
be, for the exact same reason.
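
For reference, the virtual-aliasing setup mentioned above could be
sketched roughly like this. It is only an illustration under stated
assumptions, not a reproducer: the names alias_pair, alias_pair_create,
serialize_core and alias_patch_and_call are made up for the example, it
assumes Linux with memfd_create(), and the machine-code bytes assume
x86-64 (other architectures only check that both mappings share
storage). The CPUID is there so the patched result is well defined;
dropping it is what you would do to go looking for a stale window, and
a full call/return is likely far too long for that to show up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: two virtual views of one physical page. */
struct alias_pair {
	uint8_t *wr;	/* writable alias, used for stores */
	uint8_t *ex;	/* executable alias, used for instruction fetch */
	size_t len;
};

/* Map one memfd page twice. Returns 0 on success, -1 on failure. */
static int alias_pair_create(struct alias_pair *p)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd;

	assert(page > 0);
	fd = memfd_create("alias-demo", 0);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}
	p->len = (size_t)page;
	p->wr = mmap(NULL, p->len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	p->ex = mmap(NULL, p->len, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	close(fd);	/* the mappings keep the page alive */
	if (p->wr == MAP_FAILED || p->ex == MAP_FAILED)
		return -1;
	return 0;
}

#if defined(__x86_64__)
/* CPUID is a serializing instruction; the results are ignored. */
static inline void serialize_core(void)
{
	unsigned int eax = 0, ebx, ecx = 0, edx;

	__asm__ __volatile__("cpuid"
			     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
			     : : "memory");
}
#endif

/*
 * Store code through the writable alias, run it through the executable
 * alias, patch it through the writable alias, and run it again.
 * Returns the value seen after patching, or -1 on failure.
 */
static int alias_patch_and_call(struct alias_pair *p)
{
#if defined(__x86_64__)
	/* mov $1, %eax ; ret */
	static const uint8_t code[] = { 0xb8, 0x01, 0x00, 0x00, 0x00, 0xc3 };
	int (*fn)(void) = (int (*)(void))(uintptr_t)p->ex;

	memcpy(p->wr, code, sizeof(code));
	serialize_core();
	if (fn() != 1)
		return -1;
	p->wr[1] = 2;		/* patch the immediate via the other address */
	serialize_core();	/* serialize before fetching the new bytes */
	return fn();
#else
	/* Other architectures: only check that the aliases share storage. */
	p->wr[1] = 2;
	return p->ex[1];
#endif
}
```

A caller would run alias_pair_create() once and then
alias_patch_and_call() on the result. The store and the fetch
deliberately use different virtual addresses for the same physical
page, which is the only thing that distinguishes this from plain
self-modifying code.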

Linus
