Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs

From: Thomas Gleixner
Date: Wed Jun 26 2019 - 19:11:41 EST


On Wed, 26 Jun 2019, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 9:31 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > I just checked two of them in the disassembly. In both cases it's jump
> > > label related. Here is one:
> > >
> > > asm volatile("1: rdmsr\n"
> > > 410: b9 59 02 00 00 mov $0x259,%ecx
> > > 415: 0f 32 rdmsr
> > > 417: 49 89 c6 mov %rax,%r14
> > > 41a: 48 89 d3 mov %rdx,%rbx
> > > return EAX_EDX_VAL(val, low, high);
> > > 41d: 48 c1 e3 20 shl $0x20,%rbx
> > > 421: 48 09 c3 or %rax,%rbx
> > > 424: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > > 429: eb 0f jmp 43a <get_fixed_ranges+0xaa>
> > > do_trace_read_msr(msr, val, 0);
> > > 42b: bf 59 02 00 00 mov $0x259,%edi <------- "unreachable"
>
> I assume if 0x42b is unreachable, that's bad as $0x259 is never stored
> in %edi before the call to get_fixed_ranges+0xaa...

Well, no. The static key will never be enabled at this site because the
site is not in the jump table entries. And that's why objtool complains.
The code path at 0x42b will never be reached, even if the tracepoints are
enabled, because without the entry the kernel will not patch the site.
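
To illustrate the point above, here is a minimal userspace sketch of why a
branch site without a jump table entry stays dead. It is a toy model, not
kernel code: struct site, struct jump_entry_sketch, enable_key() and
slow_path_reached() are hypothetical names made up for this example, and
the real jump label machinery patches instruction bytes rather than an enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
enum insn { INSN_NOP, INSN_JMP };

struct site {                   /* one static-branch location in the text */
	enum insn insn;         /* what currently sits at that location */
};

struct jump_entry_sketch {      /* one row of the simulated jump table */
	struct site *code;      /* the site this row describes */
	int key_id;             /* the key that controls the site */
};

/* Enable a key: patch every site the table lists for it, nothing else. */
static size_t enable_key(struct jump_entry_sketch *table, size_t n, int key_id)
{
	size_t patched = 0;

	assert(table != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (table[i].key_id == key_id) {
			table[i].code->insn = INSN_JMP;
			patched++;
		}
	}
	return patched;
}

/* The slow path behind a site runs only if the site became a jump. */
static bool slow_path_reached(const struct site *s)
{
	return s->insn == INSN_JMP;
}
```

A site that was never added to the table keeps its NOP after enable_key()
returns, so the code behind it can never run. That is the situation objtool
flags as unreachable.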

> > So for some reason the .rela__jump_table are buggy on this clang build.
>
> So that sounds like a correctness bug then. (I'd been doing testing
> with the STATIC_KEYS_SELFTEST, which I guess doesn't expose this).
> I'm kind of surprised we can boot and pass STATIC_KEYS_SELFTEST. Any
> way you can help us pare down a test case?

Well, the test thing works as long as the entries which are used there are
correct. And looking at the output of that kernel build I did, I get 6
unreachable entries in 6 different files. That means that ~99% are
correct. So the chance that the self test fails is low.

As for a test case: just compile a kernel and pick the first file where
objtool complains. Look at the disassembly, which will have the

nopl 0x0(%rax,%rax,1)

and that do_trace_read_msr() reference right at that failing offset (or
whatever other function is called in the file you pick).
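
If you want to locate such sites without reading the whole disassembly, a
rough sketch like the one below scans a buffer of text bytes for the
5-byte NOP encoding shown in the quoted disassembly (0f 1f 44 00 00).
find_nop5() is a hypothetical helper written for this mail. It does not
decode instruction boundaries, so it can report false matches inside other
instructions or data. Treat it as a hint for where to look in the objdump
output, not as a replacement for it.

```c
#include <stddef.h>
#include <string.h>

/* Encoding of "nopl 0x0(%rax,%rax,1)" as seen in the disassembly. */
static const unsigned char NOP5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Return the offset of the first 5-byte NOP at or after 'start',
 * or -1 if the pattern does not occur in the rest of the buffer.
 */
static long find_nop5(const unsigned char *buf, size_t len, size_t start)
{
	for (size_t i = start; i + sizeof(NOP5) <= len; i++) {
		if (memcmp(buf + i, NOP5, sizeof(NOP5)) == 0)
			return (long)i;
	}
	return -1;
}
```

Fed the bytes from offset 0x421 onwards in the dump above, it returns 3,
which corresponds to 0x424, the NOP in front of the "unreachable" block.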

From there you should be able to debug why the compiler is not emitting the
.rela__jump_table entry for this particular instance.

I compiled arch/x86/kernel/cpu/mtrr/generic.o several times and the failure
is fully reproducible.

Kernel version is plain v5.2-rc6 and the config I used is here:

https://tglx.de/~tglx/config-clang-repro

Make invocation is:

make CC=clang HOST_CC=clang arch/x86/kernel/cpu/mtrr/generic.o

That builds only that single file and not the whole kernel Moloch.

Output:

CC arch/x86/kernel/cpu/mtrr/generic.o
arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction

That's with the compiler I built a few hours ago with Nathan's fixed
build-llvm.py script. Head commit of llvm-project is:

master 600941e34fe: Print NULL as "(null)" in diagnostic message

Hope that helps.

Thanks,

tglx