Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs

From: Peter Zijlstra
Date: Thu Jun 27 2019 - 03:16:25 EST


On Wed, Jun 26, 2019 at 03:15:38PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 2:24 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> >
> > > > > but it also makes objtool unhappy:
> > > > >
> > > > > arch/x86/events/intel/core.o: warning: objtool: intel_pmu_nhm_workaround()+0xb3: unreachable instruction
> > > > > kernel/fork.o: warning: objtool: free_thread_stack()+0x126: unreachable instruction
> > > > > mm/workingset.o: warning: objtool: count_shadow_nodes()+0x11f: unreachable instruction
> > > > > arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
> > > > > arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction
> > > > > drivers/iommu/irq_remapping.o: warning: objtool: irq_remap_enable_fault_handling()+0x1d: unreachable instruction
> >
> > > I just checked two of them in the disassembly. In both cases it's jump
> > > label related. Here is one:
> > >
> > > asm volatile("1: rdmsr\n"
> > > 410: b9 59 02 00 00 mov $0x259,%ecx
> > > 415: 0f 32 rdmsr
> > > 417: 49 89 c6 mov %rax,%r14
> > > 41a: 48 89 d3 mov %rdx,%rbx
> > > return EAX_EDX_VAL(val, low, high);
> > > 41d: 48 c1 e3 20 shl $0x20,%rbx
> > > 421: 48 09 c3 or %rax,%rbx
> > > 424: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > > 429: eb 0f jmp 43a <get_fixed_ranges+0xaa>
> > > do_trace_read_msr(msr, val, 0);
> > > 42b: bf 59 02 00 00 mov $0x259,%edi <------- "unreachable"
> > > 430: 48 89 de mov %rbx,%rsi
> > > 433: 31 d2 xor %edx,%edx
> > > 435: e8 00 00 00 00 callq 43a <get_fixed_ranges+0xaa>
> > > 43a: 44 89 35 00 00 00 00 mov %r14d,0x0(%rip) # 441 <get_fixed_ranges+0xb1>
> > >
> > > Interestingly enough there are some more hunks of the same pattern in that
> > > function which look all the same. Those are not upsetting objtool. Josh
> > > might give an hint where to stare at.
> >
> > That's pretty atrocious code-gen :/ Does LLVM support things like label
> > attributes? Back when we did jump labels GCC didn't, or rather, it
> > ignored it completely when combined with asm goto (and it might still).
> >
> > That is, would something like this:
> >
> > diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> > index 06c3cc22a058..1761b1e76ddc 100644
> > --- a/arch/x86/include/asm/jump_label.h
> > +++ b/arch/x86/include/asm/jump_label.h
> > @@ -32,7 +32,7 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
> > : : "i" (key), "i" (branch) : : l_yes);
> >
> > return false;
> > -l_yes:
> > +l_yes: __attribute__((cold));
> > return true;
> > }
> >
> > @@ -49,7 +49,7 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
> > : : "i" (key), "i" (branch) : : l_yes);
> >
> > return false;
> > -l_yes:
> > +l_yes: __attribute__((hot));
> > return true;
> > }
> >
> > Help LLVM?

As I wrote later; the above suggestion is actually wrong :/

> So Clang definitely complains about putting attribute hot/cold on
> labels: https://godbolt.org/z/N-Z33Q
> In my test case I wasn't able to influence code gen with them though
> in GCC at -O2 or -O0. Maybe GCC has a test case that shows how they
> should work?

As I wrote in that same later email; the way we influence the actual
code-layout is with the __builtin_expect() thing. Let me expand on that
in another email.
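
A minimal userspace sketch of that, for illustration only. The likely()/unlikely() macros here are stand-ins modelled on the kernel's, and sum_clamped()/clamp_slow() are hypothetical names, not kernel code. The hint only tells the compiler which branch is expected; whether the rare path actually gets moved out of line depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's likely()/unlikely(); assumed for this sketch. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical slow path: clamp an out-of-range value into 0..255. */
static int clamp_slow(int v)
{
	return v < 0 ? 0 : 255;
}

/*
 * Sum the values in v[0..n), clamping anything outside 0..255.
 * The out-of-range branch is hinted as rare, so the compiler is
 * free to keep the common case on the fall-through path.
 */
int sum_clamped(const int *v, size_t n)
{
	int sum = 0;

	assert(v != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (unlikely(v[i] < 0 || v[i] > 255))
			sum += clamp_slow(v[i]);
		else
			sum += v[i];
	}
	return sum;
}
```

Comparing the -O2 disassembly with and without the unlikely() is the easy way to see what, if anything, a given compiler does with the hint.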

Sadly, I've no clue whatsoever about compiler internals, be it GCC or
LLVM.