Re: [PATCH] trace: adjust code layout in get_recursion_context

From: Jesper Dangaard Brouer
Date: Tue Aug 22 2017 - 13:00:53 EST


On Tue, 22 Aug 2017 17:20:25 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Tue, Aug 22, 2017 at 05:14:10PM +0200, Peter Zijlstra wrote:
> > On Tue, Aug 22, 2017 at 04:40:24PM +0200, Jesper Dangaard Brouer wrote:
> > > In XDP redirect applications, using the tracepoint xdp:xdp_redirect
> > > to diagnose TX overruns, I noticed that
> > > perf_swevent_get_recursion_context() was consuming 2% CPU. This was
> > > reduced to 1.6% with this simple change.
> >
> > It is also incorrect. What do you suppose it now returns when an NMI
> > hits a hard IRQ which itself hit during a soft IRQ?
>
> Does this help any? I can imagine the compiler could struggle to CSE
> preempt_count() seeing how it's an asm thing.

Nope, it does not help (see assembly below, with perf percentages).

But I think I can achieve what I want with a simple unlikely(in_nmi())
annotation.
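Roughly like this (untested sketch against the current
get_recursion_context() in kernel/events/internal.h; only the in_nmi()
test gets annotated, as NMI is the rare case in my XDP workload):

static inline int get_recursion_context(int *recursion)
{
	int rctx;

	if (unlikely(in_nmi()))
		rctx = 3;
	else if (in_irq())
		rctx = 2;
	else if (in_softirq())
		rctx = 1;
	else
		rctx = 0;

	if (recursion[rctx])
		return -1;

	recursion[rctx]++;
	barrier();

	return rctx;
}

That keeps the branch order intact, so the nesting case you point out
above still resolves correctly.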

> ---
> kernel/events/internal.h | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/events/internal.h b/kernel/events/internal.h
> index 486fd78eb8d5..e0b5b8fa83a2 100644
> --- a/kernel/events/internal.h
> +++ b/kernel/events/internal.h
> @@ -206,13 +206,14 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs);
>
> static inline int get_recursion_context(int *recursion)
> {
> + unsigned int pc = preempt_count();
> int rctx;
>
> - if (in_nmi())
> + if (pc & NMI_MASK)
> rctx = 3;
> - else if (in_irq())
> + else if (pc & HARDIRQ_MASK)
> rctx = 2;
> - else if (in_softirq())
> + else if (pc & SOFTIRQ_OFFSET)

Hmmm... shouldn't this be SOFTIRQ_MASK?

> rctx = 1;
> else
> rctx = 0;

perf_swevent_get_recursion_context /proc/kcore
       │
       │
       │    Disassembly of section load0:
       │
       │    ffffffff811465c0 <load0>:
 13.32 │      push   %rbp
  1.43 │      mov    $0x14d20,%rax
  5.12 │      mov    %rsp,%rbp
  6.56 │      add    %gs:0x7eec3b5d(%rip),%rax
  0.72 │      lea    0x34(%rax),%rdx
  0.31 │      mov    %gs:0x7eec5db2(%rip),%eax
  2.46 │      mov    %eax,%ecx
  6.86 │      and    $0x7fffffff,%ecx
  0.72 │      test   $0x100000,%eax
       │    ↓ jne    40
       │      test   $0xf0000,%eax
  0.41 │    ↓ je     5b
       │      mov    $0x8,%ecx
       │      mov    $0x2,%eax
       │    ↓ jmp    4a
       │40:   mov    $0xc,%ecx
       │      mov    $0x3,%eax
  2.05 │4a:   add    %rcx,%rdx
 16.60 │      mov    (%rdx),%ecx
  2.66 │      test   %ecx,%ecx
       │    ↓ jne    6d
  1.33 │      movl   $0x1,(%rdx)
  1.54 │      pop    %rbp
  4.51 │    ← retq
  3.89 │5b:   shr    $0x8,%ecx
  9.53 │      and    $0x1,%ecx
  0.61 │      movzbl %cl,%eax
  0.92 │      movzbl %cl,%ecx
  4.30 │      shl    $0x2,%rcx
 14.14 │    ↑ jmp    4a
       │6d:   mov    $0xffffffff,%eax
       │      pop    %rbp
       │    ← retq
       │      xchg   %ax,%ax



--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer