Re: Does perf-annotate work correctly?

From: Arnaldo Carvalho de Melo
Date: Tue Sep 12 2017 - 10:34:00 EST


Em Tue, Sep 12, 2017 at 06:10:35PM +0800, Du, Changbin escreveu:
> When a annotate a symbol, I find the annotated C source code doesn't match assembly code.
> So I cannot determine which line of C code has much overhead withou gdb's help.
>
> Here is a example result of function apic_has_interrupt_for_ppr() in kvm module.

Ok, was this using the module .ko file or /proc/kcore? You forgot to
cut'n'paste the first line on the screen.

Also, how did you use gdb?

perf uses objdump to do the disassembly, and depending on how it is used
(live system, post processing on a different machine, permissions) it
may use different files to do the disassembly.

Please provide more detailed information on the exact command line
arguments and usage scenario.

- Arnaldo

> â580 __clear_bit(KVM_APIC_PV_EOI_PENDING, &vcpu->arch.apic_attention); â
> â581 } â
> â â
> â583 static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr) â
> â584 { â
> 0.88 â30: cmpb $0x0,0x91(%rdi) â
> 2.54 â â je 63 â
> 0.20 â mov 0xa0(%rdi),%rcx â
> â581 int highest_irr; â
> â582 if (kvm_x86_ops->sync_pir_to_irr && apic->vcpu->arch.apicv_active) â
> 4.91 â mov $0xe0,%eax x â
> 1.46 â45: mov %eax,%edx x â
> 0.02 â sar $0x5,%edx x â
> 3.57 â shl $0x4,%edx x â
> 3.34 â movslq %edx,%rdx x â
> 1.25 â mov 0x200(%rcx,%rdx,1),%edx x â
> 42.44 â test %edx,%edx x â
> 0.01 â âââjne 88 x â
> 3.48 â â sub $0x20,%eax x â
> 2.24 â â cmp $0xffffffe0,%eax x â
> â586âapic_find_highest_irr(): â
> â â â
> â407â /* â
> â408â * Note that irr_pending is just a hint. It will be always â
> â409â * true with virtual interrupt delivery enabled. â
> â410â */ â
> â411â if (!apic->irr_pending) â
> â ââ jne 45 â
> 0.62 â63:â mov $0xffffffff,%eax â
> 0.83 â â leaveq â
> 13.52 â ââ retq â
> â6a:â mov %esi,-0x4(%rbp) â
> â â mov %rdx,%rdi â
> â418âfind_highest_vector(): â
> â340âstatic int find_highest_vector(void *bitmap) â
> â341â{ â
> â342â int vec; â
> â343â u32 *reg; â
> â â â
> â345â for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG; â
> â ââ callq *%rax â
> â â mov -0x4(%rbp),%esi â
> â343â vec >= 0; vec -= APIC_VECTORS_PER_REG) { â
> â344â reg = bitmap + REG_POS(vec); â
> â345â if (*reg) â
> 0.05 â75:â cmp $0xffffffff,%eax â
> â ââ je 63 â
> 1.95 â â mov %eax,%edx â
> 1.45 â â and $0xf0,%edx
>
>
> Look at the assembly code block where I have put a 'x' on the right. Apparently the
> assembly code doesn't match the C source code arrounded. Let's look the correct disassemble
> result from gdb:
>
> 340 for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
> 0x000000000003b4e0 <+64>: mov $0xe0,%eax
>
> 342 reg = bitmap + REG_POS(vec);
> 343 if (*reg)
> 0x000000000003b4e5 <+69>: mov %eax,%edx
> 0x000000000003b4e7 <+71>: sar $0x5,%edx
> 0x000000000003b4ea <+74>: shl $0x4,%edx
> 0x000000000003b4ed <+77>: movslq %edx,%rdx
> 0x000000000003b4f0 <+80>: mov 0x200(%rcx,%rdx,1),%edx
> 0x000000000003b4f7 <+87>: test %edx,%edx
> 0x000000000003b4f9 <+89>: jne 0x3b528 <apic_has_interrupt_for_ppr+136>
>
> 341 vec >= 0; vec -= APIC_VECTORS_PER_REG) {
> 0x000000000003b4fb <+91>: sub $0x20,%eax
>
> 340 for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
> 0x000000000003b4fe <+94>: cmp $0xffffffe0,%eax
> 0x000000000003b501 <+97>: jne 0x3b4e5 <apic_has_interrupt_for_ppr+69>
>
>
> Compared to gdb, perf-annoate has messed up. is it a bug or just perf is not as perfect as gdb?
>
> --
> Thanks,
> Changbin Du