Re: [BUG/RFC] perf test fails on AMD CPUs

From: sherry hurwitz
Date: Mon Aug 24 2015 - 21:12:03 EST




On 08/18/2015 05:10 AM, Jiri Olsa wrote:
On Mon, Aug 17, 2015 at 09:06:59AM -0700, Andy Lutomirski wrote:
On Sun, Aug 16, 2015 at 9:36 PM, Borislav Petkov <bp@xxxxxxx> wrote:
On Mon, Aug 17, 2015 at 12:29:56AM +0200, Jiri Olsa wrote:
hi,
'perf test 18' is failing on systems with AMD processors.
Hmm, still using that b0rked test box? :-)

Also, which kernel?

There have been substantial changes to the entry code recently. Although
I don't see anything being done differently on AMD there except
X86_BUG_SYSRET_SS_ATTRS but that should be unrelated.

The only reason I could find is that AMD does not set 'resume flag'
in RFLAGS register the way the Intel CPU does.

(simplified) test scenario:

- create breakpoint (on test_function) perf event with SIGIO signal
to be delivered any time the breakpoint is hit
- run test_function
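
A minimal sketch of the breakpoint setup described above, assuming an execute breakpoint that overflows on every hit. The helper name make_bp_attr and the empty test_function body are placeholders for illustration; the field choices are assumptions, not a copy of the actual perf test source.

```c
#include <stdint.h>
#include <string.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>

/* Stand-in for the function the breakpoint is placed on. */
static void test_function(void)
{
}

/*
 * Hypothetical helper: describe an execute breakpoint on `fn` whose
 * event overflows on every hit, so that each hit can raise a signal.
 */
static struct perf_event_attr make_bp_attr(void (*fn)(void))
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_BREAKPOINT;
    attr.size = sizeof(attr);
    attr.bp_type = HW_BREAKPOINT_X;          /* instruction breakpoint */
    attr.bp_addr = (uint64_t)(uintptr_t)fn;  /* address to break on */
    attr.bp_len = sizeof(long);              /* required length for HW_BREAKPOINT_X */
    attr.sample_period = 1;                  /* overflow on every hit */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    return attr;
}
```

The attr would then be handed to perf_event_open(2), e.g. syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0), and the returned fd switched to signal-driven mode with fcntl(fd, F_SETFL, ... | O_ASYNC) and fcntl(fd, F_SETOWN, getpid()) so that an overflow delivers SIGIO.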


the expected course of action is:
1) CPU hits 'test_function'
2) DB exception is triggered, with RFLAGS.RF=0
3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
triggers irq_work pending work
4) DB exception executes iretd
5) irq_work interrupt is triggered, with RFLAGS.RF=1
6) irq_work interrupt calls kill_fasync with SIGIO signal
7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
which actually delivers the SIGIO signal
8) sigreturn syscall prepares registers to return to the
instruction from step 1) and sets RFLAGS.RF to its original
value from step 5) (RFLAGS.RF=1)
9) CPU hits 'test_function' and DB exception is NOT triggered
due to RFLAGS.RF=1

this is how I see it working on Intel

But AMD gives me RFLAGS.RF=0 on step 5, which makes step 9
trigger the DB exception once again and makes the test fail.
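
The nine steps can be condensed into a toy model to show why the RF value saved in step 5 decides the outcome. This is only a sketch, not kernel code: struct cpu_model, its field, and count_db_exceptions are invented for illustration, and the model stops after the first restart attempt instead of following any further cycles.

```c
#include <stdbool.h>

/* Hypothetical model of the one hardware behaviour that differs. */
struct cpu_model {
    /* Does the interrupt frame pushed in step 5 carry the live RF=1? */
    bool irq_frame_keeps_rf;
};

/*
 * Count the #DB exceptions raised by the instruction at test_function
 * across the first hit (step 1) and the restart after sigreturn (step 9).
 */
static int count_db_exceptions(struct cpu_model cpu)
{
    int db_hits = 0;
    bool rf;
    bool irq_saved_rf;

    /* Steps 1-2: the breakpoint fires, the saved RFLAGS.RF is 0. */
    db_hits++;

    /* Steps 3-4: the #DB handler sets RF in the saved frame, iret loads it. */
    rf = true;

    /* Step 5: the irq_work interrupt arrives before the instruction runs. */
    irq_saved_rf = cpu.irq_frame_keeps_rf ? rf : false;

    /* Steps 6-8: SIGIO is delivered, sigreturn restores the step 5 value. */
    rf = irq_saved_rf;

    /* Step 9: the instruction is retried; RF=0 means a second #DB. */
    if (!rf)
        db_hits++;

    return db_hits;
}
```

With irq_frame_keeps_rf set (the behaviour seen on Intel) the model reports one #DB; with it cleared (the behaviour seen on the AMD box) it reports two, which is the extra hit that makes the test fail.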
Adding Andy, he might have an idea. Leaving in the rest for reference.
Gee thanks :-p

Jiri, did you instrument the code and observe do_IRQ sees RF clear in
its pt_regs? Also, it might be worth checking that regs->ip in the
irq_work matches regs->ip.
yep, that's what I saw.. once the irq_work interrupt was triggered
the regs->ip was the same as for the previous debug exception
but the RFLAGS.RF was 0

It's *possible* that I messed up and broke RF restore with
opportunistic sysret, but the code looks correct:

testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
jnz opportunistic_sysret_failed
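
For readers less used to the entry assembly, the check above is roughly equivalent to the following C. It is a sketch only: the function name is made up, and the two masks are written out with the bit positions the kernel uses for TF (bit 8) and RF (bit 16). At that point r11 holds the RFLAGS image that sysret would restore.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_TF (UINT64_C(1) << 8)   /* trap flag */
#define X86_EFLAGS_RF (UINT64_C(1) << 16)  /* resume flag */

/*
 * Hypothetical helper mirroring the testq/jnz pair: the fast sysret
 * path is only taken when neither RF nor TF is set in the saved flags;
 * otherwise the code falls back to the slower return path.
 */
static bool sysret_flags_ok(uint64_t saved_rflags)
{
    return (saved_rflags & (X86_EFLAGS_RF | X86_EFLAGS_TF)) == 0;
}
```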
AFAICS the problematic paths did not hit syscalls

buuuuuut anyway, it looks like an issue with the latest AMD firmware:

[root@amd-pike-07 ~]# cat /sys/devices/system/cpu/cpu0/microcode/version
0x6000822
[root@amd-pike-07 perf]# ./perf test 18
18: Test breakpoint overflow signal handler : Ok

[root@amd-pike-07 perf]# cat /sys/devices/system/cpu/cpu0/microcode/version
0x6000832
[root@amd-pike-07 perf]# ./perf test 18
18: Test breakpoint overflow signal handler : FAILED!
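
To script the same comparison, the revision string from the sysfs file can be parsed as hex. A small sketch, assuming the file content looks like the output shown above; parse_microcode_rev is a made-up helper name.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a revision string such as "0x6000832\n"
 * (the format printed by the microcode/version file above).
 * Returns false if no hex digits could be converted.
 */
static bool parse_microcode_rev(const char *text, unsigned long *rev)
{
    char *end;
    unsigned long value;

    errno = 0;
    value = strtoul(text, &end, 16);  /* base 16 accepts an optional 0x prefix */
    if (errno != 0 || end == text)
        return false;

    *rev = value;
    return true;
}
```

A caller could read the sysfs file into a buffer, parse it, and compare the result against 0x6000822 (test passes) and 0x6000832 (test fails) as seen on this machine.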


[root@amd-pike-07 ~]# cat /proc/cpuinfo
processor : 7
vendor_id : AuthenticAMD
cpu family : 21
model : 2
model name : AMD Opteron(tm) Processor 3380
stepping : 0
microcode : 0x6000832

SNIP


AMD description of RF flag (APM 3.1.6):
=======================================
Resume Flag (RF) Bit. Bit 16. The RF bit allows an instruction to be restarted following an
instruction breakpoint resulting in a debug exception (#DB). This bit prevents multiple debug
exceptions from occurring on the same instruction.
The processor clears the RF bit after every instruction is successfully executed, except when the
instruction is:
- An IRET that sets the RF bit.
- JMP, CALL, or INTn through a task gate.
In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
When an exception occurs (or when a string instruction is interrupted), the processor normally sets
RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
image.
That's a little weird, I think. Shouldn't RF be zero on #DB due to a
*watchpoint* so that a watchpoint followed immediately by a breakpoint
works?
the AMD description looks more vague (compared to Intel's):

- For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
  called. This includes:
  - Debug exceptions generated in response to instruction breakpoints
  - Hardware-generated interrupts arriving between instructions (including those arriving after the last
    iteration of a repeated string instruction)
This appears to be why it works on Intel. Does AMD not do that? We
could probably work around this in software (by not using irq work for
this), but yuck.
yep, but hopefully it's a microcode issue ;-) Cc-ing guys from linux-firmware git

Sherry, Suravee, any idea?

thanks,
jirka
Jiri,
I have reproduced your problem and asked the HW architect who wrote 832 to review the diff between the 822 and 832 microcode patches.

Thanks,
Sherry