Re: [PATCH][V3] arm64: perf: Get the wrong PC value in REGS_ABI_32 mode

From: Mark Rutland
Date: Thu Jun 18 2020 - 09:03:51 EST


On Thu, May 28, 2020 at 08:54:19AM +0100, Will Deacon wrote:
> On Thu, May 28, 2020 at 09:06:07AM +0800, Jiping Ma wrote:
> > On 05/27/2020 11:19 PM, Mark Rutland wrote:
> > > On Wed, May 27, 2020 at 09:33:00AM +0800, Jiping Ma wrote:
> > > > On 05/26/2020 06:26 PM, Mark Rutland wrote:
> > > > > On Mon, May 11, 2020 at 10:52:07AM +0800, Jiping Ma wrote:
> > > > This modification can not fix our issue,ÃÂÂ we need
> > > > perf_reg_abi(current) == PERF_SAMPLE_REGS_ABI_32 to judge if it is 32-bit
> > > > task or not,
> > > > then return the correct PC value.
> > > I must be missing something here.
> > >
> > > The core code perf_reg_abi(task) is called with the task being sampled,
> > > and the regs are from the task being sampled. For a userspace sample for
> > > a compat task, compat_user_mode(regs) should be equivalent to the
> > > is_compat_thread(task_thread_info(task)) check.
> > >
> > > What am I missing?
> > This issue caused by PC value is not correct. regs are sampled in function
> > perf_output_sample_regs, that call perf_reg_value(regs, bit) to get PC
> > value.
> > PC value is regs[15] in perf_reg_value() function. it should be regs[32].
> >
> > perf_output_sample_regs(struct perf_output_handle *handle,
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ struct pt_regs *regs, u64 mask)
> > {
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ int bit;
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ DECLARE_BITMAP(_mask, 64);
> >
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ bitmap_from_u64(_mask, mask);
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ for_each_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) {
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ u64 val;
> >
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ val = perf_reg_value(regs, bit);
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ perf_output_put(handle, val);
> > ÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂÃÂÂ }
> > }
>
> Yes, but Mark's point is that checking 'compat_user_mode(regs)' should be
> exactly the same as checking 'perf_reg_abi(current) == PERF_SAMPLE_REGS_ABI_32'.
> Are you saying that's not the case? If so, please can you provide an example
> of when they are different?
>
> Leaving that aside for a second, I also think it's reasonable to question
> whether this whole interface is busted or not. I looked at it last night but
> struggled to work out what it's supposed to do. Consider these three
> scenarios, all under an arm64 kernel:
>
> 1. 64-bit perf + 64-bit application being profiled
> 2. 64-bit perf + 32-bit application being profiled
> 3. 32-bit perf + 32-bit application being profiled
>
> It looks like the current code is a bodge to try to handle both (2) and
> (3) at the same time:
>
> - In case (3), userspace only asks about registers 0-15
> - In case (2), we fudge the higher registers so that 64-bit SP and LR
> hold the 32-bit values as a bodge to allow a 64-bit dwarf unwinder
> to unwind the stack

I think the fudging is nonsensical to begin with, as I would have
expected that PERF_REGS_ABI_32 should be the same layout regardless of
consumer (and therefore should be identical to the 32-bit arm native
format). I realise that doesn't change that we might be stuck with it...

> So the idea behind the patch looks fine because case (3) is expecting the PC
> in register 15 and instead gets 0, but the temptation is to clean this up so
> that cases (2) and (3) report the same data to userspace (along the lines of
> Mark's patch), namely only the first 16 registers with the PC moved down. We
> can only do that if the unwinder is happy, which it might be if it only ever
> looks up dwarf register numbers based on the unwind tables in the binary.
> Somebody would need to dig into that.

Agreed; I will try to figure out what the perf tool does in the three
cases above. I would be grateful if others could take a look too.

Another slightly scary thought: what happens for a 32-bit perf with a
64-bit application being profiled? I don't see how that'd be forbidden,
but I also don't see how it'd work.

> Otherwise, if it generates unconditional references to things like
> register 30 to grab the link register, then we're stuck with the bodge
> and need to special-case the PC.

I agree that in that case we'd have to keep the existing bodge, and we'd
have to special-case the PC, but I'd prefer to split the logic for case
1 into a separate function for cases 2 and 3 so that we can more easily
avoid getting this more confused.

Let's figure out what userspace does first...

Thanks,
Mark.