Re: [RFC][PATCH] x86,ibt: Use UDB instead of 0xEA
From: Leon Hwang
Date: Fri Aug 15 2025 - 04:29:03 EST

On 15/8/25 15:57, Peter Zijlstra wrote:
> On Fri, Aug 15, 2025 at 08:42:39AM +0300, Alexei Starovoitov wrote:
>> On Thu, Aug 14, 2025 at 2:17 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>>
>>> Hi!
>>>
>>> A while ago FineIBT started using the instruction 0xEA to generate #UD.
>>> All existing parts will generate #UD in 64bit mode on that instruction.
>>>
>>> However; Intel/AMD have not blessed using this instruction, it is on
>>> their 'reserved' list for future use.
>>>
>>> Peter Anvin worked the committees and got use of 0xD6 blessed, and it
>>> will be called UDB (per the next SDM or so).
>>>
>>> Reworking the FineIBT code to use UDB wasn't entirely trivial, and I've
>>> had to switch the hash register to EAX in order to free up some bytes.
>>>
>>> Per the x86_64 ABI, EAX is used to pass the number of vector registers
>>> for varargs -- something that should not happen in the kernel. More so,
>>> we build with -mskip-rax-setup, which should leave EAX completely unused
>>> in the calling convention.
>>
>> rax is used to pass tail_call count.
>> See diagram in commit log:
>> https://lore.kernel.org/all/20240714123902.32305-2-hffilwlqm@xxxxxxxxx/
>> Before that commit rax was used differently.
>> Bottom line rax was used for a long time to support bpf_tail_calls.
>> I'm traveling atm. So cc-ing folks for follow ups.
>
> IIRC the bpf2bpf tailcall doesn't use CFI at the moment. But let me
> double check.
>
> So emit_cfi() is called at the very start of emit_prologue() and
> __arch_prepare_bpf_trampoline() in the BPF_TRAMP_F_INDIRECT case.
>
> Now, emit_prologue() starts with the CFI bits, but the tailcall lands at
> X86_TAIL_CALL_OFFSET, at which spot we only have EMIT_ENDBR(), nothing
> else. So RAX should be unaffected at that point.
>
> So, AFAICT, we're good on that point. It is just the C level indirect
> function call ABI that is affected, BPF internal conventions are
> unaffected.
>
RAX is used to propagate tail_call_cnt_ptr from caller to callee for
bpf2bpf+tailcall on x86_64.

Before the aforementioned commit, RAX was used to propagate
tail_call_cnt (the count itself, not a pointer to it) from caller to
callee in the same case.
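
To make the difference concrete, here is a rough user-space model. It
is not the JIT code: the function names are made up for illustration,
and MAX_TAIL_CALL_CNT is assumed to be 33 as in the kernel headers. It
only shows why handing the callee a pointer lets every subprog update
one shared counter, while handing it the value gives each callee a
private copy.

```c
/* Assumed to match the kernel's tail call limit. */
#define MAX_TAIL_CALL_CNT 33

/* Hypothetical model of the current scheme: the callee receives a
 * pointer to the counter, so its increment is visible to the caller. */
static int try_tail_call_by_ptr(unsigned int *tail_call_cnt_ptr)
{
	if (*tail_call_cnt_ptr >= MAX_TAIL_CALL_CNT)
		return 0;		/* limit reached, tail call refused */
	(*tail_call_cnt_ptr)++;
	return 1;
}

/* Hypothetical model of the older scheme: the callee receives the
 * count by value, so its increment is lost when it returns. */
static int try_tail_call_by_value(unsigned int tail_call_cnt)
{
	if (tail_call_cnt >= MAX_TAIL_CALL_CNT)
		return 0;
	tail_call_cnt++;		/* only the local copy changes */
	return 1;
}

/* A "main prog" that calls a subprog 'attempts' times; returns how
 * many tail calls were allowed in total. */
static unsigned int run_by_ptr(unsigned int attempts)
{
	unsigned int cnt = 0, allowed = 0, i;

	for (i = 0; i < attempts; i++)
		allowed += try_tail_call_by_ptr(&cnt);
	return allowed;
}

static unsigned int run_by_value(unsigned int attempts)
{
	unsigned int cnt = 0, allowed = 0, i;

	for (i = 0; i < attempts; i++)
		allowed += try_tail_call_by_value(cnt);
	return allowed;
}
```

In this model, run_by_ptr(100) stops allowing tail calls once the
shared counter reaches the limit, whereas run_by_value(100) allows all
100 because the caller's counter never changes.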

Thanks,
Leon