Re: KASAN: use-after-free Read in ___bpf_prog_run

From: Hao Sun
Date: Mon Jan 16 2023 - 07:45:36 EST




> On 12 Jan 2023, at 2:59 PM, Yonghong Song <yhs@xxxxxxxx> wrote:
>
>
>
> On 1/9/23 5:21 AM, Hao Sun wrote:
>> Yonghong Song <yhs@xxxxxxxx> 于2022年12月18日周日 00:57写道:
>>>
>>>
>>>
>>> On 12/16/22 10:54 PM, Hao Sun wrote:
>>>>
>>>>
>>>>> On 17 Dec 2022, at 1:07 PM, Yonghong Song <yhs@xxxxxxxx> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 12/14/22 11:49 PM, Hao Sun wrote:
>>>>>> Hi,
>>>>>> The following KASAN report can be triggered by loading and test
>>>>>> running this simple BPF prog with a random data/ctx:
>>>>>> 0: r0 = bpf_get_current_task_btf ;
>>>>>> R0_w=trusted_ptr_task_struct(off=0,imm=0)
>>>>>> 1: r0 = *(u32 *)(r0 +8192) ;
>>>>>> R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>>>>>> 2: exit
>>>>>> I've simplified the C reproducer but didn't find the root cause.
>>>>>> JIT was disabled, and the interpreter triggered UAF when executing
>>>>>> the load insn. A slab-out-of-bound read can also be triggered:
>>>>>> https://pastebin.com/raw/g9zXr8jU
>>>>>> This can be reproduced on:
>>>>>> HEAD commit: b148c8b9b926 selftests/bpf: Add few corner cases to test
>>>>>> padding handling of btf_dump
>>>>>> git tree: bpf-next
>>>>>> console log: https://pastebin.com/raw/1EUi9tJe
>>>>>> kernel config: https://pastebin.com/raw/rgY3AJDZ
>>>>>> C reproducer: https://pastebin.com/raw/cfVGuCBm
>>>>>
>>>>> I I tried with your above kernel config and C reproducer and cannot reproduce the kasan issue you reported.
>>>>>
>>>>> [root@arch-fb-vm1 bpf-next]# ./a.out
>>>>> func#0 @0
>>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>>> 0: (85) call bpf_get_current_task_btf#158 ; R0_w=trusted_ptr_task_struct(off=0,imm=0)
>>>>> 1: (61) r0 = *(u32 *)(r0 +8192) ; R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>>>>> 2: (95) exit
>>>>> processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>>>>
>>>>> prog fd: 3
>>>>> [root@arch-fb-vm1 bpf-next]#
>>>>>
>>>>> Your config indeed has kasan on.
>>>>
>>>> Hi,
>>>>
>>>> I can still reproduce this on a latest bpf-next build: 0e43662e61f25
>>>> (“tools/resolve_btfids: Use pkg-config to locate libelf”).
>>>> The simplified C reproducer sometime need to be run twice to trigger
>>>> the UAF. Also note that interpreter is required. Here is the original
>>>> C reproducer that loads and runs the BPF prog continuously for your
>>>> convenience:
>>>> https://pastebin.com/raw/WSJuNnVU
>>>>
>>>
>>> I still cannot reproduce with more than 10 runs. The config has jit off
>>> so it already uses interpreter. It has kasan on as well.
>>> # CONFIG_BPF_JIT is not set
>>>
>>> Since you can reproduce it, I guess it would be great if you can
>>> continue to debug this.
>>>
>> The load insn ‘r0 = *(u32*) (current + 8192)’ is OOB, because sizeof(task_struct)
>> is 7240 as shown in KASAN report. The issue is that struct task_struct is special,
>> its runtime size is actually smaller than it static type size. In X86:
>> task_struct->thread_struct->fpu->fpstate->union fpregs_state is
>> /*
>> * ...
>> * The size of the structure is determined by the largest
>> * member - which is the xsave area. The padding is there
>> * to ensure that statically-allocated task_structs (just
>> * the init_task today) have enough space.
>> */
>> union fpregs_state {
>> struct fregs_state fsave;
>> struct fxregs_state fxsave;
>> struct swregs_state soft;
>> struct xregs_state xsave;
>> u8 __padding[PAGE_SIZE];
>> };
>> In btf_struct_access(), the resolved size for task_struct is 10496, much bigger
>> than its runtime size, so the prog in reproducer passed the verifier and leads
>> to the oob. This can happen to all similar types, whose runtime size is smaller
>> than its static size.
>> Not sure how many similar cases are there, maybe special check to task_struct
>> is enough. Any hint on how this should be addressed?
>
> This should a corner case, I am not aware of other allocations like this.
>
> For a normal program, if the access chain looks
> like
> task_struct->thread_struct->fpu->fpstate->fpregs_state->{fsave,fxsave, soft, xsave},
> we should not hit this issue. So I think we don't need to address this
> issue in kernel. The test itself should filter this out.

Maybe I didn’t make my point clear. The issue here is that the runtime size
of task_struct is `arch_task_struct_size`, which equals to the following,
see fpu__init_task_struct_size():

sizeof(task_struct) - sizeof(fpregs_state) + fpu_kernel_cfg.default_size

However, the verifier validates task_struct access based on the ty info in
btf_vmlinux, which equals to sizeof(task_struct), much bigger than `arch_
task_struct_size` due to the `__padding[PAGE_SIZE]` in fpregs_state, this
leads to OOB access.

We should not allow progs that access the task_struct beyond `arch_task_
struct_size` to be loaded. So maybe we should add a fixup in `btf_parse_
vmlinux()` to assign the correct size to each type of this chain:
task_struct->thread_struct->fpu->fpstate->fpregs_state;
Or maybe we should fix this during generating BTF of vmlinux?