Re: kernel panic: Attempted to kill init!

From: Alexei Starovoitov
Date: Tue Jan 03 2023 - 13:34:08 EST


On Tue, Jan 3, 2023 at 4:46 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote:
>
>
>
> > On 31 Dec 2022, at 12:55 AM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Fri, Dec 30, 2022 at 1:54 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >>> On 28 Dec 2022, at 2:35 PM, Yonghong Song <yhs@xxxxxxxx> wrote:
> >>>
> >>>
> >>>
> >>> On 12/21/22 8:35 PM, Hao Sun wrote:
> >>>> Hi,
> >>>> This crash can be triggered by running the C reproducer
> >>>> multiple times; it just keeps loading the following prog as
> >>>> a raw tracepoint attached to kmem_cache_free().
> >>>> The prog sends SIGSEGV to current via bpf_send_signal_thread().
> >>>> Once it is loaded, whoever tries to free memory triggers it, and
> >>>> the kernel crashes when that happens to be init.
> >>>> It seems we should filter init out in bpf_send_signal_common()
> >>>> with is_global_init(current), or maybe we should check this in
> >>>> the verifier?
> >>>
> >>> The helper just sends a particular signal to the *current*
> >>> thread. In the typical use case, it is never a good idea to
> >>> send the signal to a *random* thread. In certain cases, the
> >>> user may indeed want to send the signal to the init thread to
> >>> observe something. Note that such destructive side effects
> >>> already exist in bpf land. For example, an XDP program
> >>> could drop all packets and make the machine unresponsive
> >>> to ssh etc. Therefore, I recommend keeping the existing
> >>> bpf_send_signal_common() helper behavior.
> >>
> >> Sounds like the two are different cases. An unresponsive machine
> >> under XDP seems like intended behaviour, while a panic caused by
> >> killing init is a bug. If the last thread of global init is killed,
> >> the kernel panics immediately.
> >
> > I don't get it. How was it possible that this prog was
> > executed with current == pid 1 ?
>
> The prog is a raw tracepoint attached to the ‘kmem_cache_free’ event.
> When init triggers the event, the prog is executed with current
> being pid 1. But the reason for this crash is not very clear to me,
> because it’s really hard to debug with the original C reproducer.
>
> The following is the corresponding Syz prog:
>
> # {Threaded:true Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:none SandboxArg:0 Leak:false NetInjection:true NetDevices:true NetReset:true Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:true Sysctl:true UseTmpDir:true HandleSegv:true Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
> r0 = bpf$BPF_PROG_RAW_TRACEPOINT_LOAD(0x5, &(0x7f0000000000)={0x11, 0xe, &(0x7f0000000400)=ANY=[@ANYBLOB="18000000000000000000000000000000180600000000000000000000000000001807000000000000000000000000000018080000000000000000000000000000180900000000000000000000000000002d00020000000000b70100000b000000850000007500000095"], &(0x7f00000000c0)}, 0x80)
> bpf$BPF_RAW_TRACEPOINT_OPEN(0x11, &(0x7f0000000100)={&(0x7f0000000080)='kmem_cache_free\x00', r0}, 0x10)
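
For reference, here is a rough sketch of how the tail of that blob decodes,
assuming the standard 8-byte little-endian eBPF instruction layout. The
struct and function names below are made up for illustration and are not
kernel code. The zero padding after the final 0x95 byte is my assumption,
since the blob ends right after that opcode. If I read include/uapi/linux/bpf.h
correctly, helper 117 is bpf_send_signal_thread, and 11 is SIGSEGV on x86.

```c
#include <stdint.h>

/* One decoded 8-byte eBPF instruction (illustrative struct, not the
 * kernel's struct bpf_insn). */
struct decoded_insn {
    uint8_t code;    /* opcode byte */
    uint8_t dst_reg; /* low nibble of byte 1 */
    uint8_t src_reg; /* high nibble of byte 1 */
    int16_t off;     /* signed offset, little-endian */
    int32_t imm;     /* signed immediate, little-endian */
};

/* Decode one instruction from 8 raw little-endian bytes. */
static struct decoded_insn decode_insn(const uint8_t b[8])
{
    struct decoded_insn d;
    uint32_t imm = (uint32_t)b[4] | ((uint32_t)b[5] << 8) |
                   ((uint32_t)b[6] << 16) | ((uint32_t)b[7] << 24);

    d.code = b[0];
    d.dst_reg = b[1] & 0x0f;
    d.src_reg = b[1] >> 4;
    d.off = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
    d.imm = (int32_t)imm;
    return d;
}

/* Last four instructions of the blob. The final one is padded with
 * zeros here (assumption: the blob stops after the 0x95 byte). */
static const uint8_t blob_tail[4][8] = {
    { 0x2d, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00 }, /* if r0 > r0 goto +2 */
    { 0xb7, 0x01, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00 }, /* r1 = 11 */
    { 0x85, 0x00, 0x00, 0x00, 0x75, 0x00, 0x00, 0x00 }, /* call helper 117 */
    { 0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, /* exit */
};
```

Calling decode_insn(blob_tail[i]) for i = 0..3 gives the fields noted in the
comments. The ten slots before these are five ld_imm64 instructions that load
zero into r0 and r6..r9, which accounts for the insn count of 0xe.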

Is syzbot running without any user space?
Is syzbot itself pid=1, and the only process?
If so, the error would make sense.
I guess we can add a safety check to bpf_send_signal_common
to prevent syzbot from killing itself.
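
The shape of such a check could look roughly like the user-space model
below. This is only a sketch to show the logic, not a patch: task_model and
the model_* functions are hypothetical stand-ins, and the real change would
use is_global_init(current) inside bpf_send_signal_common(). Returning
-EPERM is my assumption about a reasonable error code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bits of task_struct that matter here. */
struct task_model {
    int tgid; /* thread group id; global init has tgid 1 */
};

/* Models the kernel's is_global_init(): true for the init thread group. */
static bool model_is_global_init(const struct task_model *t)
{
    return t->tgid == 1;
}

/* Models the proposed early exit: refuse to signal global init,
 * otherwise pretend the signal was queued. */
static int model_send_signal_common(const struct task_model *cur, int sig)
{
    if (model_is_global_init(cur))
        return -EPERM; /* assumed error code */
    (void)sig; /* actual signal delivery is omitted in this model */
    return 0;
}
```

With this, a prog running with current == pid 1 gets an error back from the
helper instead of taking down init, and every other task behaves as before.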