Re: [RFC patch 00/19] bpf: Make BPF and PREEMPT_RT co-exist

From: Thomas Gleixner
Date: Fri Feb 14 2020 - 13:37:06 EST


David Miller <davem@xxxxxxxxxxxxx> writes:

> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Date: Fri, 14 Feb 2020 14:39:17 +0100
>
>> This is a follow up to the initial patch series which David posted a
>> while ago:
>>
>> https://lore.kernel.org/bpf/20191207.160357.828344895192682546.davem@xxxxxxxxxxxxx/
>>
>> which was (while non-functional on RT) a good starting point for further
>> investigations.
>
> This looks really good after a cursory review, thanks for doing this work.
>
> I was personally unaware of the pre-allocation rules for MAPs used by
> tracing et al. And that definitely shapes how this should be handled.

Hmm. I just noticed that my analysis only holds for PERF events. But
that's broken on mainline already.

Assume the following simplified callchain:

  kmalloc() from regular non BPF context
    cache empty
      freelist empty
        lock(zone->lock);
          tracepoint or kprobe
            BPF()
              update_elem()
                lock(bucket)
                  kmalloc()
                    cache empty
                      freelist empty
                        lock(zone->lock); <- DEADLOCK
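
The recursion in the callchain above can be modelled in plain user-space C.
This is only a toy sketch, not kernel code: toy_lock, toy_alloc_slowpath(),
toy_update_elem() and toy_run_scenario() are hypothetical names invented for
the illustration, the bucket lock is left out, and the "lock" merely reports
the nested acquire instead of spinning forever as a real non-recursive lock
would.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive lock such as zone->lock (hypothetical). */
struct toy_lock {
	bool held;
};

/* Returns false at the point where a real spinlock would self-deadlock. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

static struct toy_lock zone_lock;
static bool deadlock_detected;
static bool probe_attached;	/* models a tracepoint/kprobe running BPF */

static void toy_alloc_slowpath(void);

/* Models update_elem() on a map that is NOT preallocated: it allocates. */
static void toy_update_elem(void)
{
	toy_alloc_slowpath();
}

/* Models kmalloc() with both the cache and the freelist empty. */
static void toy_alloc_slowpath(void)
{
	if (!toy_lock_acquire(&zone_lock)) {
		/* lock(zone->lock) while this context already holds it */
		deadlock_detected = true;
		return;
	}
	if (probe_attached)
		toy_update_elem();	/* instrumentation fires under the lock */
	toy_lock_release(&zone_lock);
}

/* Runs the scenario once; returns true if the nested acquire was hit. */
bool toy_run_scenario(bool with_probe)
{
	zone_lock.held = false;
	deadlock_detected = false;
	probe_attached = with_probe;
	toy_alloc_slowpath();
	return deadlock_detected;
}
```

In this model toy_run_scenario(true) hits the nested acquire, while
toy_run_scenario(false), i.e. no instrumentation attached, does not.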

So really, preallocation _must_ be enforced for all variants of
intrusive instrumentation. There are no ifs or buts, it's simply mandatory
as all intrusive instrumentation has to follow the only sensible
principle: KISS = Keep It Safe and Simple.
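
To illustrate what preallocation buys here, below is a minimal single-threaded
user-space sketch, not the actual BPF map code. All names (prealloc_pool,
pool_elem, pool_init(), pool_get(), pool_put(), POOL_SIZE) are hypothetical.
The point is only that the update path takes elements from storage set up in
advance and never enters the allocator, so it cannot end up in the allocator's
locks. A real implementation would additionally need its free list to be safe
against concurrent and nested access.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4	/* hypothetical capacity, fixed at setup time */

struct pool_elem {
	struct pool_elem *next;
	int key;
	int value;
};

struct prealloc_pool {
	struct pool_elem storage[POOL_SIZE];	/* all memory reserved up front */
	struct pool_elem *free_list;
};

/* Setup time, in a safe context: chain every element onto the free list. */
void pool_init(struct prealloc_pool *p)
{
	assert(p != NULL);
	p->free_list = NULL;
	for (size_t i = 0; i < POOL_SIZE; i++) {
		p->storage[i].next = p->free_list;
		p->free_list = &p->storage[i];
	}
}

/*
 * Instrumentation path: pop an element from the free list. This never
 * calls an allocator; it returns NULL when the pool is exhausted.
 */
struct pool_elem *pool_get(struct prealloc_pool *p)
{
	struct pool_elem *e = p->free_list;

	if (e)
		p->free_list = e->next;
	return e;
}

/* Return an element to the pool so that it can be reused. */
void pool_put(struct prealloc_pool *p, struct pool_elem *e)
{
	e->next = p->free_list;
	p->free_list = e;
}
```

The trade-off is that an update fails once the pool is exhausted instead of
falling back to a runtime allocation.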

The above is a perfectly valid scenario and works with perf and tracing,
so it has to work with BPF in the same safe way.

I might be missing some magic enforcement of that, but I got lost in the
maze.

Thanks,

tglx