Re: BUG: unable to handle kernel paging request in bpf_int_jit_compile

From: Ingo Molnar
Date: Sun Jun 24 2018 - 06:03:00 EST



* David Miller <davem@xxxxxxxxxxxxx> wrote:

> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Date: Sun, 24 Jun 2018 09:09:09 +0200 (CEST)
>
> > I'm really tempted to make the BPF config switch depend on BROKEN.
>
> This really isn't necessary Thomas.
>
> Whoever wrote the code didn't understand that set ro can legitimately
> fail.

No, that's *NOT* the only thing that happened, according to the Git history.

The first use of set_memory_ro() in include/linux/filter.h was added by
this commit almost four years ago:

# 2014/09
60a3b2253c41 ("net: bpf: make eBPF interpreter images read-only")

... and yes, that commit didn't anticipate the (in hindsight) obvious property of
a function that changes global kernel mappings that if it is used after bootup
without locking it 'may fail'. So that commit slipping through is 'shit happens'
and I don't think we ever complained about such things slipping through.

But what happened after that is not so good:

A bit over two years later a crash was found:

Eric and Willem reported that they recently saw random crashes when
JIT was in use and bisected this to 74451e66d516 ("bpf: make jited
programs visible in traces"). Issue was that the consolidation part
added bpf_jit_binary_unlock_ro() that would unlock previously made
read-only memory back to read-write.

... but instead of fixing it for real, it was only tinkered with:

# 2017//02
9d876e79df6a ("bpf: fix unlocking of jited image when module ronx not set")

... but the problems persisted:

Improve bpf_{prog,jit_binary}_{un,}lock_ro() by throwing a
one-time warning in case of an error when the image couldn't
be set read-only, and also mark struct bpf_prog as locked when
bpf_prog_lock_ro() was called.

... so the warnings Thomas complained about here were then added a month later:

# 2017/03
65869a47f348 ("bpf: improve read-only handling")

It 'improved' nothing of the sort, and the warnings and 'debug code' shows that
the author was aware that these functions could actually fail. To quote the fine
code, introduced a year ago:

WARN_ON_ONCE(set_memory_rw((unsigned long)fp, fp->pages));
/* In case set_memory_rw() fails, we want to be the first
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* to crash here instead of some random place later on.
*/
fp->locked = 0;

... and then, this month, it was tweaked *YET ANOTHER TIME*:

bpf: reject any prog that failed read-only lock

We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
as well as the BPF image as read-only through bpf_prog_lock_ro(). In
the case any of these would fail we throw a WARN_ON_ONCE() in order to
yell loudly to the log. Perhaps, to some extend, this may be comparable
to an allocation where __GFP_NOWARN is explicitly not set.

# 2018/06
9facc336876f ("bpf: reject any prog that failed read-only lock")

The tone of uncertainty of the changelog, combined with the unfixed typo in it,
suggests that this commit too was just waved through to upstream without any real
review and without much design thinking behind it.

And yes, this was still not the right fix, as the fuzzer crash reported in this
thread outlines - we'll probably need a 5th commit?

> So let's correct that instead of flaming a feature.

So accusing Thomas of 'flaming a feature' is a really unfair attack in light of
all the details above.

Thanks,

Ingo