[patch V3 01/22] bpf: Tighten the requirements for preallocated hash maps

From: Thomas Gleixner
Date: Mon Feb 24 2020 - 10:05:25 EST


The assumption that only programs attached to perf NMI events can deadlock
on memory allocators is wrong. Assume the following simplified callchain:

kmalloc() from regular non BPF context
cache empty
freelist empty
lock(zone->lock);
tracepoint or kprobe
BPF()
update_elem()
lock(bucket)
kmalloc()
cache empty
freelist empty
lock(zone->lock); <- DEADLOCK

There are other ways which do not involve locking to create wreckage:

kmalloc() from regular non BPF context
local_irq_save();
...
obj = slab_first();
kprobe()
BPF()
update_elem()
lock(bucket)
kmalloc()
local_irq_save();
...
obj = slab_first(); <- Same object as above ...

So preallocation _must_ be enforced for all variants of intrusive
instrumentation.

Unfortunately immediate enforcement would break backwards compatibility, so
for now such programs still are allowed to run, but a one time warning is
emitted in dmesg and the verifier emits a warning in the verifier log as
well so developers are made aware about this and can fix their programs
before the enforcement becomes mandatory.

Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
V3: Still allow run-time allocation for !RT. Emit warnings. Split
out the RT part as this really should be backported to stable
kernels.
V2: New patch
---
kernel/bpf/verifier.c | 39 ++++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 11 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8143,26 +8143,43 @@ static bool is_tracing_prog_type(enum bp
}
}

+static bool is_preallocated_map(struct bpf_map *map)
+{
+ if (!check_map_prealloc(map))
+ return false;
+ if (map->inner_map_meta && !check_map_prealloc(map->inner_map_meta))
+ return false;
+ return true;
+}
+
static int check_map_prog_compatibility(struct bpf_verifier_env *env,
struct bpf_map *map,
struct bpf_prog *prog)

{
- /* Make sure that BPF_PROG_TYPE_PERF_EVENT programs only use
- * preallocated hash maps, since doing memory allocation
- * in overflow_handler can crash depending on where nmi got
- * triggered.
+ /*
+ * Validate that trace type programs use preallocated hash maps.
+ *
+ * For programs attached to PERF events this is mandatory as the
+ * perf NMI can hit any arbitrary code sequence.
+ *
+ * All other trace types using preallocated hash maps are unsafe as
+ * well because tracepoint or kprobes can be inside locked regions
+ * of the memory allocator or at a place where a recursion into the
+ * memory allocator would see inconsistent state.
+ *
+ * For now running such programs is allowed for backwards
+ * compatibility reasons, but warnings are emitted so developers are
+ * made aware of the unsafety and can fix their programs before this
+ * is enforced.
*/
- if (prog->type == BPF_PROG_TYPE_PERF_EVENT) {
- if (!check_map_prealloc(map)) {
+ if (is_tracing_prog_type(prog->type) && !is_preallocated_map(map)) {
+ if (prog->type == BPF_PROG_TYPE_PERF_EVENT) {
verbose(env, "perf_event programs can only use preallocated hash map\n");
return -EINVAL;
}
- if (map->inner_map_meta &&
- !check_map_prealloc(map->inner_map_meta)) {
- verbose(env, "perf_event programs can only use preallocated inner hash map\n");
- return -EINVAL;
- }
+ WARN_ONCE(1, "trace type BPF program uses run-time allocation\n");
+ verbose(env, "trace type programs with run-time allocated hash maps are unsafe. Switch to preallocated hash maps.\n");
}

if ((is_tracing_prog_type(prog->type) ||