[PATCH v3 0/2] Handle seccomp notification preemption

From: Sargun Dhillon
Date: Thu Apr 28 2022 - 22:32:27 EST


This patchset addresses a race condition we've recently run into with the
seccomp user notifier: programs interrupting syscalls while they're in
progress. This was exacerbated by Golang's[1] recent adoption of
"Non-cooperative goroutine preemption", in which the runtime tries to
interrupt any syscall that's been running for more than 10ms. Some syscalls
(mount, for example) are non-trivial to handle in a reentrant manner in
userspace.

The series adds a per-filter flag that makes the notifying process switch to
"TASK_KILLABLE", as opposed to returning to userspace on non-fatal signals.
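
For illustration, here is a rough sketch of how a supervisor's setup code
might opt in to the new behaviour. It assumes the flag is exposed as
SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV with value (1UL << 5), which is how it
appears in mainline uapi headers; this cover letter does not name it, so treat
the name and value as assumptions. It also assumes the flag is only accepted
together with SECCOMP_FILTER_FLAG_NEW_LISTENER. The helper names
notify_filter_flags() and install_notify_filter() are made up for the example.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older headers. The values are assumptions taken from
 * mainline uapi headers, not from this cover letter. */
#ifndef SECCOMP_FILTER_FLAG_NEW_LISTENER
#define SECCOMP_FILTER_FLAG_NEW_LISTENER (1UL << 3)
#endif
#ifndef SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV
#define SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV (1UL << 5)
#endif
#ifndef SECCOMP_RET_USER_NOTIF
#define SECCOMP_RET_USER_NOTIF 0x7fc00000U
#endif

/* Hypothetical helper: build the flags argument for seccomp(2).
 * The killable-wait flag is requested on top of a notification listener. */
unsigned long notify_filter_flags(int wait_killable)
{
	unsigned long flags = SECCOMP_FILTER_FLAG_NEW_LISTENER;

	if (wait_killable)
		flags |= SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV;
	return flags;
}

/* Hypothetical helper: install a filter that forwards one syscall number
 * to the user notifier and allows everything else. Returns the listener
 * fd, or -1 with errno set. A real filter should also check
 * seccomp_data.arch; that is omitted here for brevity. */
int install_notify_filter(int syscall_nr, int wait_killable)
{
	struct sock_filter insns[] = {
		/* Load the syscall number. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* On a match fall through to USER_NOTIF, otherwise skip it. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
			 (unsigned int)syscall_nr, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_USER_NOTIF),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(insns) / sizeof(insns[0]),
		.filter = insns,
	};

	/* Needed so an unprivileged task may install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	return (int)syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER,
			    notify_filter_flags(wait_killable), &prog);
}
```

A container runtime would call something like
install_notify_filter(SYS_mount, 1) in the target task before exec and hand
the returned fd to the supervisor. On a kernel without this series the call
is expected to fail because the flag is unknown, so callers may want to retry
with wait_killable set to 0.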

Changes since v2[3]:
* Split out addfd patches
* Move the flag to be per-filter (as opposed to per notification)

Changes since v1[2]:
* Fix some documentation
* Add Rata's patches to allow for direct return from addfd

[1]: https://github.com/golang/proposal/blob/master/design/24543-non-cooperative-preemption.md
[2]: https://lore.kernel.org/lkml/20210220090502.7202-1-sargun@xxxxxxxxx/
[3]: https://lore.kernel.org/all/20210426180610.2363-1-sargun@xxxxxxxxx/

Sargun Dhillon (2):
  seccomp: Add wait_killable semantic to seccomp user notifier
  selftests/seccomp: Add test for wait killable notifier

 .../userspace-api/seccomp_filter.rst          |   8 +
 include/linux/seccomp.h                       |   3 +-
 include/uapi/linux/seccomp.h                  |   2 +
 kernel/seccomp.c                              |  42 ++-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 240 ++++++++++++++++++
 5 files changed, 292 insertions(+), 3 deletions(-)

--
2.25.1