Fatal signal handling within uaccess faults

From: Mark Rutland
Date: Thu Jan 21 2021 - 07:35:22 EST


Hi all,

Arch maintainers, if you are Cc'd I believe your architecture has a
user-triggerable livelock; please see below for details. Apologies in
advance if I am mistaken!

I believe the following architectures ARE affected:

alpha, hexagon, ia64, m68k, microblaze, mips, nios2, openrisc,
parisc, riscv, sparc (32 & 64), xtensa

I believe the following architectures ARE NOT affected:

arc, arm, arm64, c6x, h8300, nds32, powerpc, s390, sh, x86

... and csky has a fix pending as of today.

The issue is that in the fault handling path, architectures have a check
with the shape:

| if (fault_signal_pending(fault, regs))
| 	return;

... where if a uaccess (e.g. get_user()) triggers a fault and there's a
fatal signal pending, the handler will return to the uaccess without
having performed a uaccess fault fixup. The CPU will then immediately
execute the uaccess instruction again and fault again, so it livelocks,
bouncing between that instruction and the fault handler.

The architectures (with an MMU) which are not affected apply the uaccess
fixup, and so return to the error handler for the uaccess, and make
forward progress. Usually, this looks something like:

| if (fault_signal_pending(fault, regs)) {
| 	if (!user_mode(regs))
| 		goto no_context; // or fixup_exception(...), etc
| 	return;
| }

I believe similar changes need to be made to each of the architectures
I've listed as affected above.
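
To make the difference between the two shapes concrete, here is a toy
userspace model of the retry loop. It is only a sketch: every name in it
(toy_regs, toy_fault_handler, toy_uaccess) is hypothetical and nothing
here is kernel code. It assumes the fatal signal is always pending and
that the access always faults, and it caps the retries so that the
"livelock" case terminates.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_regs {
	bool user_mode;     /* did the fault come from user mode? */
	bool fixup_applied; /* handler redirected us to the uaccess error path */
};

/*
 * Models the fault handler when a fatal signal is pending, i.e. the
 * branch taken when fault_signal_pending() is true.
 */
static void toy_fault_handler(struct toy_regs *regs, bool apply_fixup)
{
	if (apply_fixup && !regs->user_mode) {
		/* Stands in for "goto no_context" / fixup_exception(). */
		regs->fixup_applied = true;
		return;
	}
	/* Plain return: the faulting instruction will be retried. */
}

/*
 * Models a kernel-mode uaccess that faults on every attempt.
 * Returns -EFAULT once the error path is reached, or -ELOOP if the
 * access is still being retried after max_retries faults (on real
 * hardware there is no such cap, hence the livelock).
 */
static int toy_uaccess(bool apply_fixup, unsigned int max_retries,
		       unsigned int *faults)
{
	struct toy_regs regs = { .user_mode = false, .fixup_applied = false };

	*faults = 0;
	while (*faults < max_retries) {
		(*faults)++;
		toy_fault_handler(&regs, apply_fixup);
		if (regs.fixup_applied)
			return -EFAULT;
	}
	return -ELOOP;
}
```

With apply_fixup set, toy_uaccess() reaches the error path after a
single fault. Without it, the loop only ends because the model caps the
number of retries.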

This was previously reported back in August 2017:

https://lore.kernel.org/lkml/20170822102527.GA14671@leverpostej/

... but it looks like that wasn't sufficiently visible, so I'm poking
folk explicitly this time around.

I believe that this can be triggered with the test case below,
duplicated from the previous posting. If the architecture is affected,
it will not be possible to kill the test program with any signal.
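
For anyone who wants to script the check, a small harness along these
lines might help. It is a sketch, not something from the previous
posting: dies_on_sigkill() and check_reproducer() are hypothetical
helper names, and the settle and timeout values are arbitrary. It starts
the test program, gives it time to block in the uaccess, sends SIGKILL,
and polls to see whether the child actually exits.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Send SIGKILL to child `pid`, then poll for up to `timeout_ms` for it
 * to exit. Returns true if the child was reaped in time.
 */
static bool dies_on_sigkill(pid_t pid, int timeout_ms)
{
	const struct timespec step = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	int waited_ms;

	if (kill(pid, SIGKILL) < 0)
		return false;

	for (waited_ms = 0; waited_ms <= timeout_ms; waited_ms += 10) {
		int status;
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			return true;	/* child exited and was reaped */
		if (r < 0)
			return false;	/* waitpid() failed */
		nanosleep(&step, NULL);
	}
	return false;			/* still alive after the timeout */
}

/*
 * Fork and exec `path`, wait `settle_ms` for it to block, then try to
 * kill it. Returns 1 if it died, 0 if it survived SIGKILL, and -1 if it
 * could not be started or exited before the signal was sent.
 */
static int check_reproducer(const char *path, int settle_ms, int timeout_ms)
{
	struct timespec settle = {
		.tv_sec = settle_ms / 1000,
		.tv_nsec = (settle_ms % 1000) * 1000000L
	};
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execl(path, path, (char *)NULL);
		_exit(127);		/* exec failed */
	}

	nanosleep(&settle, NULL);
	if (waitpid(pid, &status, WNOHANG) != 0)
		return -1;		/* exited early, or waitpid() failed */

	return dies_on_sigkill(pid, timeout_ms) ? 1 : 0;
}
```

A result of 0 from check_reproducer() would suggest the architecture is
affected. Note that in that case the child is left spinning, so this is
best run on a machine that can be rebooted.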

Thanks,
Mark.

---->8----
#include <errno.h>
#include <linux/userfaultfd.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/vfs.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	void *mem;
	long pagesz;
	int uffd, ret;
	struct uffdio_api api = {
		.api = UFFD_API
	};
	struct uffdio_register reg;

	pagesz = sysconf(_SC_PAGESIZE);
	if (pagesz < 0) {
		return errno;
	}

	mem = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return errno;

	uffd = syscall(__NR_userfaultfd, 0);
	if (uffd < 0)
		return errno;

	ret = ioctl(uffd, UFFDIO_API, &api);
	if (ret < 0)
		return errno;

	reg = (struct uffdio_register) {
		.range = {
			.start = (unsigned long)mem,
			.len = pagesz
		},
		.mode = UFFDIO_REGISTER_MODE_MISSING
	};

	ret = ioctl(uffd, UFFDIO_REGISTER, &reg);
	if (ret < 0)
		return errno;

	/*
	 * Force an arbitrary uaccess to memory monitored by the userfaultfd.
	 * This will block, but when a SIGKILL is sent, will consume all
	 * available CPU time without being killed, and may inhibit forward
	 * progress of the system.
	 */
	ret = fstatfs(0, (struct statfs *)mem);

	return 0;
}