Re: [PATCH] static_key: fix concurrent static_key_slow_inc

From: Christian Borntraeger
Date: Wed Jun 22 2016 - 04:50:24 EST


On 06/21/2016 06:52 PM, Paolo Bonzini wrote:
> The following scenario is possible:
>
>     CPU 1                                   CPU 2
>     static_key_slow_inc
>      atomic_inc_not_zero
>       -> key.enabled == 0, no increment
>      jump_label_lock
>      atomic_inc_return
>       -> key.enabled == 1 now
>                                             static_key_slow_inc
>                                              atomic_inc_not_zero
>                                               -> key.enabled == 1, inc to 2
>                                              return
>                                             ** static key is wrong!
>      jump_label_update
>      jump_label_unlock
>
> Testing the static key at the point marked by (**) will follow the wrong
> path for jumps that have not been patched yet. This can actually happen
> when creating many KVM virtual machines with userspace LAPIC emulation;
> just run several copies of the following program:
>
>     #include <fcntl.h>
>     #include <unistd.h>
>     #include <sys/ioctl.h>
>     #include <linux/kvm.h>
>
>     int main(void)
>     {
>         for (;;) {
>             int kvmfd = open("/dev/kvm", O_RDONLY);
>             int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
>             close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
>             close(vmfd);
>             close(kvmfd);
>         }
>         return 0;
>     }
>
> Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc. The
> static key's purpose is to skip NULL pointer checks and indeed one of
> the processes eventually dereferences NULL.


Interesting. Some time ago I had a spurious bug on the preempt_notifier
when starting/stopping lots of guests, but I was never able to reliably
reproduce it. I was chasing some other bug, so I did not even consider
that static_key might be broken, but this might actually be the fix for
that problem.
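
For anyone following along, the interleaving quoted above can be replayed
deterministically in userspace. This is only a sketch of the scenario, not
kernel code: struct model_key, struct model_result, model_inc_not_zero()
and replay_race() are made-up names, the "patched" flag merely stands in
for the jump sites having been rewritten, and jump_label_lock is not
modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the quoted scenario; not kernel code. */
struct model_key {
	atomic_int enabled;	/* stands in for key.enabled */
	bool patched;		/* true once the jump sites are rewritten */
};

struct model_result {
	int enabled_at_mark;	/* counter value seen at (**) */
	bool patched_at_mark;	/* were the jump sites patched at (**)? */
};

/* Increment *v unless it is zero, like the kernel's atomic_inc_not_zero(). */
static bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Replay the interleaving step by step on a single thread. */
static struct model_result replay_race(void)
{
	struct model_key k;
	struct model_result r;
	bool cpu1_fast, cpu2_fast;

	atomic_init(&k.enabled, 0);
	k.patched = false;

	/* CPU 1: fast path fails because the counter is still 0. */
	cpu1_fast = model_inc_not_zero(&k.enabled);
	assert(!cpu1_fast);
	/* CPU 1: would take jump_label_lock here, then bumps the counter. */
	atomic_fetch_add(&k.enabled, 1);

	/* CPU 2: fast path succeeds (1 -> 2) and returns without the lock. */
	cpu2_fast = model_inc_not_zero(&k.enabled);
	assert(cpu2_fast);

	/* (**) CPU 2 now relies on the key, but nothing is patched yet. */
	r.enabled_at_mark = atomic_load(&k.enabled);
	r.patched_at_mark = k.patched;

	/* CPU 1: jump_label_update, then jump_label_unlock. */
	k.patched = true;

	return r;
}
```

Calling replay_race() gives enabled_at_mark == 2 together with
patched_at_mark == false. That is the window at (**): the counter says
the key is on, while any jump site that has not been patched yet still
takes the old path.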