Re: [PATCH] kvm: fix waitqueue_active without memory barrier invirt/kvm/async_pf.c

From: Kosuke Tatsukawa
Date: Fri Oct 09 2015 - 05:05:14 EST


Paolo Bonzini wrote:
> On 09/10/2015 02:35, Kosuke Tatsukawa wrote:
>> async_pf_execute kvm_vcpu_block
>> ------------------------------------------------------------------------
>> spin_lock(&vcpu->async_pf.lock);
>> if (waitqueue_active(&vcpu->wq))
>> /* The CPU might reorder the test for
>> the waitqueue up here, before
>> prior writes complete */
>> prepare_to_wait(&vcpu->wq, &wait,
>> TASK_INTERRUPTIBLE);
>> /*if (kvm_vcpu_check_block(vcpu) < 0) */
>> /*if (kvm_arch_vcpu_runnable(vcpu)) { */
>> ...
>> return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
>> !vcpu->arch.apf.halted)
>> || !list_empty_careful(&vcpu->async_pf.done)
>> ...
>
> The new memory barrier isn't "paired" with any other, and in
> fact I think that the same issue exists on the other side:
> list_empty_careful(&vcpu->async_pf.done) may be reordered up,
> before the prepare_to_wait:

smp_store_mb() called from set_current_state(), which is called from
prepare_to_wait() should prevent reordering such as below from
happening. wait_event*() also calls set_current_state() inside.


> spin_lock(&vcpu->async_pf.lock);
> (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
> !vcpu->arch.apf.halted)
> || !list_empty_careful(&vcpu->async_pf.done)
> ...
> prepare_to_wait(&vcpu->wq, &wait,
> TASK_INTERRUPTIBLE);
> /*if (kvm_vcpu_check_block(vcpu) < 0) */
> /*if (kvm_arch_vcpu_runnable(vcpu)) { */
> ...
> return 0;
> list_add_tail(&apf->link,
> &vcpu->async_pf.done);
> spin_unlock(&vcpu->async_pf.lock);
> waited = true;
> schedule();
> if (waitqueue_active(&vcpu->wq))
>
> So you need another smp_mb() after prepare_to_wait(). I'm not sure
> if it's needed also for your original tty report, but I think it is
> for https://lkml.org/lkml/2015/10/8/989 ("mei: fix waitqueue_active
> without memory barrier in mei drivers").
>
> I wonder if it makes sense to introduce smp_mb__before_spin_lock()
> and smp_mb__after_spin_unlock(). On x86 the former could be a
> simple compiler barrier, and on s390 both of them could. But that
> should be a separate patch.

The problem on the waiter side occurs if add_wait_queue() or
__add_wait_queue() is directly called without memory barriers nor locks
protecting it.


> Thanks,
>
> Paolo
---
Kosuke TATSUKAWA | 3rd IT Platform Department
| IT Platform Division, NEC Corporation
| tatsu@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/