Re: kvm: use-after-free in process_srcu

From: Paolo Bonzini
Date: Tue Jan 17 2017 - 06:09:52 EST




On 17/01/2017 10:56, Dmitry Vyukov wrote:
>> I am seeing use-after-frees in process_srcu as struct srcu_struct is
>> already freed. Before freeing struct srcu_struct, code does
>> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do:
>>
>> + srcu_barrier(&kvm->irq_srcu);
>> cleanup_srcu_struct(&kvm->irq_srcu);
>>
>> It reduced the rate of use-after-frees, but did not eliminate them
>> completely. The full thread is here:
>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ
>>
>> Does Paolo's fix above make sense to you? Namely adding
>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()?
>
> I am not sure about interaction of flush_delayed_work and
> srcu_reschedule... flush_delayed_work probably assumes that no work is
> queued concurrently, but what if srcu_reschedule queues another work
> concurrently... can't it happen that flush_delayed_work will miss that
> newly scheduled work?

Newly scheduled callbacks would be a bug in SRCU usage, but my patch is
indeed insufficient. Because of SRCU's two-phase algorithm, it's possible
that the first flush_delayed_work doesn't invoke all callbacks. Instead I
would propose this (still untested, but this time with a commit message):

---------------- 8< --------------
From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up"

Even though there are no concurrent readers, it is possible that the
work item is queued for delayed processing when cleanup_srcu_struct is
called. The work item needs to be flushed before returning, or a
use-after-free can ensue.

Furthermore, because of SRCU's two-phase algorithm it may take up to
two executions of srcu_advance_batches before all callbacks are invoked.
This can happen if the first flush_delayed_work happens as follows:

    srcu_read_lock
                            process_srcu
                              srcu_advance_batches
                                ...
                                if (!try_check_zero(sp, idx^1, trycount))
                                  // there is a reader
                                  return;
                              srcu_invoke_callbacks
                              ...
    srcu_read_unlock
    cleanup_srcu_struct
      flush_delayed_work
                              srcu_reschedule
                                queue_delayed_work

Now flush_delayed_work returns but srcu_reschedule will *not* have cleared
sp->running to false.

Not-tested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index 9b9cdd549caa..9470f1ba2ef2 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
 {
 	if (WARN_ON(srcu_readers_active(sp)))
 		return; /* Leakage unless caller handles error. */
+
+	/*
+	 * No readers active, so any pending callbacks will rush through the two
+	 * batches before sp->running becomes false. No risk of busy-waiting.
+	 */
+	while (sp->running)
+		flush_delayed_work(&sp->work);
+
 	free_percpu(sp->per_cpu_ref);
 	sp->per_cpu_ref = NULL;
 }
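
To illustrate why a single flush is not enough, here is a minimal
userspace sketch of the situation described in the commit message. It is
a toy model and not kernel code: struct model_srcu and the model_*
functions are hypothetical names invented for this example, and the
assumption that each execution of the work item completes at most one
phase is a simplification of what srcu_advance_batches really does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_srcu {
	bool running;      /* stands in for sp->running */
	bool work_queued;  /* a delayed work item is pending */
	int pending_cbs;   /* callbacks not yet invoked */
	int phases_left;   /* phases still to complete before callbacks run */
};

/*
 * One execution of the work item. Assumption: it completes at most one
 * phase, then either invokes the callbacks or requeues itself.
 */
static void model_process(struct model_srcu *sp)
{
	sp->work_queued = false;
	if (sp->phases_left > 0)
		sp->phases_left--;

	if (sp->phases_left == 0) {
		sp->pending_cbs = 0;     /* all callbacks invoked */
		sp->running = false;
	} else {
		sp->work_queued = true;  /* models the requeue in srcu_reschedule */
	}
}

/* Models flush_delayed_work: run the work item once if it is queued. */
static void model_flush(struct model_srcu *sp)
{
	if (sp->work_queued)
		model_process(sp);
}

/* Models the proposed loop; returns how many flushes were needed. */
static int model_cleanup(struct model_srcu *sp)
{
	int flushes = 0;

	while (sp->running) {
		/* In this model, running implies the work item is queued. */
		assert(sp->work_queued);
		model_flush(sp);
		flushes++;
	}
	return flushes;
}
```

Starting from running = true, work_queued = true, pending_cbs = 3 and
phases_left = 2, a single model_flush() leaves running set and all three
callbacks pending, whereas model_cleanup() needs two flushes and returns
with running cleared and no callbacks left.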


Thanks,

Paolo