ring_buffer_attach && cond_synchronize_rcu (Was: percpu-rwsem: Optimize readers and reduce global impact)

From: Oleg Nesterov
Date: Sat May 30 2015 - 16:05:39 EST


On 05/30, Paul E. McKenney wrote:
>
> But it looks like you need the RCU-sched variant. Please see below for
> an untested patch providing this support. One benefit of this patch
> is that it does not add any bloat to Tiny RCU.

I don't think so, see another email. But perhaps I am totally confused,
please correct me.

Well, actually the first writer (need_sync == T) can use it, but it
does not make sense, I think. Because it calls sync() right after
it observes GP_IDLE and drops the lock, the window is too small.

> ------------------------------------------------------------------------
>
> rcu: Add RCU-sched flavors of get-state and cond-sync

However, to me this patch makes sense anyway. I just don't think rcu_sync
or percpu_rw_semaphore can use the new helpers.


Also, I tried to find other users of get_state/cond_sync. I found
ring_buffer_attach(), and it looks obviously buggy?

Again, perhaps I am totally confused, but don't we need to ensure
that we have "synchronize" _between_ list_del() and list_add() ?

IOW, suppose that ring_buffer_attach() is preempted right after
get_state_synchronize_rcu() and the gp completes before spin_lock().

In this case cond_synchronize_rcu() does nothing and we reuse
->rb_entry without waiting for gp in between?
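
To make the ordering concern concrete, here is a minimal userspace sketch.
It is a toy model under stated assumptions, not kernel code: a grace period
is collapsed into a single counter bump, and every toy_* name is
hypothetical and only stands in for the real helper named in its comment.
It compares taking the snapshot before the removal with taking it after.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a "grace period" is one bump of this counter. */
static unsigned long toy_gp_completed;
/* How many times toy_cond_sync() actually had to wait. */
static unsigned long toy_forced_waits;

/* Stand-in for get_state_synchronize_rcu(): snapshot the gp state. */
static unsigned long toy_get_state(void)
{
	return toy_gp_completed;
}

/* Stand-in for one grace period elapsing. */
static void toy_gp_elapse(void)
{
	toy_gp_completed++;
}

/*
 * Stand-in for cond_synchronize_rcu(): wait for a grace period only
 * if none has completed since the snapshot was taken.
 */
static void toy_cond_sync(unsigned long cookie)
{
	assert(toy_gp_completed >= cookie);
	if (toy_gp_completed == cookie) {
		toy_forced_waits++;
		toy_gp_elapse();
	}
}

/*
 * One detach/attach cycle in which a grace period completes inside
 * the window before the removal (the preemption case described above).
 * Returns true if a grace period elapsed between removal and re-add.
 */
static bool toy_reattach(bool snapshot_before_del)
{
	unsigned long cookie = 0, gp_at_del;

	if (snapshot_before_del)
		cookie = toy_get_state();

	toy_gp_elapse();		/* preempted here, a gp completes */

	gp_at_del = toy_gp_completed;	/* list_del_rcu() would run here */

	if (!snapshot_before_del)
		cookie = toy_get_state();

	toy_cond_sync(cookie);

	/* list_add_rcu() would run here */
	return toy_gp_completed > gp_at_del;
}
```

In this model toy_reattach(true) reports that no grace period separates
the removal from the re-add, while toy_reattach(false) forces one. The real
primitives track grace-period start and completion more carefully than a
single counter, so this only illustrates why the snapshot has to come after
the removal.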

Don't we need the patch below? (it also moves the ->rcu_pending check
under "if (rb)", to make it more readable imo).

Peter?

Oleg.

--- x/kernel/events/core.c
+++ x/kernel/events/core.c
@@ -4310,20 +4310,20 @@ static void ring_buffer_attach(struct pe
 		WARN_ON_ONCE(event->rcu_pending);

 		old_rb = event->rb;
-		event->rcu_batches = get_state_synchronize_rcu();
-		event->rcu_pending = 1;
-
 		spin_lock_irqsave(&old_rb->event_lock, flags);
 		list_del_rcu(&event->rb_entry);
 		spin_unlock_irqrestore(&old_rb->event_lock, flags);
-	}

-	if (event->rcu_pending && rb) {
-		cond_synchronize_rcu(event->rcu_batches);
-		event->rcu_pending = 0;
+		event->rcu_batches = get_state_synchronize_rcu();
+		event->rcu_pending = 1;
 	}

 	if (rb) {
+		if (event->rcu_pending) {
+			cond_synchronize_rcu(event->rcu_batches);
+			event->rcu_pending = 0;
+		}
+
 		spin_lock_irqsave(&rb->event_lock, flags);
 		list_add_rcu(&event->rb_entry, &rb->event_list);
 		spin_unlock_irqrestore(&rb->event_lock, flags);
