Agree :)

Your patch doesn't cure the problem.
rcu_read_lock just disables preemption and rcu_dereference
introduces a memory barrier. _None_ of this _prevents_
another CPU from freeing the old real_parent in parallel with your dereference.
How so? Note that release_task() doesn't call put_task_struct(), it does
call_rcu(&p->rcu, delayed_put_task_struct) instead. When delayed_put_task_struct()
is called, all CPUs must see the new value of ->real_parent (otherwise
RCU is just broken). If a CPU sees the old value of ->real_parent, rcu_read_lock()
protects us from delayed_put_task_struct() on another CPU.
Ok, I think this is the same "classic" pattern as:

	old = global_ptr;
	global_ptr = new;
	call_rcu(..free_old...);

vs

	rcu_read_lock();
	use(global_ptr);
	rcu_read_unlock();

Do you agree?
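
For illustration only, here is a single-threaded toy model of that pattern in plain C. It is not the kernel's RCU: every toy_* name is hypothetical, the "grace period" is simply "no read-side section is open" (stricter than real RCU, which waits only for readers that started before call_rcu()), and the free is recorded in a flag rather than performed, so the ordering can be checked. It only shows the point argued above: a reader that fetched the old pointer inside its read-side section cannot have that object reclaimed underneath it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, no concurrency, not kernel code. */
struct toy_task {
	int id;
	bool freed;                    /* set instead of calling free() */
	struct toy_task *pending_next; /* link in the deferred-free list */
};

static int toy_readers;              /* nesting depth of read-side sections */
static struct toy_task *toy_pending; /* objects queued by toy_call_rcu() */

/* Run every deferred "free"; only called when no reader is active. */
static void toy_run_callbacks(void)
{
	while (toy_pending) {
		struct toy_task *t = toy_pending;

		toy_pending = t->pending_next;
		t->freed = true;
	}
}

static void toy_rcu_read_lock(void)
{
	toy_readers++;
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_readers > 0);
	if (--toy_readers == 0)
		toy_run_callbacks(); /* the toy "grace period" ends here */
}

/* Stand-in for call_rcu(): defer reclamation until readers are gone. */
static void toy_call_rcu(struct toy_task *old)
{
	old->pending_next = toy_pending;
	toy_pending = old;
	if (toy_readers == 0)
		toy_run_callbacks();
}

/* Updater side: publish the new pointer first, then defer the free. */
static void toy_reparent(struct toy_task **slot, struct toy_task *new_parent)
{
	struct toy_task *old = *slot;

	*slot = new_parent;
	toy_call_rcu(old);
}
```

If a reader calls toy_rcu_read_lock(), loads the pointer, and toy_reparent() then runs, the old object's freed flag stays false until the matching toy_rcu_read_unlock(). Calling toy_reparent() with no reader active marks the old object freed immediately.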