Re: [PATCH 8/8] fs/aio: Use rcu_work instead of explicit rcu and work item

From: Oleg Nesterov
Date: Thu Mar 22 2018 - 07:24:21 EST


On 03/21, Tejun Heo wrote:
>
> Hello,
>
> On Wed, Mar 21, 2018 at 06:17:43PM +0100, Oleg Nesterov wrote:
> > Mostly I am asking because I do not really understand
> > "[PATCH 6/8] RCU, workqueue: Implement rcu_work".
> >
> > I mean, the code looks simple and correct but why does it play with
> > WORK_STRUCT_PENDING_BIT? IOW, I do not see a "good" use-case when 2 or more
> > queue_rcu_work()'s can use the same rwork and hit work_pending() == T. And
> > what the caller should do if queue_rcu_work() returns false?
>
> It's just following standard workqueue conventions.

OK. And I agree that the usage of WORK_STRUCT_PENDING_BIT in queue_rcu_work() is
fine. If nothing else the kernel won't crash if you call queue_rcu_work() twice.

But why flush_rcu_work() can't simply do flush_work() ?

If WORK_STRUCT_PENDING_BIT was set by us (rcu_work_rcufn() succeeded) we do not
need rcu_barrier(). Why should we care about other rcu callbacks?

If rcu_work_rcufn() fails and someone else sets PENDING, how this rcu_barrier()
can help? We didn't even do call_rcu() in this case.

In short. Once flush_work() returns we know that rcu callback which queued this
work is finished. It doesn't matter if it was fired by us or not. And if it was
not fired by us, then rcu_barrier() doesn't imply a grace period anyway.


> We can try to
> make it more specialized but then flush_rcu_work()'s behavior would
> have to different too and it gets confusing quickly. Unless there are
> overriding reasons to deviate, I'd like to keep it consistent.

Perhaps this all is consistent, but I fail to understand this API :/

And again, at least for fs/aio.c it doesn't offer too much but sub-optimal
compared to call_rcu() + schedule_work() by hand.

Oleg.