Re: [RESEND PATCH v7 1/8] kernfs: Introduce interface to access global kernfs_open_file_mutex.

From: Al Viro
Date: Tue Apr 05 2022 - 17:04:39 EST


On Tue, Apr 05, 2022 at 03:36:03PM +1000, Imran Khan wrote:
> Hello Al,
>
> On 18/3/22 8:34 am, Al Viro wrote:
> > On Thu, Mar 17, 2022 at 06:26:05PM +1100, Imran Khan wrote:
> >
> >> @@ -570,9 +571,10 @@ static void kernfs_put_open_node(struct kernfs_node *kn,
> >> struct kernfs_open_file *of)
> [...]
>
> > As a matter of fact, we can do even better - make freeing
> > that thing rcu-delayed, use rcu_assign_pointer() for stores,
> > rcu_dereference() for loads and have kernfs_notify() do
> > 	rcu_read_lock();
> > 	on = rcu_dereference(kn->attr.open);
> > 	if (on) {
> > 		atomic_inc(&on->event);
> > 		wake_up_interruptible(&on->poll);
> > 	}
> > 	rcu_read_unlock();
> > and kernfs_open_node_lock becomes useless - all places that
> > grab it are under kernfs_open_file_mutex.
>
> There are some issues in freeing ->attr.open under RCU callback.

Such as?

> There
> are some users of ->attr.open that can block and hence can't operate
> under rcu_read_lock. For example in kernfs_drain_open_files we are
> traversing list of open files corresponding to ->attr.open and unmapping
> those as well. The unmapping operation can block in i_mmap_lock_write.

Yes.

> So even after removing refcnt we will still need kernfs_open_node_lock.

What for? Again, have kernfs_drain_open_files() do this:
{
	struct kernfs_open_node *on;
	struct kernfs_open_file *of;

	if (!(kn->flags & (KERNFS_HAS_MMAP | KERNFS_HAS_RELEASE)))
		return;
	// unlocked NULL check only; the pointer is not dereferenced here
	if (rcu_access_pointer(kn->attr.open) == NULL)
		return;
	mutex_lock(&kernfs_open_file_mutex);
	// now ->attr.open is stable (all stores are under kernfs_open_file_mutex)
	on = rcu_dereference_protected(kn->attr.open,
				lockdep_is_held(&kernfs_open_file_mutex));
	if (!on) {
		mutex_unlock(&kernfs_open_file_mutex);
		return;
	}
	// on->files contents are stable
	list_for_each_entry(of, &on->files, list) {
		struct inode *inode = file_inode(of->file);

		if (kn->flags & KERNFS_HAS_MMAP)
			unmap_mapping_range(inode->i_mapping, 0, 0, 1);

		if (kn->flags & KERNFS_HAS_RELEASE)
			kernfs_release_file(kn, of);
	}
	mutex_unlock(&kernfs_open_file_mutex);
}

What's the problem? The caller has already guaranteed that no additions will
happen. Once we've grabbed kernfs_open_file_mutex, we know that
	* kn->attr.open value won't change until we drop the mutex
	* nothing gets removed from kn->attr.open->files until we drop the mutex
so we can bloody well walk that list, blocking as much as we want.

We don't need rcu_read_lock() there - we are already holding the mutex used
by writers for exclusion among themselves. RCU *allows* lockless readers,
it doesn't require all readers to be such. kernfs_notify() can be made
lockless, this one can't and that's fine.
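
To make the two kinds of readers concrete, here is a rough userspace analogue, not kernel code. It uses C11 atomics and a pthread mutex in place of rcu_assign_pointer()/rcu_dereference() and kernfs_open_file_mutex. All names in it (open_node, open_file, attr_open, publish_open_node, notify_open_node, drain_open_files) are hypothetical stand-ins. It deliberately does not model grace periods: the node is never freed, which is the only reason the lockless reader is safe here. In the kernel that guarantee would have to come from rcu_read_lock() plus RCU-delayed freeing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernfs_open_file / kernfs_open_node. */
struct open_file {
	struct open_file *next;
	int released;
};

struct open_node {
	atomic_int event;
	struct open_file *files;
};

/* Writers exclude each other with this mutex. */
static pthread_mutex_t open_file_mutex = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(struct open_node *) attr_open;

/* Writer: all stores to attr_open happen under the mutex. */
static void publish_open_node(struct open_node *on)
{
	assert(on != NULL);
	pthread_mutex_lock(&open_file_mutex);
	atomic_store_explicit(&attr_open, on, memory_order_release);
	pthread_mutex_unlock(&open_file_mutex);
}

/* Lockless reader: never takes the mutex, never blocks. */
static int notify_open_node(void)
{
	struct open_node *on =
		atomic_load_explicit(&attr_open, memory_order_acquire);

	if (!on)
		return 0;
	atomic_fetch_add(&on->event, 1);
	return 1;
}

/* Mutex-holding reader: may block for as long as it likes. */
static int drain_open_files(void)
{
	struct open_node *on;
	struct open_file *of;
	int n = 0;

	/* cheap unlocked NULL check, nothing is dereferenced */
	if (!atomic_load_explicit(&attr_open, memory_order_relaxed))
		return 0;
	pthread_mutex_lock(&open_file_mutex);
	/* pointer and list are stable while the writers' mutex is held */
	on = atomic_load_explicit(&attr_open, memory_order_relaxed);
	if (on) {
		for (of = on->files; of; of = of->next) {
			of->released = 1;	/* stands in for unmap/release */
			n++;
		}
	}
	pthread_mutex_unlock(&open_file_mutex);
	return n;
}
```

The point of the sketch is only the split: notify_open_node() relies on the publish/load ordering and nothing else, while drain_open_files() relies on the mutex and so has no need for a read-side critical section.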

BTW, speaking of kernfs_notify() - can calls of that come from NMI handlers?
If not, I'd consider using llist for kernfs_notify_list...
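
For reference, the shape of what llist would buy is roughly the following. This is a userspace sketch with C11 atomics, assuming the same basic scheme as the kernel's llist_add()/llist_del_all() (a cmpxchg loop for the push, an xchg to take the whole list). The names lnode, lhead, lpush and ltake_all are made up for the illustration. The NMI question matters because, as far as I recall, llist is only usable from NMI handlers on architectures with an NMI-safe cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace analogues of llist_node / llist_head. */
struct lnode {
	struct lnode *next;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

/*
 * Lock-free push at the head.  Returns 1 if the list was empty
 * before the push, which is when a consumer would need a kick.
 */
static int lpush(struct lhead *h, struct lnode *n)
{
	struct lnode *old;

	assert(n != NULL);
	old = atomic_load_explicit(&h->first, memory_order_relaxed);
	do {
		/* on failure the CAS reloads 'old', so relink and retry */
		n->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&h->first, &old, n,
							memory_order_release,
							memory_order_relaxed));
	return old == NULL;
}

/* Detach the whole list in one atomic exchange; result is LIFO order. */
static struct lnode *ltake_all(struct lhead *h)
{
	return atomic_exchange_explicit(&h->first, (struct lnode *)NULL,
					memory_order_acquire);
}
```

Producers would only ever call lpush(), and the worker would call ltake_all() and walk the detached batch privately, so no lock is needed around the list itself.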