Re: [PATCH] ocfs2: kill osb->system_file_mutex lock

From: Heming Zhao
Date: Mon Jun 23 2025 - 22:40:33 EST


On 6/24/25 10:17, Tetsuo Handa wrote:
On 2025/06/24 10:33, Heming Zhao wrote:
@@ -112,11 +110,10 @@ struct inode *ocfs2_get_system_file_inode(struct ocfs2_super *osb,
inode = _ocfs2_get_system_file_inode(osb, type, slot);

In my view, the point of commit 43b10a20372d is to avoid calling
_ocfs2_get_system_file_inode() twice, which leads to a refcnt+1 with no
place to do the matching refcnt-1.

My understanding is that concurrently calling _ocfs2_get_system_file_inode() itself
is OK, because the caller of ocfs2_get_system_file_inode() is responsible for calling
iput().

We have different perspectives on calling _ocfs2_get_system_file_inode().
In the current code logic, _ocfs2_get_system_file_inode() is expected to
be called only once. Subsequent local system inodes will be retrieved from
the cache (via get_local_system_inode()).


The problem commit 43b10a20372d fixed is that there was no mechanism to avoid
concurrently calling

*arr = igrab(inode);

which results in iput() never being called for the raced references when
ocfs2_release_system_inodes() is called.
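
The race described above can be replayed sequentially in a small userspace model. This is only a sketch and not ocfs2 code: `demo_inode`, `demo_igrab()`, `store_unguarded()` and `release_slot()` are hypothetical stand-ins, and the reference count here tracks only the references taken for the cache slot. Two callers that both find the slot empty each take a reference, but the release path drops only one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode: only a reference count. */
struct demo_inode {
	int refcnt;
};

/* Model of igrab(): take one extra reference and return the inode. */
static struct demo_inode *demo_igrab(struct demo_inode *inode)
{
	inode->refcnt++;
	return inode;
}

/* Model of the unguarded store: every caller takes a cache reference. */
static void store_unguarded(struct demo_inode **arr, struct demo_inode *inode)
{
	if (arr && inode)
		*arr = demo_igrab(inode);
}

/* Model of the release path: drops exactly one reference per slot. */
static void release_slot(struct demo_inode **arr)
{
	if (*arr) {
		assert((*arr)->refcnt > 0);
		(*arr)->refcnt--;
		*arr = NULL;
	}
}
```

Calling `store_unguarded()` twice on the same slot and then `release_slot()` once leaves one reference that nothing will ever drop.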


/* add one more if putting into array for first time */
- if (arr && inode) {
- *arr = igrab(inode);
- BUG_ON(!*arr);
+ if (inode && arr && !*arr && !cmpxchg(&(*arr), NULL, inode)) {

Bypassing the refcnt+1 here is not a good idea. We should do refcnt+1
before returning to the caller.

+ inode = igrab(inode);

We do refcnt+1 immediately after cmpxchg() succeeds, because
ocfs2_release_system_inodes(), which clears *arr, is the place that does
the matching refcnt-1.

+ BUG_ON(!inode);
}
- mutex_unlock(&osb->system_file_mutex);
return inode;
}
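
For readers following the hunk above, the publish-once idea can be modeled in userspace with C11 atomics. This is a minimal sketch under stated assumptions, not the kernel implementation: `fake_inode`, `fake_igrab()`, `fake_iput()`, `cache_once()` and `release_slot()` are hypothetical names, and `atomic_compare_exchange_strong()` stands in for the kernel's `cmpxchg()`. As in the proposed hunk, the cache reference is taken only by the caller that wins the compare-and-exchange, and only after the pointer has been published, which is the ordering under discussion in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode: only a reference count. */
struct fake_inode {
	atomic_int refcnt;
};

/* Model of igrab(): take one extra reference. */
static struct fake_inode *fake_igrab(struct fake_inode *inode)
{
	atomic_fetch_add(&inode->refcnt, 1);
	return inode;
}

/* Model of iput(): drop one reference. */
static void fake_iput(struct fake_inode *inode)
{
	int old = atomic_fetch_sub(&inode->refcnt, 1);

	assert(old > 0);
}

/*
 * Publish inode into the slot only if the slot is still empty.
 * Only the winner of the compare-and-exchange takes the cache reference.
 */
static struct fake_inode *cache_once(_Atomic(struct fake_inode *) *slot,
				     struct fake_inode *inode)
{
	struct fake_inode *expected = NULL;

	if (inode && atomic_compare_exchange_strong(slot, &expected, inode))
		fake_igrab(inode);
	return inode;
}

/* Model of the release path: clear the slot and drop the cache reference. */
static void release_slot(_Atomic(struct fake_inode *) *slot)
{
	struct fake_inode *inode = atomic_exchange(slot, NULL);

	if (inode)
		fake_iput(inode);
}
```

In this model, any number of `cache_once()` calls on the same slot add exactly one cache reference, which a single `release_slot()` then drops.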



In my view, your patch has logical errors, so I have to vote NAK.

In my view, for this syzbot bug, the better solution is to block/deny write
operations during the ocfs2 mounting phase.
There are many syzbot bugs related to writing data during the mounting phase.
I don't believe there is any reason a user would want to write data before the
filesystem is mounted.