Re: [PATCH] v9fs: handle async processing of F_SETLK with FL_SLEEP flag

From: Vasily Averin
Date: Fri Dec 24 2021 - 05:34:26 EST


On 24.12.2021 10:31, Dominique Martinet wrote:
> Vasily Averin wrote on Fri, Dec 24, 2021 at 10:08:57AM +0300:
>> To answer your question: it's ok to ignore the FL_SLEEP flag for the F_SETLK command,
>
> On the other hand, just clearing the FL_SLEEP flag like you've done for
> 9p will make the server think the lock has been queued when it hasn't
> really been.
> That means the client lock request will hang forever and never be
> granted even when the lock becomes available later on, so unless I
> misunderstood something here I don't think that's a reasonable fallback.

I did not understand this statement. Could you please explain it in more detail?

Right now nfsd/lockd/ksmbd drop FL_SLEEP on their own side, and this looks acceptable for them:
instead of a blocking lock they submit a non-blocking SETLK, which is enough to avoid a server deadlock.

If the lock is already taken, SETLK just returns an error and does not wait.
I agree it isn't ideal, and it may cause the server to return an unexpected errno,
but I do not see how it can make the server think the lock has been queued.
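
For illustration only, the same "fail fast instead of waiting" behaviour can be
seen from userspace with fcntl(2). This is a minimal sketch, not the
nfsd/lockd/ksmbd or 9p kernel path: the helper name setlk_conflict_fails_fast()
and the temporary file path are made up for the example. The parent takes a
write lock, then a child process requests a conflicting lock with F_SETLK and
should get EAGAIN or EACCES immediately.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper for this example.
 * Returns  1 if the conflicting non-blocking F_SETLK failed at once
 *            with EAGAIN or EACCES,
 *          0 if it succeeded or failed with some other errno,
 *         -1 on a setup error.
 */
int setlk_conflict_fails_fast(void)
{
	char path[] = "/tmp/setlk-demo-XXXXXX";	/* assumed scratch location */
	struct flock fl = {
		.l_type   = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 0,			/* 0 means "to end of file" */
	};
	int fd, status;
	pid_t pid;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	/* Parent takes the write lock; nothing conflicts yet. */
	if (fcntl(fd, F_SETLK, &fl) < 0)
		goto fail;

	pid = fork();
	if (pid < 0)
		goto fail;

	if (pid == 0) {
		/*
		 * Child: POSIX record locks are owned per process and are
		 * not inherited across fork(), so this request conflicts
		 * with the parent's lock. F_SETLK must not sleep.
		 */
		struct flock req = fl;
		int rc = fcntl(fd, F_SETLK, &req);

		_exit(rc < 0 && (errno == EAGAIN || errno == EACCES) ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0)
		goto fail;

	close(fd);
	unlink(path);
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;

fail:
	close(fd);
	unlink(path);
	return -1;
}
```

Calling setlk_conflict_fails_fast() from a small main() should return 1 on a
local filesystem with working POSIX locks. With F_SETLKW in the child instead,
the request would sleep until the parent released the lock.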

>> It would be even better to use posix_lock_file() instead of locks_lock_file_wait(),
>> but I cannot do it without your assistance.
>
> let's try to fix this properly instead, I'm happy to help.
>
> Basically 9p does things in two steps:
> - first it tries to get the lock locally at the vfs level.
> I'm not familiar with all the locking helpers we have at disposal, but
> as long as the distinction between flock and posix locks is kept I'm
> happy with anything here.
>
> If that process is made asynchronous, we need a way to run more
> 9p-specific code in that one's lm_grant callback, so we can proceed onto
> the second step which is...
>
> - send the lock request to the 9p server and wait for its reply
> (note that the current code is always synchronous here: even if you
> request SETLK without the SLEEP flag you can be made to wait here.
> I have work in the closet to make some requests asynchronous, so
> locking could be made asynchronous when that lands, but my code
> introduced a race somewhere I haven't had the time to fix so this
> improvement will come later)
>
> What would you suggest with that?

It seems we can just replace the locks_lock_file_wait() call with posix_lock_file()
in the described scenario. I'll send a v2 patch soon.

Thank you,
Vasily Averin