Re: NFS locking and statd

Olaf Kirch (
Mon, 15 Dec 1997 11:18:34 +0100

Hi Bill,

The basic idea of statd is to implement a notification service that tells
the kernel when the NFS peer reboots. This enables the server to discard
all locks held by the client (if the client reboots), and the client to
re-establish all locks with the server (if the server reboots).

The protocol underlying statd is NSM, the Network Status Monitor
protocol. It would have been more appropriately named notification
protocol, because in fact all it does is send around little messages
saying 'I have just booted'. In order to avoid being fooled by
retransmissions of these messages, NSM includes a state number that
is incremented by 2 each time the host reboots.
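
A minimal sketch, in Python, of how a state number lets the receiver
tell a retransmission from a real reboot. The names PeerStateTracker
and on_notify are made up for illustration and belong to no actual
statd; the only thing taken from the description above is that each
reboot raises the state number by 2.

```python
class PeerStateTracker:
    """Remembers the last NSM state number seen from each peer (hypothetical helper)."""

    def __init__(self):
        self.last_state = {}  # host name -> last state number seen

    def on_notify(self, host, state):
        """Return True if this notification announces a reboot we have not seen yet."""
        previous = self.last_state.get(host)
        if previous is not None and state <= previous:
            # Same or older state number: a retransmission, ignore it.
            return False
        self.last_state[host] = state
        return True


tracker = PeerStateTracker()
first = tracker.on_notify("server", 5)       # first 'I have just booted' message
duplicate = tracker.on_notify("server", 5)   # retransmission of the same message
rebooted = tracker.on_notify("server", 7)    # state went up by 2: a real reboot
```

Only the first and the third call are treated as reboots; the repeated
message with state 5 is dropped.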

It works like this:

* When NFS client C tries to lock a file on server S, it first
arranges for statd to 'monitor' the server host. Statd will
add the server's address to its 'notification list' and commit
it to stable storage. This is done via the SM_MON call, which
also passes along some callback information (see below).
The callback information consists basically of the client's address
and NSM state, lockd's RPC program number, and the RPC procedure
number of a special callback routine (non-standard, of course).
Statd will keep this information in memory.

* The lockd client then sends the lock request to the server.
The server lockd in turn will arrange for its statd to monitor
the client host. Again, this is done via SM_MON and includes
callback information.
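
The two steps above split statd's bookkeeping in two: the peer's name
goes to stable storage, the callback information stays in memory. A
rough Python sketch of that split follows. The class layout, the field
names and the use of one empty file per monitored host are assumptions
made for illustration, not a description of any real statd. 100021 is
the registered RPC program number for NLM; the procedure number 99 is
invented.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class Callback:
    """Callback info handed to statd with SM_MON (field names are illustrative)."""
    my_name: str      # address of the host asking for monitoring
    my_state: int     # NSM state of that host
    program: int      # lockd's RPC program number
    procedure: int    # procedure number of lockd's callback routine


class Statd:
    def __init__(self, state_dir):
        self.state_dir = state_dir   # stands in for stable storage
        self.callbacks = {}          # monitored host -> Callback, memory only

    def sm_mon(self, peer, callback):
        # Commit the peer's name to disk so it survives our own reboot.
        path = os.path.join(self.state_dir, peer)
        with open(path, "w") as f:
            f.flush()
            os.fsync(f.fileno())
        # The callback info is only kept in memory.
        self.callbacks[peer] = callback

    def notification_list(self):
        return sorted(os.listdir(self.state_dir))


state_dir = tempfile.mkdtemp()
client_statd = Statd(state_dir)
# The procedure number 99 is a made-up value for this sketch.
client_statd.sm_mon("server", Callback("client", 3, 100021, 99))

# A fresh instance on the same directory models statd after a reboot.
restarted = Statd(state_dir)
```

After the simulated restart the notification list still names the
server, while the callback table is empty again.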

Case A: Client reboots.

* rpc.statd starts up. It will walk through its notification list
and send a message to each host's statd telling the host that
it just rebooted. The list is cleared afterwards.

* When receiving the notification, the server's statd will use the
callback information to call lockd. Lockd in turn will figure out
which client rebooted and discard all locks it held.
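
Case A can be played through with a small in-process simulation. This
is only a sketch: the "network" is a Python dict, the notification list
is a plain set instead of stable storage, and all class and method
names are invented for the example.

```python
class Lockd:
    """Server-side lock table: file -> client holding the lock (illustrative)."""

    def __init__(self):
        self.locks = {}

    def lock(self, client, filename):
        if filename in self.locks:
            return False
        self.locks[filename] = client
        return True

    def peer_rebooted(self, client):
        # Callback invoked by statd: drop every lock the rebooted client held.
        self.locks = {f: c for f, c in self.locks.items() if c != client}


class Statd:
    def __init__(self, name, lockd=None):
        self.name = name
        self.lockd = lockd
        self.notify_list = set()   # would live on stable storage in real life
        self.monitored = set()     # peers our lockd asked us to watch

    def sm_mon(self, peer):
        self.notify_list.add(peer)
        self.monitored.add(peer)

    def startup_after_reboot(self, network):
        # Tell every host on the list that we rebooted, then clear the list.
        for peer in sorted(self.notify_list):
            network[peer].sm_notify(self.name)
        self.notify_list.clear()

    def sm_notify(self, rebooted_host):
        if rebooted_host in self.monitored and self.lockd is not None:
            self.lockd.peer_rebooted(rebooted_host)


server_lockd = Lockd()
server_statd = Statd("server", server_lockd)
client_statd = Statd("client")
network = {"server": server_statd, "client": client_statd}

client_statd.sm_mon("server")      # client monitors the server before locking
server_statd.sm_mon("client")      # server monitors the client on the first lock
server_lockd.lock("client", "/export/a")
server_lockd.lock("other", "/export/b")

# The client reboots: memory is lost, the notification list survives.
rebooted = Statd("client")
rebooted.notify_list = set(client_statd.notify_list)
rebooted.startup_after_reboot(network)
```

Afterwards the server holds only the lock owned by "other"; the lock on
/export/a is gone.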

Case B: Server reboots.

* Server rpc.lockd starts up. It will enter a so-called grace
period during which it will allow clients to re-establish
previously held locks, but no new locks.

* Server rpc.statd starts up. Just like above, it will walk the
notification list and send an SM_NOTIFY message to each listed
host, including the NFS client. The list is cleared afterwards.

* The client statd receives the notification that the server
rebooted, and uses the callback information provided by lockd
earlier to notify lockd that the server rebooted.

* The client lockd will go through its list of all locks and figure
out which ones it held on the server that has just come back up.
For each one, it will send a new LOCK request to the server,
setting the `reclaim' flag that indicates that this is not a
new lock, but an attempt to re-establish an old one.

* Hopefully everything's gone well so far, and all clients were
able to reclaim all their locks. After about 2 minutes, the
server lockd will end the grace period, and start processing
regular lock requests again.
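
The grace period and the reclaim flag of Case B can be sketched like
this. Time is passed in explicitly so the example needs no real clock.
The class names and the result strings are invented for the example,
and refusing a reclaim that arrives after the grace period is a choice
made for this sketch, not something stated above.

```python
GRACE_PERIOD = 120.0  # seconds; "about 2 minutes" per the description above


class ServerLockd:
    def __init__(self, boot_time):
        self.grace_ends = boot_time + GRACE_PERIOD
        self.locks = {}  # filename -> client

    def lock(self, client, filename, reclaim, now):
        in_grace = now < self.grace_ends
        if in_grace and not reclaim:
            return "denied_grace_period"   # new locks must wait
        if not in_grace and reclaim:
            return "denied"                # too late to reclaim (sketch's choice)
        if self.locks.get(filename, client) != client:
            return "denied"                # somebody else holds the lock
        self.locks[filename] = client
        return "granted"


class ClientLockd:
    def __init__(self, name):
        self.name = name
        self.held = []  # (server name, filename) for every lock we hold

    def server_rebooted(self, server_name, server, now):
        # Re-send each lock held on that server with the reclaim flag set.
        return [server.lock(self.name, f, reclaim=True, now=now)
                for (s, f) in self.held if s == server_name]


client = ClientLockd("client")
client.held = [("server", "/export/a"), ("elsewhere", "/export/z")]
server = ServerLockd(boot_time=0.0)

new_during_grace = server.lock("other", "/export/a", reclaim=False, now=10.0)
reclaimed = client.server_rebooted("server", server, now=20.0)
new_after_grace = server.lock("other", "/export/b", reclaim=False, now=130.0)
```

The new request at t=10 is turned away, the reclaim at t=20 succeeds,
and regular locking resumes once the 120 seconds are over.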

This is just the rough outline. At last year's Linux Expo, Jeff gave
a much more detailed talk; I think you can find his paper somewhere

This is also the traditional way of lockd/statd interaction. For a
while, I've been wondering whether it wouldn't be easier to integrate
the establishment of statd information into the mount procedure.

* When NFS-mounting a volume, the mount command arranges for
statd to monitor the server host, registering the appropriate
lockd procedure in the callback info. The mount data passed
into the kernel includes the NSM status (needed for lockd calls).

* When the server rpc.mountd receives a mount request, it arranges
for statd to monitor the NFS client.

This would get rid of the ugly kernel RPC upcalls.
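
A rough Python sketch of the mount-time variant, to make the data flow
concrete. Everything here is hypothetical: the function names, the
shape of the mount data, and the assumption that the monitor call hands
back the local NSM state.

```python
class Statd:
    def __init__(self, state):
        self.state = state        # our own NSM state number
        self.monitored = {}       # peer -> callback info

    def sm_mon(self, peer, callback):
        self.monitored[peer] = callback
        return self.state         # assumed: caller learns the local NSM state


def nfs_mount(server, export, statd, lockd_callback):
    """Hypothetical mount helper: register with statd from user space, then
    pass the NSM state to the kernel as part of the mount data."""
    nsm_state = statd.sm_mon(server, lockd_callback)
    return {"server": server, "export": export, "nsm_state": nsm_state}


def mountd_handle_mount(client, statd, lockd_callback):
    """Hypothetical server side: rpc.mountd registers the client with statd."""
    statd.sm_mon(client, lockd_callback)


client_statd = Statd(state=3)
server_statd = Statd(state=11)

mount_data = nfs_mount("server", "/export", client_statd, "client-lockd-callback")
mountd_handle_mount("client", server_statd, "server-lockd-callback")
```

Both statds end up monitoring their peer before the first lock request
is ever made, and no call from the kernel to statd is involved.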

> Also, any chance you'll have some time to update the linux-nfs package
> in the near future? Knfsd is now working reasonably well, and it would
> be nice to get the support package ready in time for 2.2. (And maybe
> Jeff Uphoff could be persuaded to contribute a man page for statd?)

I'll see what I can do. I had also planned to release some of my
NFSv3 patches before Christmas....

As for a statd manpage, you should ask Jeff himself.


Olaf Kirch         |  --- o --- Nous sommes du soleil we love when we play
                   |    / | \   sol.dhoop.naytheet.ah
+-------------------- Why Not?! -----------------------