Re: sk->data_ready at socket release

David S. Miller (davem@jenolan.rutgers.edu)
Tue, 16 Sep 1997 11:26:07 -0400


Date: Tue, 16 Sep 1997 10:55:28 -0400
From: Bill Hawes <whawes@star.net>

On the subject of socket release problems, while debugging smbfs I
had a very repeatable oops that looked like a sock structure being
released while someone was in the sk->data_ready callback. The
oops occurred right after closing a socket filehandle.

I'm not familiar with the data_ready callback code, so my question
is whether the socket and sock release code is supposed to handle
the case of waiting for a caller in data_ready to return.

If it _is_ supposed to handle this, I can try to revert some code
to make a repeatable test case. (The problem went away when I
fixed a different bug.)

It is, but this sounds like a bug in the smbfs stuff actually.

Are you passing references to sockets around to various threads
without bumping reference counts to the descriptors?

When the socket layer sees the real close() code path, it assumes (as
it should) that the last holder of a ref count is closing the
descriptor. The only thing that can delay the full release is pending
data in the sock which hasn't made it to the wire yet; that data will
make the close()'r sleep until the packets go out or things time out
(or the socket gets reset in the case of TCP, since then you can't
unload the send buffers and just have to free them all).

So what it sounds like is that you aren't keeping the descriptor ref
counts straight in the sockets that smbfs is using. I.e. you just pass
a reference, that thread waits on data, someone else does a close, and
as far as the VFS is concerned this is the final close, since the
sleeper's reference was never counted on the fd.
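
A minimal user-space sketch of that accounting, in C, might look like
the following. The names demo_file, demo_open, demo_get and demo_put
are hypothetical and are not kernel interfaces; the sketch is
single-threaded and only models the counting, under the assumption
that every holder of the object takes its own reference before using
it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a descriptor-backed object (not a kernel struct). */
struct demo_file {
    int  refcount;  /* number of current holders */
    bool released;  /* set when the final put runs the release path */
};

/* The opener holds the first reference. */
static struct demo_file *demo_open(void)
{
    struct demo_file *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->refcount = 1;
    f->released = false;
    return f;
}

/* Each additional holder (e.g. a thread that will wait on data) takes a ref. */
static void demo_get(struct demo_file *f)
{
    assert(f->refcount > 0);
    f->refcount++;
}

/*
 * Drop one reference. Only the final put marks the object released;
 * returns true if this call was that final put. The memory itself is
 * left for the caller to free so the flag can still be inspected.
 */
static bool demo_put(struct demo_file *f)
{
    assert(f->refcount > 0);
    if (--f->refcount == 0) {
        f->released = true;
        return true;
    }
    return false;
}
```

With this accounting, a close by one holder while another has called
demo_get() is not the final put, so the release path does not run
under the sleeper. If the second holder skips demo_get(), the first
demo_put() is treated as the final close, which is the situation
described above.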

Later,
David "Sparc" Miller
davem@caip.rutgers.edu