Re: UDP recvmsg blocks after select(), 2.6 bug?

From: H. Peter Anvin
Date: Thu Oct 21 2004 - 01:07:44 EST


Chris Friesen wrote:
H. Peter Anvin wrote:

H. Peter Anvin wrote:

The whole point is that it doesn't break the *documented* interface.


I'm talking about returning -1, EIO.



Ah. By "it", I thought you meant the current performance optimizations, not the EIO. My apologies.

I think returning EIO is suboptimal, as it is not an expected error value for recvmsg(). (It's not listed in the man pages for recvmsg() or ip.) It would certainly work for new apps, but probably not for many existing binaries.

POSIX specifies:

The recvmsg( ) function shall fail if:

[EAGAIN] or [EWOULDBLOCK] The socket's file descriptor is marked O_NONBLOCK and no data is waiting to be received; or MSG_OOB is set and no out-of-band data is available and either the socket's file descriptor is marked O_NONBLOCK or the socket does not support blocking to await out-of-band data.

[EBADF] The socket argument is not a valid open file descriptor.

[ECONNRESET] A connection was forcibly closed by a peer.

[EINTR] This function was interrupted by a signal before any data was available.

[EINVAL] The sum of the iov_len values overflows a ssize_t, or the MSG_OOB flag is set and no out-of-band data is available.

[EMSGSIZE] The msg_iovlen member of the msghdr structure pointed to by message is less than or equal to 0, or is greater than {IOV_MAX}.

[ENOTCONN] A receive is attempted on a connection-mode socket that is not connected.

[ENOTSOCK] The socket argument does not refer to a socket.

[EOPNOTSUPP] The specified flags are not supported for this socket type.

[ETIMEDOUT] The connection timed out during connection establishment, or due to a transmission timeout on active connection.

The recvmsg( ) function may fail if:

[EIO] An I/O error occurred while reading from or writing to the file system.

[ENOBUFS] Insufficient resources were available in the system to perform the operation.

[ENOMEM] Insufficient memory was available to fulfill the request.

Since you didn't code to Linux, and didn't code to POSIX... what did you code to?

On the other hand, if you simply do the checksum verification at select() time for blocking sockets, then the existing binaries get exactly the behaviour they expect.


... unless the blocking changes. In which case you either have to do work twice, or it might *never* happen. Not to mention the extra code complexity.

The performance overhead of checksumming is substantial; I have seen some real horror examples of what happens when you do it badly.
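
For illustration, here is a minimal sketch of the application-side pattern that stays within the documented interface: mark the socket O_NONBLOCK, and treat EAGAIN/EWOULDBLOCK from recvmsg() after select() as "nothing there after all, wait again". The helper names set_nonblocking() and recv_datagram() are hypothetical, made up for this example, and this is only one way to write it, not a statement of what any particular application does.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: mark fd O_NONBLOCK. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: wait for one datagram on a non-blocking socket.
 * Returns the datagram length, or -1 with errno set on a real error.
 * A readable indication from select() is treated as a hint only: if
 * recvmsg() then reports EAGAIN/EWOULDBLOCK, we go back to select().
 */
static ssize_t recv_datagram(int fd, void *buf, size_t len)
{
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        if (select(fd + 1, &rfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }

        struct iovec iov;
        iov.iov_base = buf;
        iov.iov_len = len;

        struct msghdr msg;
        memset(&msg, 0, sizeof msg);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        ssize_t n = recvmsg(fd, &msg, 0);
        if (n >= 0)
            return n;
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            continue;           /* no datagram after all: wait again */
        return -1;              /* any other errno is a genuine failure */
    }
}
```

With the socket marked O_NONBLOCK, a datagram that gets discarded between select() and recvmsg() shows up as EAGAIN, which is on the "shall fail" list quoted above, rather than as an indefinite block.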

-hpa