Re: UDP recvmsg blocks after select(), 2.6 bug?

From: Martijn Sipkema
Date: Sun Oct 17 2004 - 11:49:45 EST


From: "Buddy Lucas" <buddy.lucas@xxxxxxxxx>
> On Sun, 17 Oct 2004 11:05:09 -0400, Mark Mielke <mark@xxxxxxxxxxxxxx> wrote:
> > On Sun, Oct 17, 2004 at 04:17:06PM +0200, Buddy Lucas wrote:
> > > On Sun, 17 Oct 2004 15:35:37 +0200, Lars Marowsky-Bree <lmb@xxxxxxx> wrote:
> > > > The SuV spec is actually quite detailed about the options here:
> > > > A descriptor shall be considered ready for reading when a call
> > > > to an input function with O_NONBLOCK clear would not block,
> > > > whether or not the function would transfer data successfully.
> > > > (The function might return data, an end-of-file indication, or
> > > > an error other than one indicating that it is blocked, and in
> > > > each of these cases the descriptor shall be considered ready for
> > > > reading.)
> > > But it says nowhere that the select()/recvmsg() operation is atomic, right?
> >
> > This is a distraction. If the call to select() had been substituted
> > with a call to recvmsg(), it would have blocked. Instead, select() is
> > returning 'yes, you can read', and then recvmsg() is blocking. The
> > select() lied. The information is all sitting in the kernel packet
>
> No. A million things might happen between select() and recvmsg(), both
> in kernel and application. For a consistent behaviour throughout all
> possibilities, you *have* to assume that any read on a blocking fd may
> block, and that a fd ready for reading at select() time might not be
> readable once the app gets to recvmsg() -- for whatever reason.

It is perfectly possible to not have a million things happen between
select() and recvmsg() and POSIX defines what can happen and what
can't; it states that a process calling select() on a socket will not block
on a subsequent recvmsg() on that socket.

> And indeed, that implies that select() on blocking fds is generally
> not useful if you expect to bypass the blocking through select().
> Personally, I think any application that implements this expectation
> is broken. (If only because you might have to do a second read() or
> recvmsg() which will either result in a crappy select() loop or a
> broken read()/recvmsg() loop).

The way select() is defined in POSIX effectively means that once an
application has done a select() on a socket, the data that caused
select() to return is committed, i.e. it can no longer be dropped and
should be considered received by the application; this has nothing
to do with UDP being unreliable and being unreliable for the sake
of it is not what UDP was meant for.

Whether you think an application that is written to use select() as
defined in POSIX is broken is not really important. The fact remains
that Linux currently implements a select() that is _not_ POSIX
compliant and is so solely for performance reasons. I personally think
correct behaviour is much more important.


--ms

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/