Re: Why add the general notification queue and its sources

From: Steven Whitehouse
Date: Fri Sep 06 2019 - 12:12:26 EST


Hi,

On 06/09/2019 16:53, Linus Torvalds wrote:
On Fri, Sep 6, 2019 at 8:35 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
This is why I like pipes. You can use them today. They are simple, and
extensible, and you don't need to come up with a new subsystem and
some untested ad-hoc thing that nobody has actually used.
The only _real_ complexity is to make sure that events are reliably parseable.

That's where you really want to use the Linux-only "packet pipe"
thing, because otherwise you have to have size markers or other things
to delineate events. But if you do that, then it really becomes
trivial.

And I checked, we made it available to user space, even if the
original reason for that code was kernel-only autofs use: you just
need to make the pipe be O_DIRECT.

This overly stupid program shows off the feature:

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int fd[2];
	char buf[10];

	pipe2(fd, O_DIRECT | O_NONBLOCK);
	write(fd[1], "hello", 5);
	write(fd[1], "hi", 2);
	read(fd[0], buf, sizeof(buf));
	read(fd[0], buf, sizeof(buf));
	return 0;
}

and if you strace it (because I was too lazy to add error handling or
printing of results), you'll see

write(4, "hello", 5) = 5
write(4, "hi", 2) = 2
read(3, "hello", 10) = 5
read(3, "hi", 10) = 2

note how you got packets of data on the reader side, instead of
getting the traditional "just buffer it as a stream".

So now you can even have multiple readers of the same event pipe, and
packetization is obvious and trivial. Of course, I'm not sure why
you'd want to have multiple readers, and you'd lose _ordering_, but if
all events are independent, this _might_ be a useful thing in a
threaded environment. Maybe.

(Side note: a zero-sized write will not cause a zero-sized packet. It
will just be dropped).
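
For anyone who wants to see the same behaviour without strace, here is a rough sketch of the program above with error checks and a printf added. The function name packet_pipe_demo() and its sizes[]/max parameters are made up for illustration; the only kernel interface it relies on is the pipe2(O_DIRECT | O_NONBLOCK) call already shown, so it is Linux-only. It also does a zero-sized write between the two events to show that no empty packet comes out the other end.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write "hello", an empty write, and "hi" into a packet pipe, then read
 * until the pipe is drained.  The size of each packet read is stored in
 * sizes[]; the return value is the number of packets seen, or -1 if the
 * pipe could not be set up or written.
 */
int packet_pipe_demo(ssize_t *sizes, int max)
{
	int fd[2];
	char buf[16];
	int n = 0;

	if (pipe2(fd, O_DIRECT | O_NONBLOCK) < 0)
		return -1;

	if (write(fd[1], "hello", 5) != 5 ||
	    write(fd[1], "", 0) != 0 ||	/* zero-sized write: dropped */
	    write(fd[1], "hi", 2) != 2) {
		close(fd[0]);
		close(fd[1]);
		return -1;
	}

	while (n < max) {
		ssize_t len = read(fd[0], buf, sizeof(buf));

		if (len < 0)	/* EAGAIN: nothing left in the pipe */
			break;
		printf("packet %d: %zd bytes \"%.*s\"\n",
		       n, len, (int)len, buf);
		sizes[n++] = len;
	}

	close(fd[0]);
	close(fd[1]);
	return n;
}
```

Calling packet_pipe_demo(sizes, 4) from a main() should report two packets, of 5 and 2 bytes, even though the read buffer is big enough to hold both.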

Linus

The events are generally not independent, so we would need ordering, either implicit in the protocol or explicit in the messages. We also need to know when messages have been dropped. It doesn't need to be anything fancy, just some indication that messages have been lost since the last read, most likely due to buffer overrun.
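
To make that requirement concrete, here is a minimal userspace sketch of "explicit in the messages": every event carries a sequence number, the sender consumes a number even when the pipe is full and the event is dropped, and the reader reports any gap. None of this is from the notification queue patches. struct event, event_pipe_open(), event_post() and event_next() are hypothetical names, the fixed 32-byte layout is an assumption, and in a real implementation the sending side would be in the kernel.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical event layout: sequence number plus a small payload. */
struct event {
	uint64_t seq;
	char payload[24];
};

struct event_writer { int fd; uint64_t next_seq; };
struct event_reader { int fd; uint64_t expected; };

/* Create a non-blocking packet pipe and initialise both ends. */
int event_pipe_open(struct event_writer *w, struct event_reader *r)
{
	int fd[2];

	if (pipe2(fd, O_DIRECT | O_NONBLOCK) < 0)
		return -1;
	r->fd = fd[0];
	r->expected = 0;
	w->fd = fd[1];
	w->next_seq = 0;
	return 0;
}

/*
 * Post one event.  Returns 0 if it was queued, -1 if it was dropped
 * (for example because the pipe is full).  The sequence number is used
 * up either way, which is what lets the reader notice the loss.
 */
int event_post(struct event_writer *w, const char *msg)
{
	struct event ev = { .seq = w->next_seq++ };

	strncpy(ev.payload, msg, sizeof(ev.payload) - 1);
	if (write(w->fd, &ev, sizeof(ev)) != (ssize_t)sizeof(ev))
		return -1;
	return 0;
}

/*
 * Read one event.  Returns 1 if an event was read, 0 if none is pending,
 * -1 on error.  *lost is set to the number of events that went missing
 * between the previously read event and this one.
 */
int event_next(struct event_reader *r, struct event *ev, uint64_t *lost)
{
	ssize_t len = read(r->fd, ev, sizeof(*ev));

	if (len < 0)
		return errno == EAGAIN ? 0 : -1;
	if (len != (ssize_t)sizeof(*ev))
		return -1;
	*lost = ev->seq - r->expected;
	r->expected = ev->seq + 1;
	return 1;
}
```

The obvious weakness is that the reader only learns about a loss when the next event arrives, and this only works with a single reader, since multiple readers would each see gaps that are really just events delivered to someone else.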

That is why the initial idea was to use netlink, since it solves a lot of those issues. The downside was that the indirect nature of netlink sockets made it tricky to know the namespace of the process to which the message was to be delivered (and hence whether it should be delivered at all).

Steve.