Re: [take23 0/5] kevent: Generic event handling mechanism.

From: Eric Dumazet
Date: Wed Nov 08 2006 - 18:07:52 EST


Davide Libenzi a écrit :
On Wed, 8 Nov 2006, Andrew Morton wrote:

On Wed, 8 Nov 2006 15:51:13 +0100
Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:

[PATCH] eventpoll : In case a fault occurs during copy_to_user(), we should report the count of events that were successfully copied into user space, instead of EFAULT. That would be consistent with behavior of read/write() syscalls for example.

Signed-off-by: Eric Dumazet <dada1@xxxxxxxxxxxxx>



[eventpoll.patch text/plain (424B)]
--- linux/fs/eventpoll.c 2006-11-08 15:37:36.000000000 +0100
+++ linux/fs/eventpoll.c 2006-11-08 15:38:31.000000000 +0100
@@ -1447,7 +1447,7 @@
&events[eventcnt].events) ||
__put_user(epi->event.data,
&events[eventcnt].data))
- return -EFAULT;
+ return eventcnt ? eventcnt : -EFAULT;
if (epi->event.events & EPOLLONESHOT)
epi->event.events &= EP_PRIVATE_BITS;
eventcnt++;

Definitely a better interface, but I wonder if it's too late to change it.

An app which does

if (epoll_wait(...) == -1)
barf(errno);
else
assume_all_events_were_received();

will now do the wrong thing.

otoh, such an applciation basically _has_ to use the epoll_wait()
return value to work out how many events it received, so maybe it's OK...

I don't care about both ways, but sys_poll() does the same thing epoll does right now, so I would not change epoll behaviour.


Sure poll() cannot return a partial count, since its return value is :

On success, a positive number is returned, where the number returned is
the number of structures which have non-zero revents fields (in other
words, those descriptors with events or errors reported).

poll() is non destructive (it doesnt change any state into kernel). Returning EFAULT in case of an error in the very last bit of user area is mandatory.

On the contrary :

epoll_wait() does return a count of transfered events, and update some state in kernel (it consume Edge Trigered events : They can be lost forever if not reported to user)

So epoll_wait() is much more like read(), that also updates file state in kernel (current file position)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/