Re: Serious bug in recent Linux kernels

Guest section DW (dwguest@win.tue.nl)
Sun, 10 Jan 1999 08:27:57 +0100 (MET)


From owner-linux-kernel-outgoing@vger.rutgers.edu Sat Jan 9 10:32:53 1999

There is a serious bug in recent kernels with the timeout of the poll()
system call. The calculation to turn the timeout value from milliseconds
into whatever schedule_timeout() expects is wrong. The following happened
when I read my mail with mutt. I use a development glibc (2.1.106) which
uses poll() when it is available in the kernel. As soon as I opened read a
mail message my screen got spammed with hundreds of lines saying:

schedule_timeout: wrong timeout value f3333335 from 00024c0e

I looked up where address 00024c0e was, it turned out to be do_poll() in
fs/select.c
I decided to strace mutt, to see what it did, and it did a series of poll()
syscalls like this:

poll([{fd=0, events=POLLIN}], 1, 2147483647) = 0

That value is INT_MAX, and should be valid.

Hmm - yesterday I complained that you should have been more precise
with this `recent kernels', but I happened to come along this code in
select.c for some other reason. It must not produce a negative timeout.
The change below makes sure of that.

--- ../../../linux-2.2.0pre6/linux/fs/select.c Sun Nov 22 19:08:50 1998
+++ select.c Sun Jan 10 08:15:04 1999
@@ -331,10 +331,10 @@
if (nfds > NR_OPEN)
goto out;

+ if (timeout > 0)
+ timeout = (timeout*HZ+999)/1000+1;
if (timeout < 0)
timeout = MAX_SCHEDULE_TIMEOUT;
- else if (timeout)
- timeout = (timeout*HZ+999)/1000+1;

err = -ENOMEM;
if (timeout) {

(pasted from another window, tabs lost).

[the expression (timeout*HZ+999)/1000+1 is also a bit peculiar,
but I have not changed it]

Andries

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/