Re: Fix for thread+network crashes in 2.0/2.1?

Linus Torvalds (torvalds@transmeta.com)
Fri, 27 Feb 1998 08:35:32 -0800 (PST)


On Fri, 27 Feb 1998, Joshua M. Thompson wrote:
>
> On Thu, 26 Feb 1998, Linus Torvalds wrote:
> > Anyway, I'm releasing a 2.1.89-3 on ftp.kernel.org under the "testing"
> > subdirectory, and I'd be very happy if people would test it. I don't
> > guarantee that this patch works at all - for all I know it might be
> > totally broken, and the only thing I guarantee is that (a) it compiles
> > with my particular setup and (b) it looks like it should work and makes
> > sense.
> >
> > Please do test, and comment,
>
> I just compiled and booted 2.1.89. Had to make a slight change in
> net/netlink/netlink_dev.c; there is a missing parameter to the poll()
> operation on line 45.

Ok, thanks, I missed that somehow (I'm not compiling netlink myself, but I
did grep for poll to try to find it in files I didn't use).

> Anyway...I am repeatedly running the crash program with it set to spawn
> 100 of each type of thread (for a total of 300), and except for the
> expected slowdown of the machine it seems to be rock solid. Previously I
> was always guaranteed a lockup (99% of the time) or an oops (1% of the
> time); now I don't see either problem.

Yah! Thanks for the report, this makes me feel extremely good about
myself.

> If I want to backport this patch to 2.0.34pre2, what parts of this patch
> will I need? I see there's a one-line change to select.c...is that the
> whole of the patch? :) Once I get it backported I will turn on the Typhoon
> news server again at the office and see what happens...

It will be awkward but not hard to back-port to 2.0.x. The awkwardness
comes mainly from the fact that the poll functions were still called
"select()" in 2.0.x, so you essentially have to do everything by hand.

The best way to see the changes is probably to cut the patches down a bit:
the easiest way to do that is to get 2.1.89-2 and 2.1.89-3 and do a diff
between the two. That will still have some other stuff too, but at least
you should get a lot less to look at.

The only real changes wrt select are:
 - linux/poll.h (it's in "linux/wait.h" in 2.0.x): add the "struct file"
   pointer to the poll_table_entry (select_table_entry in 2.0), and add
   the initialization and the f_count update to poll_wait() (which is in
   linux/sched.h and called "select_wait()" in 2.0).
 - add the "struct file" pointer to the VFS poll/select function in
   linux/fs.h, and update all callers.
 - add the "fput()" to "free_wait()" in fs/select.c
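
As a rough illustration of the three changes above, here is a userspace
sketch in plain C. It is not the actual patch and not kernel code: every
name ending in "_sketch" is a hypothetical stand-in, the fixed-size table
is an assumption made to keep the example small, and the idea that the
extra reference is there to keep the file alive while it sits on a wait
table is my reading of the description, not something the patch text is
quoted as saying.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's "struct file"; only the
 * reference count matters for this sketch. */
struct file_sketch {
    int f_count;
};

/* Stand-in for poll_table_entry (select_table_entry in 2.0).
 * The "filp" pointer is the newly added field. */
struct wait_entry_sketch {
    struct file_sketch *filp;
};

#define MAX_ENTRIES_SKETCH 8   /* assumption: tiny fixed-size table */

struct wait_table_sketch {
    unsigned int nr;
    struct wait_entry_sketch entry[MAX_ENTRIES_SKETCH];
};

/* Stand-in for fput(): drop one reference. */
static void fput_sketch(struct file_sketch *f)
{
    assert(f->f_count > 0);
    f->f_count--;
}

/* Stand-in for poll_wait() / select_wait(): remember which file the
 * entry belongs to and take a reference on it. Returns 1 if an entry
 * was added, 0 otherwise. */
static int poll_wait_sketch(struct file_sketch *filp,
                            struct wait_table_sketch *p)
{
    if (p == NULL || p->nr >= MAX_ENTRIES_SKETCH)
        return 0;
    p->entry[p->nr].filp = filp;   /* the new initialization */
    filp->f_count++;               /* the new f_count update */
    p->nr++;
    return 1;
}

/* Stand-in for free_wait(): tear down the entries and drop the
 * reference each one took in poll_wait_sketch(). */
static void free_wait_sketch(struct wait_table_sketch *p)
{
    while (p->nr > 0) {
        p->nr--;
        fput_sketch(p->entry[p->nr].filp);   /* the new fput() */
        p->entry[p->nr].filp = NULL;
    }
}
```

With this shape, a file that another thread lets go of while it is still
on a wait table keeps a nonzero count until free_wait_sketch() runs. The
real code also has to pass the "struct file" pointer down through the VFS
poll/select function, which this sketch leaves out.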

It should take maybe half an hour to do and test, but as I'm not touching
the 2.0 base any more I hope somebody else will do this.

Thanks for the report,

Linus
