Re: Netscape broken with 2.2.0-pre7

Jamie Lokier (lkd@tantalophile.demon.co.uk)
Sun, 17 Jan 1999 17:39:24 +0000


On Sun, Jan 17, 1999 at 05:09:51PM +0100, Andi Kleen wrote:
> > This is when Netscape stops working -- when the pipe fills up.
>
> > Netscape doesn't set the pipe to be non-blocking or anything. It could
> > use those SIGALRM calls to switch to the reading thread or whatever (who
> > knows what goes on in there), but it doesn't.
>
> At least the nspr 3.0 version I am looking at sets the pipe to non
> blocking (in pr/src/md/unix/unix.c:_MD_InitCPUS())

That looks like a good sign. Maybe I will have to upgrade to Mozilla to
read slashdot now ;-)
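
For anyone who wants to see the difference for themselves, here is a
minimal sketch of what a non-blocking pipe buys you. It is not taken
from NSPR or Netscape; the function name and the 512-byte "token" size
are made up for illustration. It writes to a pipe that nobody reads
until the kernel refuses more data. With O_NONBLOCK set the writer gets
EAGAIN back; without it, the same loop would hang at that point, which
is what the lockup looks like.

```python
import fcntl
import os


def fill_pipe_nonblocking(token=b"t" * 512):
    """Write to an unread pipe until it is full; return bytes accepted.

    `token` is a hypothetical payload; the real token size is unknown.
    """
    read_fd, write_fd = os.pipe()
    try:
        # Put the write end into non-blocking mode.
        flags = fcntl.fcntl(write_fd, fcntl.F_GETFL)
        fcntl.fcntl(write_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

        written = 0
        while True:
            try:
                written += os.write(write_fd, token)
            except BlockingIOError:
                # EAGAIN: the pipe is full. A blocking writer would
                # sleep here until somebody drained the read end.
                return written
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe accepted", fill_pipe_nonblocking(), "bytes before filling")
```

The number printed is the pipe capacity on the running kernel, so it
will differ between systems.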

> A binary search to find the exact version where it broke would be
> very helpful.

I've had enough variety in the reports that it doesn't look like there's
an exact version. I suspect it depends on the individual system timings.
I did look for changes in the pipe code between pre4 and pre7, and
found nothing.

> > The bug occurs much less often when I run netscape under `strace -tt -o
> > log', though it still occurs eventually. `strace -o log' is not so
> > effective. There is also some interaction with the X server, because if I
> > don't move the mouse (at all) or type anything after Netscape locks, it
> > recovers after about a second.
>
> Not surprising, strace changes signal timing.

It doesn't look like a signal thing so much as an abuse of a pipe. It's
pretty clear Netscape _could_ use an internal circular buffer instead of
an OS pipe. Then again, maybe that would just add to the many memory
leaks :-)

The microsecond timings I get from strace -tt show that the system is
getting in a lot of work between those 70ms ticks. It's not surprising
it is able to queue a pipe's worth of tokens.

For those investigating, there's another odd behavior. When Netscape
has locked up, send it a SIGTERM (the default signal for kill). Then the trace
changes completely. It is now waiting for 86400 seconds to pass, in a
select() waiting for no file descriptors. Repeated signals restart that
timeout. What an odd thing for Netscape to do.
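
To make the pattern concrete: select() with no file descriptors can
only return when its timeout expires or a signal interrupts it, so it
is really just a sleep. The sketch below shows that on a Unix system,
with a short timeout standing in for the 86400 seconds; the function
name is my own. A C caller that sees EINTR and simply calls select()
again with the full timeout would show the restart behaviour described
above, though that is a guess about what Netscape does, not something
the trace proves.

```python
import select
import time


def sleep_with_select(seconds):
    """Sleep by calling select() with no descriptors, only a timeout."""
    start = time.monotonic()
    # Three empty sets: nothing can ever become ready, so the call
    # returns only when the timeout runs out.
    readable, writable, exceptional = select.select([], [], [], seconds)
    elapsed = time.monotonic() - start
    return readable, writable, exceptional, elapsed


if __name__ == "__main__":
    print(sleep_with_select(0.5))
```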

There were changes to the timeout calculations in select() and poll()
recently. Do you think that could have anything to do with the problem?

(I would like to know why the 50ms itimer is ticking every 70ms, too).
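
If someone wants to check the timer independently of Netscape, a
rough measurement along these lines should do. It arms ITIMER_REAL
with a 50ms period and records the gaps between SIGALRM deliveries.
The function name and defaults are mine, and the timestamps are taken
in a user-space handler, so they include scheduling delay; treat the
numbers as approximate.

```python
import signal
import time


def measure_itimer(interval=0.05, ticks=5):
    """Arm ITIMER_REAL and return the observed gaps between ticks."""
    stamps = []

    def on_alarm(signum, frame):
        stamps.append(time.monotonic())

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        while len(stamps) < ticks:
            # Sleep until a signal arrives; the timer is periodic,
            # so a missed wakeup is followed by another tick.
            signal.pause()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0, 0)  # disarm the timer
        signal.signal(
            signal.SIGALRM,
            old_handler if old_handler is not None else signal.SIG_DFL,
        )
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for gap in measure_itimer():
        print("%.1f ms" % (gap * 1000.0))
```

This has to run in the main thread, since that is where Python
delivers signals.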

-- Jamie
