Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x
From: Kyle Moffett
Date: Thu Apr 13 2006 - 06:28:25 EST
On Apr 13, 2006, at 05:56:08, Andrew Morton wrote:
Dan Bonachea <bonachead@xxxxxxxxxxx> wrote:
This problem arose in the parallel runtime system for a scientific
language compiler (nearly a million lines of code total -
definitely a "real-world" program) - the example code is merely a
pared-down demonstration of the problem. In parallel scientific
computing, it's very common for many threads to be writing to
stdout (usually for monitoring purposes) and it's expected and
normal for output from separate threads to be arbitrarily
interleaved, but it's *not* ok for output to be lost entirely.
This is essentially equivalent to the real-world example you gave
of many threads logging to a file.
Interesting - afaik that's the first time this has been hit in a
real application.
I would guess that it could also be a problem with a wide variety of
Perl CGIs running under Apache2+mod_perl with the worker threading
model. In fact this may actually explain some odd behavior I got
from one such module that I "fixed" by switching to logging via
syslog. I don't remember which module it was or what the exact
problem was, but I think it seemed similar in nature. That is
somewhat of a surprising and nonintuitive failure mode for logging,
IMHO it would be nice to get fixed. I would imagine that this has
probably been hit a number of times before, mysteriously fixed by
changed userspace locking or subtle thread ordering, and written off
as the mysterious effects of cosmic rays on RAM.
Cheers,
Kyle Moffett
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/