Re: fsync on large files

Alan Curry (pacman-kernel@cqc.com)
Sun, 14 Feb 1999 01:13:02 -0500 (EST)


I wrote:
>> I know that, and I've already done that as a temporary solution. But syslogd
>> syncs every line written for a good reason, namely that if the machine
>> crashes you don't want to lose the last few lines that were logged. They are
>> the most likely place to look for suspicious happenings. So I want a proper
>> fix, not a "disable the safety feature" kludge.

Simon Kirby responded:
>Hmm...I have yet to find the fsync()ing useful...I agree that it
>could help log something helpful, but I don't see disabling fsync() as a
>kludge, though. Perhaps you should try remote logging and disabling

Someone wiser than me puts a safety feature into a critical daemon, and I
take it out because it's inconvenient? That's a kludge.

>fsync() on the remote machine? If the logging server goes down for
>whatever reason, you can enable fsync() again if you want to see if you
>can get a better trace of it (it could just be spat to the console
>anyhow), but if any other server goes down it should be logged there.

Here's the really weird part: we already have a secondary logging machine,
which is by all measurements less powerful than the main machine, but it
handles all the same log data, with the same syslogd configuration, with
hardly any cpu load at all.

So here's what I think the real problem is. On the main machine, there are
lots of things trying to talk to syslogd. But syslogd can't accept their
connections because it is constantly fsync'ing, so the other daemons block
trying to connect. (I have seen solid evidence of this much: ps lax showing
at least 40 processes with WCHAN=unix_connect.) Then when an fsync finishes,
syslogd accepts another connection, and 40 processes suddenly become
runnable, causing the scheduler to freak out and eat a bunch of cpu time that
does not show up as used by any particular process. On the other machine,
syslogd is the only process that ever runs, so there isn't a scheduling
problem. The fsync'ing is not a killer by itself.
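
To make the pattern concrete, here is a minimal sketch of "write one line, then fsync before doing anything else". It is in Python rather than syslogd's actual C, and the function names, the file name and the log text are all made up for illustration. The point is only that the process sits inside os.fsync() until the kernel says the data is on disk, and for that whole time it is not accepting connections from anyone.

```python
import os
import tempfile


def append_line(fd, line, sync=True):
    """Append one log line; optionally force it to disk before returning."""
    os.write(fd, line.encode("utf-8") + b"\n")
    if sync:
        # Blocks until the kernel reports the file's data has been flushed.
        # A single-threaded daemon services nothing else during this call.
        os.fsync(fd)


def demo(sync):
    """Write five hypothetical log lines and return what ended up in the file."""
    path = os.path.join(tempfile.mkdtemp(), "messages")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o640)
    try:
        for i in range(5):
            append_line(fd, f"daemon[{i}]: hypothetical log entry", sync=sync)
    finally:
        os.close(fd)
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()
```

Calling demo(True) and demo(False) produces the same file contents; the only difference is how long each append_line() call can stall, and that is what a crash-safety versus throughput tradeoff comes down to here.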

I think I'll just start trusting the log server to do its job, and leave the
syncing turned off on the main box. For anyone who doesn't have a dedicated log
server, I guess the lesson is: rotate your logs before they get too big.
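
A rough sketch of size-based rotation, again in Python and with a made-up helper name, threshold and ".0" suffix (not how any particular rotation tool does it): once the file reaches the limit, rename it out of the way so the next log file starts small.

```python
import os


def rotate_if_large(path, max_bytes):
    """Rename path to path + '.0' once it reaches max_bytes.

    Returns True if a rotation happened, False otherwise.
    """
    try:
        size = os.stat(path).st_size
    except FileNotFoundError:
        return False
    if size < max_bytes:
        return False
    # Replaces any earlier path + '.0' left over from a previous rotation.
    os.replace(path, path + ".0")
    return True
```

The daemon still holds the old file open after the rename, so it has to be told to reopen its log files afterwards (syslogd does this on SIGHUP).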

-- 
Alan Curry

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/