SMP/fsck/softdog

Andrew Daviel (andrew@daviel.org)
Sun, 13 Jun 1999 03:50:30 -0700 (PDT)


I have an ASUS m/b with dual Celerons (currently set to design speed
66/366) with RedHat 6.0, 2.2.5 kernel with softdog configured.

Q1. Is softdog SMP-safe ?

I have had a couple of unexplained hangs, so that on rebooting fsck runs.
I confess I don't entirely understand how SMP handles boot, running shell
scripts etc. It seems to me that both processors are running and their
messages are interleaved in the log, for example I get:
Jun 12 04:16:12 daviel-v rc.sysinit: Mounting local filesystems succeeded
.
Jun 12 04:11:20 daviel-v rc.sysinit: Enabling swap space succeeded
.
Jun 12 04:11:20 daviel-v init: Entering runlevel: 3
Jun 12 04:11:22 daviel-v modprobe: can't locate module lo:0
.
Jun 12 04:11:25 daviel-v kernel: scsi : detected 1 SCSI disk total.
.
Jun 12 04:11:22 daviel-v modprobe: can't locate module lo:5
.
Jun 12 04:11:25 daviel-v kernel: eth0: 3Com 3c905B Cyclone 100baseTx at
0xa800, 00:50:04:38:36:60, IRQ 10
.
Jun 12 04:11:22 daviel-v modprobe: can't locate module lo:23
.
Jun 12 04:11:22 daviel-v network: Bringing up interface lo succeeded
Jun 12 04:11:22 daviel-v modprobe: can't locate module eth0:0
.
Jun 12 04:16:24 daviel-v xntpd[468]: xntpd 3-5.93e Wed Apr 14 20:23:29 EDT
1999 (1)
.
Jun 12 04:11:23 daviel-v modprobe: can't locate module eth0:8
.
Jun 12 04:16:24 daviel-v xntpd: xntpd startup succeeded
.
Jun 12 04:11:23 daviel-v modprobe: can't locate module eth0:49
Jun 12 04:11:23 daviel-v network: Bringing up interface eth0 succeeded
.

I have messages interleaved with different timestamp streams, which I
conjectured are from different processors. When fsck ran, I got several
messages about could not locate lo0 and eth0; I wondered if one processor
was trying to load the modules while the other one was still running fsck.
Is this true, and if so am I supposed to have configured something in the
sysinit for SMP?

I have softdog built in the kernel (as suggested in softdog-1.2/README)
and /sbin/softdog started in rc.d/rc.local (after all fsck). On a couple
of occasions it seemed as if softdog rebooted the machine while it was
still running fsck, so again I wondered if somehow CPU#2 had started
softdog while CPU# was still busy running fsck, and softdog had then
failed to write the testfile or something an triggered a shutdown.

Andrew Daviel

Andrew Daviel
http://vancouver-webpages.com/andrew
Deniable unless digitally signed.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/