NFS root hangs

From: Jeremy Sanders
Date: Tue Nov 11 2008 - 09:15:21 EST


Hi - We're running several diskless x86-64 systems, each with an MS-6702
motherboard and an Athlon 64 3400+ CPU. The network driver is an onboard
R8169 (rev 10) using gigabit ethernet. The kernel is 2.6.26.6-49.fc8 (i.e.
the latest Fedora 8 kernel). The root partition is mounted over NFS (v3) in
an initrd init script.

This setup works fine for some dual core Athlon 64s and single core x86
Pentium 4s. However, the diskless single core Athlon 64 systems lock up
randomly after minutes or tens of minutes of idleness.

They do not print any oops debugging information. They are not pingable. They
do not respond to Alt+SysRq commands. The CPU appears to stay active during
the lockup, as the machines generate a lot of heat while hung. We've tried
the noapic and noacpi boot options. We've also tried enabling
nmi_watchdog=2, which doesn't give any debugging information either. We've
also tried adjusting the various NFS mount options
(rsize/wsize/nfsvers/udp/tcp), with no effect. In 32-bit x86 mode the systems
also hang, but take a lot longer to do so.

We suspect the R8169 is at fault, as our other systems work. The systems used
to work, but something has since triggered this problem. Reverting to older
kernels does not fix it.

Does anyone have any ideas or tips on how to debug this problem?

Thanks

Jeremy

