2.6.34 Northbridge Chipset Errors on HP Proliant 4 x Opteron in x86_64 mode

From: Jeffrey Merkey
Date: Tue Jun 29 2010 - 17:13:12 EST


On a 4 x Opteron HP Proliant Server with a CCISS array controller in
x86_64 mode, under very heavy (saturated) disk IO, 2.6.34 reports the
following error:

Jun 29 02:02:08 kernel: Northbridge Error, node 0, core: 0
Jun 29 02:02:08 kernel: ECC/ChipKill ECC error.
Jun 29 02:02:08 kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0xc7358280
Jun 29 02:02:08 kernel: EDAC amd64: get_channel_from_ecc_syndrome: error reading F3x180.
Jun 29 02:02:08 kernel: EDAC MC0: CE page 0xc7358, offset 0x280, grain 0, syndrome 0xa4c1, row 3, channel 0, label "": amd64_edac
Jun 29 02:03:21 kernel: Northbridge Error, node 0
Jun 29 02:03:21 kernel: ECC/ChipKill ECC error.
Jun 29 02:03:21 kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0xc7358280
Jun 29 02:03:21 kernel: EDAC amd64: get_channel_from_ecc_syndrome: error reading F3x180.
Jun 29 02:03:21 kernel: EDAC MC0: CE page 0xc7358, offset 0x280, grain 0, syndrome 0xa4c1, row 3, channel 0, label "": amd64_edac
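
Both events report the same ERROR_ADDRESS, and the page and offset on the EDAC MC0 lines follow directly from it. A minimal Python sketch of that arithmetic, assuming 4 KiB pages (a page shift of 12, the x86_64 default); the name split_error_address is illustrative and does not come from the kernel:

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, the x86_64 default


def split_error_address(addr):
    """Split a physical address into (page frame number, offset within page)."""
    page = addr >> PAGE_SHIFT
    offset = addr & ((1 << PAGE_SHIFT) - 1)
    return page, offset


page, offset = split_error_address(0xC7358280)
print(hex(page), hex(offset))  # 0xc7358 0x280
```

The result matches the "page 0xc7358, offset 0x280" fields in the log above, so the two corrected errors (CE) point at the same physical location.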

The error is reproducible by subjecting the server to heavy disk
load, streaming more than 350 MB/s to disk.
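
The report does not say how the load was generated. As a rough sketch only, a sequential write stream and its throughput can be measured as below. The function name stream_write, the block size, and the target path are all assumptions for illustration, and a single Python writer may well fall short of 350 MB/s, so several parallel writers or a dedicated tool such as dd or fio may be needed to saturate the array.

```python
import os
import time


def stream_write(path, total_bytes, block_size=1 << 20):
    """Write total_bytes of zeros sequentially to path; return MB/s achieved."""
    block = bytes(block_size)  # one reusable zero-filled buffer
    written = 0
    start = time.monotonic()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            written += f.write(chunk)
        os.fsync(f.fileno())  # make sure the data actually reaches the disk
    elapsed = max(time.monotonic() - start, 1e-9)
    return written / elapsed / 1e6
```

For example, stream_write("/mnt/array/testfile", 8 << 30) would stream 8 GiB to a hypothetical mount point on the CCISS array and return the measured rate in MB/s.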

Jeff