Re: x86_64: movntdq rarely stores bad data (movdqu works fine). Kernel bug, fried CPU or glibc bug?

From: Sergei Trofimovich
Date: Sun Jun 17 2018 - 17:40:27 EST


On Sat, 16 Jun 2018 22:22:50 +0100
Sergei Trofimovich <slyich@xxxxxxxxx> wrote:

> TL;DR: on master, the string/test-memmove glibc test fails on my machine
> and I don't know why. Other tests work fine.
> ...
> This fails:
> loop {
> movdqu [src++],%xmm0
> movntdq %xmm0,[dst++]
> }
> sfence
> This works:
> loop {
> movdqu [src++],%xmm0
> movdqu %xmm0,[dst++]
> }
> sfence
> ...
> If there are no obvious problems with glibc's memmove() or my small test,
> what can I do to rule out or pin down a hardware or kernel problem?
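
For illustration, the two quoted loops could be written in C with SSE2
intrinsics roughly as below. This is a sketch, not the code from the glibc
test or from the original small test: the function names copy_nontemporal
and copy_regular are made up here, and it assumes an x86_64 compiler that
provides <emmintrin.h> (GCC or Clang).

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics; also provides _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name. Copies n_chunks 16-byte chunks using
 * movdqu loads and movntdq (non-temporal) stores.
 * dst must be 16-byte aligned, as movntdq requires. */
static void copy_nontemporal(void *dst, const void *src, size_t n_chunks)
{
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;

    assert(((uintptr_t)dst & 15) == 0);
    for (size_t i = 0; i < n_chunks; i++) {
        __m128i v = _mm_loadu_si128(s + i); /* movdqu load */
        _mm_stream_si128(d + i, v);         /* movntdq store */
    }
    _mm_sfence();
}

/* Hypothetical name. Same loop, but with ordinary unaligned
 * movdqu stores that go through the cache. */
static void copy_regular(void *dst, const void *src, size_t n_chunks)
{
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;

    for (size_t i = 0; i < n_chunks; i++) {
        __m128i v = _mm_loadu_si128(s + i); /* movdqu load */
        _mm_storeu_si128(d + i, v);         /* movdqu store */
    }
    _mm_sfence();
}
```

The only difference between the two functions is the store instruction.
The non-temporal store bypasses the cache and writes to memory more
directly than the regular store does.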

Found the cause: bad RAM module.

After I tweaked the test to allocate most of the available physical RAM,
I got a fully reproducible failure.

I unplugged the RAM modules one by one and reran the test. That way I
narrowed it down to one bad module. Removing that single bad module fixed
the string/test-memmove test on this machine \o/
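
A check of this kind can be sketched in C as below. This is not the
actual change made to test-memmove, which is not shown in this message.
The function name stream_copy_mismatches and the fill pattern are made up
for illustration, and the code assumes x86_64 with SSE2 and a C11 libc
that provides aligned_alloc().

```c
#include <emmintrin.h> /* SSE2 intrinsics; also provides _mm_sfence */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: fill a source buffer of n_bytes with a
 * position-dependent pattern, copy it with movntdq stores, and
 * return the number of 64-bit words that differ afterwards.
 * Returns (size_t)-1 if allocation fails.
 * n_bytes must be a non-zero multiple of 16. */
static size_t stream_copy_mismatches(size_t n_bytes)
{
    size_t n_words = n_bytes / sizeof(uint64_t);
    uint64_t *src = aligned_alloc(16, n_bytes);
    uint64_t *dst = aligned_alloc(16, n_bytes);
    size_t bad = 0;

    if (!src || !dst) {
        free(src);
        free(dst);
        return (size_t)-1;
    }

    /* Writing every word also forces the pages to be backed by RAM. */
    for (size_t i = 0; i < n_words; i++)
        src[i] = (uint64_t)i * 0x9E3779B97F4A7C15ULL;

    for (size_t i = 0; i < n_bytes / 16; i++) {
        __m128i v = _mm_loadu_si128((const __m128i *)src + i);
        _mm_stream_si128((__m128i *)dst + i, v); /* movntdq store */
    }
    _mm_sfence();

    for (size_t i = 0; i < n_words; i++)
        if (dst[i] != src[i])
            bad++;

    free(src);
    free(dst);
    return bad;
}
```

Calling it with a size close to the amount of free physical RAM makes it
more likely that the buffers land on a faulty region. A non-zero result
on repeated runs points at hardware rather than at the copy loop itself.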

Sorry for the noise!

--

Sergei

Attachment: pgpJ2kvgkT1fU.pgp
Description: OpenPGP digital signature