floating point computation error caused by eagerfpu

From: Lei Chen
Date: Fri Mar 23 2018 - 03:23:49 EST


Hi,
I'm trying to figure out the root cause of a floating point
calculation error on kernel 4.4.98. My coworker ran a SHA1 test tool,
and the generated SHA1 did not match the expected value. Strangely,
the test passes on one VM. After a lot of comparison between this
VM and the bare metal x86-64 environment, we found a suspicious
difference: the VM uses 'lazy' mode FPU context switching while the
bare metal server uses 'eager' mode. I then rebuilt the kernel with
"eagerfpu=DISABLE" as the default. I'm happy to see that the test now
passes across different platforms (different VMs and different x86
servers).
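
For reference, here is a minimal sketch of how one could check from
userspace which mode a machine ended up in. It assumes that a 4.4-era
kernel exposes the synthetic "eagerfpu" flag on the "flags" line of
/proc/cpuinfo when eager mode is active. The helper names
has_cpu_flag() and eagerfpu_in_cpuinfo() are hypothetical, not
existing kernel or libc APIs.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `flag` appears as a whole
 * whitespace-delimited token in `line`, 0 otherwise.
 */
static int has_cpu_flag(const char *line, const char *flag)
{
	size_t n = strlen(flag);
	const char *p = line;

	if (n == 0)
		return 0;

	while ((p = strstr(p, flag)) != NULL) {
		/* The match must not be part of a longer flag name. */
		int start_ok = (p == line) || p[-1] == ' ' ||
			       p[-1] == '\t' || p[-1] == ':';
		int end_ok = p[n] == '\0' || p[n] == ' ' ||
			     p[n] == '\t' || p[n] == '\n';

		if (start_ok && end_ok)
			return 1;
		p += n;
	}
	return 0;
}

/*
 * Hypothetical helper: scan a cpuinfo-style file for the first
 * "flags" line. Returns 1 if it lists "eagerfpu" (assumed flag
 * name), 0 if it does not, and -1 if the file cannot be opened.
 */
static int eagerfpu_in_cpuinfo(const char *path)
{
	char buf[8192];
	int found = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	while (fgets(buf, sizeof(buf), f)) {
		if (strncmp(buf, "flags", 5) == 0) {
			found = has_cpu_flag(buf, "eagerfpu");
			break;
		}
	}
	fclose(f);
	return found;
}
```

Calling eagerfpu_in_cpuinfo("/proc/cpuinfo") on the VM and on the bare
metal server should show whether the two really differ in this
respect, provided the flag name assumption holds for the kernel in
use.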

We don't have any custom FPU settings or modifications to the native
Linux 4.4.98 kernel code. As I understand it, during boot the kernel
chooses eagerfpu mode automatically according to the CPU's
capabilities, so it should just work if the CPU supports eager
mode. But the test results suggest that there might be FPU context
corruption. I have googled around but did not find any similar
reports. Could the FPU experts shed some light on this issue?

Thanks,
Lei Chen
