Re: [patch V3 00/66] x86/fpu: Spring cleaning and PKRU sanitizing
From: Dey, Megha
Date:  Tue Jun 22 2021 - 14:56:34 EST
Hi Thomas,
On 6/18/2021 7:48 PM, Thomas Gleixner wrote:
The main parts of this series are:
   - Yet more bug fixes
   - Simplification and removal/replacement of redundant and/or
     overengineered code.
   - Name space cleanup as the existing names were just a permanent source
     of confusion.
   - Clear seperation of user ABI and kernel internal state handling.
   - Removal of PKRU from being XSTATE managed in the kernel because PKRU
     has to be eagerly restored on context switch and keeping it in sync
     in the xstate buffer is just pointless overhead and fragile.
     The kernel still XSAVEs PKRU on context switch but the value in the
     buffer is not longer used and never restored from the buffer.
     This still needs to be cleaned up, but the series is already 40+
     patches large and the cleanup of this is not a functional problem.
     The functional issues of PKRU management are fully addressed with the
     series as is.
   - Cleanup of fpu signal restore
     - Make the fast path self contained. Handle #PF directly and skip
       the slow path on any other exception as that will just end up
       with the same result that the frame is invalid. This allows
       the compiler to optimize the slow path out for 64bit kernels
       w/o ia32 emulation.
     - Reduce code duplication and unnecessary operations
       
It applies on top of
   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
and is also available via git:
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu
This is a follow up to V2 which can be found here:
      https://lore.kernel.org/r/20210614154408.673478623@xxxxxxxxxxxxx
I tested the x86/fpu branch using both AVX2(intree) and AVX512(out of 
tree) crypto code.
I used the tcrypt test module in the kernel and ran the prime95 workload 
(which finds prime numbers using AVX2 or AVX512) as a background process 
to make sure that the AVX states don't get screwed up after we run the 
crypto in kernel.
I did not see any issues with this branch and all tcrypt tests run as 
expected. I tested using the SHA1/256, AES-CTR, AES-GCM, camelia and 
crc32t10diff crypto algorithms.
Thanks,
Megha
Changes vs. V2:
   - Fixed the testing fallout (Dave, Kan)
   - Fixed a few issues found by myself when going through the lot
     with a fine comb, especially MXCSR handling
   - Drop the FNSAVE optimizations
   - Cleanup of signal restore
   - Addressed review comments, mostly comments and a hopefully better
     naming scheme which now just uses the instruction names and
     consolidates everything else on save/restore so it's close to the way
     how the hardware works.
   - A few cleanups and simplifications on the way (mostly regset related).
   - Picked up tags
With the above I'm not intending to do any further surgery on that
code at the moment, though there is still room for improvement which
can and has to be worked on when new bits are added.
Thanks,
	tglx
---
  arch/x86/events/intel/lbr.c          |    6
  arch/x86/include/asm/fpu/internal.h  |  211 +++-------
  arch/x86/include/asm/fpu/xstate.h    |   70 ++-
  arch/x86/include/asm/pgtable.h       |   57 --
  arch/x86/include/asm/pkeys.h         |    9
  arch/x86/include/asm/pkru.h          |   62 +++
  arch/x86/include/asm/processor.h     |    9
  arch/x86/include/asm/special_insns.h |   14
  arch/x86/kernel/cpu/common.c         |   34 -
  arch/x86/kernel/fpu/core.c           |  276 +++++++------
  arch/x86/kernel/fpu/init.c           |   15
  arch/x86/kernel/fpu/regset.c         |  220 ++++++-----
  arch/x86/kernel/fpu/signal.c         |  423 +++++++++------------
  arch/x86/kernel/fpu/xstate.c         |  693 ++++++++++++++---------------------
  arch/x86/kernel/process.c            |   22 -
  arch/x86/kernel/process_64.c         |   28 +
  arch/x86/kernel/traps.c              |    5
  arch/x86/kvm/svm/sev.c               |    1
  arch/x86/kvm/x86.c                   |   56 +-
  arch/x86/mm/extable.c                |    2
  arch/x86/mm/fault.c                  |    2
  arch/x86/mm/pkeys.c                  |   22 -
  include/linux/pkeys.h                |    4
  23 files changed, 1060 insertions(+), 1181 deletions(-)