Re: [PATCH v2 2/2] x86/cpufeatures: Enumerate new AVX512 BFLOAT16 instructions

From: Fenghua Yu
Date: Wed Jun 19 2019 - 17:48:50 EST

Next message: Bjorn Helgaas: "Re: [PATCH v2] PCI/P2PDMA: Root complex whitelist should not apply when an IOMMU is present"
Previous message: David Miller: "Re: [PATCH net-next v1] net: stmmac: initialize the reset delay array"
In reply to: Borislav Petkov: "Re: [PATCH v2 2/2] x86/cpufeatures: Enumerate new AVX512 BFLOAT16 instructions"
Next in thread: Borislav Petkov: "Re: [PATCH v2 2/2] x86/cpufeatures: Enumerate new AVX512 BFLOAT16 instructions"

On Wed, Jun 19, 2019 at 07:31:40PM +0200, Borislav Petkov wrote:
> On Mon, Jun 17, 2019 at 11:00:16AM -0700, Fenghua Yu wrote:
> > AVX512 Vector Neural Network Instructions (VNNI) in Intel Deep Learning
> > Boost support BFLOAT16 format (BF16).
>
> That sentence is a mouthful and I have no clue what it means. Marketing
> junk? If so, either rewrite it for mere mortals or kill it.
>
> > BF16 is a short version of FP32
>
> FP32?
>
> Please write out.
>
> > and has several advantages over FP16.
>
> Ditto.
>
> > BF16 offers more than enough range for
> > deep learning training tasks and doesn't need to handle hardware exception
> > as this is a performance optimization. FP32 accumulation after the
> > multiply is essential to achieve sufficient numerical behavior on an
> > application level.
> >
> > AVX512 BFLOAT16 instructions can be enumerated by:
> > CPUID.7.1:EAX[bit 5] AVX512_BF16
> >
> > Use word 12, which is empty now, to hold features in CPUID.7.1:EAX
> > including AVX512_BF16.
>
> ... because that leaf is features only, right?
>
> > Leaf CPUID_DUMMY is renamed as CPUID_7_1_EAX.
>
> That's obvious from the patch, ain't it?
>
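
[Note on the formats named in the quoted commit message: FP32 is the IEEE-754 32-bit single-precision format, and BF16 keeps its upper 16 bits (1 sign bit, 8 exponent bits, 7 fraction bits), which is why it retains FP32's range. The sketch below is only an illustration and not part of the patch. The function names are hypothetical, and it truncates where a production conversion would round and handle NaN.]

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert FP32 to BF16 by keeping the upper 16 bits
 * of the binary32 encoding. Truncation only; no rounding or NaN handling. */
static inline uint16_t fp32_to_bf16_trunc(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof(bits));   /* reinterpret without aliasing issues */
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: widen BF16 back to FP32 by zero-filling the low
 * 16 fraction bits. This direction is exact. */
static inline float bf16_to_fp32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}
```

[Values such as 1.0f or -2.0f, whose low 16 fraction bits are zero, survive the round trip unchanged. Other values lose the low fraction bits but keep their exponent.]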

Hi, Boris,

I revised the commit message per your comments. Nothing else in the
patch has changed.

I'm sending the updated patch here. Does it look right now, or should I
send it to you in a separate thread?

Thanks.

-Fenghua
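
[Note: the enumeration quoted above, CPUID.7.1:EAX[bit 5], can be checked from userspace roughly as follows. This is a minimal sketch and not the kernel code from the patch. The macro and function names are hypothetical, and the CPUID path assumes a GCC-compatible compiler that provides <cpuid.h>.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position from the commit message: CPUID.7.1:EAX[bit 5].
 * The macro name is hypothetical. */
#define AVX512_BF16_EAX_BIT 5u

/* Decode the AVX512_BF16 flag from an EAX value that was read from
 * CPUID leaf 7, subleaf 1. */
static inline bool eax_has_avx512_bf16(uint32_t eax)
{
    return ((eax >> AVX512_BF16_EAX_BIT) & 1u) != 0;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Query the running CPU. Returns false if leaf 7 or its subleaf 1 is
 * not implemented. */
static inline bool cpu_has_avx512_bf16(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7, subleaf 0: EAX reports the highest supported subleaf. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) || eax < 1)
        return false;

    if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx))
        return false;

    return eax_has_avx512_bf16(eax);
}
#endif
```

[On a kernel carrying a patch like this one, the flag would normally be read from the kernel's own feature reporting instead of issuing CPUID directly.]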
