Re: [PATCH 1/3] perf, x86: Add new cache events table for Haswell

From: Ingo Molnar
Date: Mon Mar 23 2015 - 09:55:35 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Mon, Mar 23, 2015 at 10:45:07AM +0100, Ingo Molnar wrote:
> >
> > * Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> >
> > > From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> > >
> > > Haswell offcore events are quite different from Sandy Bridge.
> > > Add a new table to handle Haswell properly.
> > >
> > > Note that the offcore bits listed in the SDM are not quite correct
> > > (this is currently being fixed). An up-to-date list of bits is
> > > in the patch.
> > >
> > > The basic setup is similar to Sandy Bridge. The prefetch columns
> > > have been removed, as prefetch counting is not very reliable
> > > on Haswell. One L1 event that is not in the event list anymore
> > > has also been removed.
> > >
> > > - data reads do not include code reads (comparable to earlier Sandy
> > > Bridge tables)
> > > - data counts include speculative execution (except L1 write, dtlb, bpu)
> > > - remote node access includes remote memory, remote cache, and remote mmio.
> > > - prefetches are not included in the counts for consistency
> > > (different from Sandy Bridge, which includes prefetches in the remote node)
> > >
> > > The events with additional caveats have references to the specification update.
> >
> > > + [ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_UOPS_RETIRED.ALL_LOADS, HSM30 */
> > > + [ C(RESULT_ACCESS) ] = 0x82d0, /* MEM_UOPS_RETIRED.ALL_STORES, HSM30 */
> > > + [ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_UOPS_RETIRED.ALL_LOADS, HSM30 */
> > > + [ C(RESULT_ACCESS) ] = 0x82d0, /* MEM_UOPS_RETIRED.ALL_STORES, HSM30 */
> >
> > So that 'HSM30' is code for the specification update?
>
> Yep; found it in:
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-mobile-specification-update.pdf
>
> > You'll need to properly describe HSM30 at least once instead of using
> > obfuscation.
>
>
> HSM30.
> Problem:
> Performance Monitor Counters May Produce Incorrect Results
> When operating with SMT enabled, a memory at-retirement performance monitoring
> event (from the list below) may be dropped or may increment an enabled event on the
> corresponding counter with the same number on the physical core's other thread rather
> than the thread experiencing the event. Processors with SMT disabled in BIOS are not
> affected by this erratum.
> The list of affected memory at-retirement events is as follows:
> MEM_UOP_RETIRED.LOADS
> MEM_UOP_RETIRED.STORES
> MEM_UOP_RETIRED.LOCK
> MEM_UOP_RETIRED.SPLIT
> MEM_UOP_RETIRED.STLB_MISS
> MEM_LOAD_UOPS_RETIRED.HIT_LFB
> MEM_LOAD_UOPS_RETIRED.L1_HIT
> MEM_LOAD_UOPS_RETIRED.L2_HIT
> MEM_LOAD_UOPS_RETIRED.L3_HIT
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE
> MEM_LOAD_UOPS_RETIRED.L3_MISS
> MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM
> MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM
> MEM_LOAD_UOPS_RETIRED.L2_MISS
> Implication:
> Due to this erratum, certain performance monitoring events will produce unreliable
> results during hyper-threaded operation.
> Workaround:
> None identified.
> Status:
> For the steppings affected, see the Summary Table of Changes.
>
> Stephane is working on patches to address this. It affects multiple
> generations.

Ok - then at minimum a short summary of 'HSM30' should be added to
the code, so that people know what's happening, without having to dig
out Intel documents (which might be gone years down the line).
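
A minimal sketch of what such an in-code summary might look like, purely
as an illustration. The raw values 0x81d0 and 0x82d0 come from the quoted
patch; the macro and helper names are hypothetical and not taken from the
patch, and the decoding assumes the usual x86 raw event layout (event
select in bits 0-7, unit mask in bits 8-15):

```c
#include <stdint.h>

/*
 * HSM30 (summarized from the Intel specification update quoted above):
 * with SMT enabled, memory at-retirement events may be dropped, or may
 * increment the same-numbered counter on the core's other thread
 * instead of the thread that experienced the event. No workaround is
 * identified. Processors with SMT disabled in BIOS are not affected.
 */

/* Hypothetical names; the raw values are the ones used in the patch. */
#define HSW_MEM_UOPS_RETIRED_ALL_LOADS   0x81d0u  /* see HSM30 above */
#define HSW_MEM_UOPS_RETIRED_ALL_STORES  0x82d0u  /* see HSM30 above */

/* Event select: low byte of the raw config (assumed layout). */
static inline uint8_t raw_event_select(uint32_t config)
{
	return (uint8_t)(config & 0xff);
}

/* Unit mask: second byte of the raw config (assumed layout). */
static inline uint8_t raw_umask(uint32_t config)
{
	return (uint8_t)((config >> 8) & 0xff);
}
```

With the erratum described once in a block comment, the per-entry
comments in the table can keep the short 'HSM30' tag and still be
understandable without the Intel document at hand.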

Thanks,

Ingo