Re: [PATCH] x86: Run checksumming in parallel accross multiple alu's

From: Ingo Molnar
Date: Mon Oct 28 2013 - 12:24:47 EST



* Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:

> Looking at the specific cpu counters we get this:
>
> Base:
> Total time: 0.179 [sec]
>
> Performance counter stats for 'perf bench sched messaging -- bash -c echo 1 > /sys/module/csum_test/parameters/test_fire' (20 runs):
>
> 1571.304618 task-clock # 5.213 CPUs utilized ( +- 0.45% )
> 14,423 context-switches # 0.009 M/sec ( +- 4.28% )
> 2,710 cpu-migrations # 0.002 M/sec ( +- 2.83% )

Hm, for this second round of measurements were you using 'perf stat
-a -C ...'?

The most accurate method of measurement for such single-threaded
workloads is something like:

taskset 0x1 perf stat -a -C 0 --repeat 20 ...

this will bind your workload to CPU#0, and will do PMU measurements
only there - without mixing in other CPUs or workloads.
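
One detail that is easy to trip over: taskset takes a hexadecimal CPU
bitmask, while 'perf stat -C' takes a zero-based CPU list, so mask 0x1
corresponds to CPU 0. A minimal sketch of keeping the two in sync,
assuming a POSIX shell (cpu_to_mask is a hypothetical helper name, not
part of either tool):

```shell
#!/bin/sh
# Hypothetical helper: convert a zero-based CPU index into the
# hexadecimal affinity mask that taskset expects (bit N set for CPU N).
cpu_to_mask() {
    printf '0x%x\n' $((1 << $1))
}

cpu=0
mask=$(cpu_to_mask "$cpu")

# Print the command rather than run it; "..." stands for the workload,
# which is left elided here as in the mail above.
echo "taskset $mask perf stat -a -C $cpu --repeat 20 ..."
```

Using one variable for both arguments keeps the pinned CPU and the
measured CPU from drifting apart when you pick a different core.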

Thanks,

Ingo