Re: [PATCH 1/1] sched/cputime: Mitigate performance regression in times()/clock_gettime()

From: Stanislaw Gruszka
Date: Fri Aug 12 2016 - 08:13:18 EST


Hi

On Wed, Aug 10, 2016 at 01:26:41PM +0200, Ingo Molnar wrote:
> Nice detective work! I'm wondering, where do we stand if compared with a
> pre-6e998916dfe3 kernel?
>
> I admit this is a difficult question: 6e998916dfe3 does not revert cleanly and I
> suspect v3.17 does not run easily on a recent distro. Could you attempt to revert
> the bad effects of 6e998916dfe3 perhaps, just to get numbers - i.e. don't try to
> make the result correct, just see what the performance gap is, roughly.
>
> If there's still a significant gap then it might make sense to optimize this some
> more.

I measured the performance of a (partial) revert on 4.7 using the mmtests
instructions from Giovanni, and also tested another possible fix (draft version):

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 75f98c5..54fdf6d 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -294,6 +294,8 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 	unsigned int seq, nextseq;
 	unsigned long flags;

+	(void) task_sched_runtime(tsk);
+
 	rcu_read_lock();
 	/* Attempt a lockless read on the first round. */
 	nextseq = 0;
@@ -308,7 +310,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 			task_cputime(t, &utime, &stime);
 			times->utime += utime;
 			times->stime += stime;
-			times->sum_exec_runtime += task_sched_runtime(t);
+			times->sum_exec_runtime += t->se.sum_exec_runtime;
 		}
 		/* If lockless access failed, take the lock. */
 		nextseq = 1;
---
The mmtests benchmark results are below (the full compare-kernels.sh output is in the attachment):

                                    vanilla-4.7       revert           prefetch         patch
Amean real-pound_clock_gettime-2    4.74 (  0.00%)    3.04 ( 35.93%)   4.09 ( 13.81%)   1.30 ( 72.59%)
Amean real-pound_clock_gettime-5    5.49 (  0.00%)    5.00 (  8.97%)   5.34 (  2.72%)   1.03 ( 81.16%)
Amean real-pound_clock_gettime-8    6.12 (  0.00%)    4.91 ( 19.73%)   5.97 (  2.40%)   0.90 ( 85.27%)
Amean real-pound_clock_gettime-12   6.68 (  0.00%)    4.90 ( 26.66%)   6.02 (  9.75%)   0.88 ( 86.89%)
Amean real-pound_clock_gettime-21   7.21 (  0.00%)    5.13 ( 28.85%)   6.70 (  7.09%)   0.87 ( 87.91%)
Amean real-pound_clock_gettime-30   7.66 (  0.00%)    5.22 ( 31.80%)   7.17 (  6.39%)   0.92 ( 88.01%)
Amean real-pound_clock_gettime-48   7.91 (  0.00%)    5.36 ( 32.22%)   7.30 (  7.72%)   0.95 ( 87.97%)
Amean real-pound_clock_gettime-79   7.95 (  0.00%)    5.35 ( 32.73%)   7.34 (  7.66%)   1.06 ( 86.66%)
Amean real-pound_clock_gettime-96   8.00 (  0.00%)    5.33 ( 33.31%)   7.38 (  7.73%)   1.13 ( 85.82%)
Amean real-pound_times-2            5.61 (  0.00%)    3.55 ( 36.76%)   4.53 ( 19.23%)   2.29 ( 59.28%)
Amean real-pound_times-5            5.66 (  0.00%)    4.32 ( 23.79%)   4.75 ( 16.18%)   3.65 ( 35.46%)
Amean real-pound_times-8            5.98 (  0.00%)    4.97 ( 16.87%)   5.96 (  0.35%)   3.62 ( 39.40%)
Amean real-pound_times-12           6.58 (  0.00%)    4.94 ( 24.93%)   6.04 (  8.32%)   3.63 ( 44.89%)
Amean real-pound_times-21           7.19 (  0.00%)    5.18 ( 28.01%)   6.68 (  7.13%)   3.65 ( 49.22%)
Amean real-pound_times-30           7.67 (  0.00%)    5.27 ( 31.29%)   7.16 (  6.63%)   3.62 ( 52.76%)
Amean real-pound_times-48           7.88 (  0.00%)    5.36 ( 31.98%)   7.28 (  7.58%)   3.65 ( 53.71%)
Amean real-pound_times-79           7.99 (  0.00%)    5.39 ( 32.52%)   7.40 (  7.42%)   3.65 ( 54.25%)
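
For readers who have not seen the benchmark, the two workloads in the results can be pictured as tight loops over the two syscalls named in the subject line. The following is only a rough sketch and not the mmtests source: the function names, the loop shape and the choice of CLOCK_PROCESS_CPUTIME_ID are assumptions made for illustration. The real benchmark presumably runs such loops from many threads at once, which is where the cost of summing per-thread runtimes shows up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/times.h>
#include <time.h>

/* Hypothetical inner loop: call times() n times, return how many calls succeeded. */
long pound_times(long n)
{
	struct tms buf;
	long ok = 0;

	assert(n >= 0);
	for (long i = 0; i < n; i++)
		if (times(&buf) != (clock_t)-1)
			ok++;
	return ok;
}

/* Hypothetical inner loop: read the process CPU-time clock n times. */
long pound_clock_gettime(long n)
{
	struct timespec ts;
	long ok = 0;

	assert(n >= 0);
	for (long i = 0; i < n; i++)
		if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) == 0)
			ok++;
	return ok;
}
```

Both calls need the group-wide CPU time, so each iteration ends up in thread_group_cputime(), the function touched by the diff above.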

The patch works because we update sum_exec_runtime on the current thread,
which ensures we see a proper sum_exec_runtime value on different CPUs. I
tested it with the reproducers from commits 6e998916dfe32 and d670ec13178d0;
the patch did not break them. I'm going to run some other tests.
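
To make the difference between the two approaches concrete, here is a small single-threaded model in userspace C. It is a sketch under loud assumptions, not kernel code: struct model_thread, model_update_curr() and both summing functions are hypothetical names, and the model ignores locking, memory ordering and the tick-driven updates that refresh other threads' counters in a real kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread accounting state. */
struct model_thread {
	uint64_t sum_exec_runtime;	/* runtime already accounted, in ns */
	uint64_t exec_start;		/* time of the last accounting, in ns */
	int running;			/* non-zero if on a CPU right now */
};

/* Rough analogue of update_curr(): fold in the time since exec_start. */
void model_update_curr(struct model_thread *t, uint64_t now)
{
	if (t->running && now > t->exec_start) {
		t->sum_exec_runtime += now - t->exec_start;
		t->exec_start = now;
	}
}

/* Old behaviour in this model: refresh every thread, then sum. */
uint64_t group_runtime_refresh_all(struct model_thread *th, size_t n,
				   uint64_t now)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		model_update_curr(&th[i], now);
		sum += th[i].sum_exec_runtime;
	}
	return sum;
}

/* Draft behaviour in this model: refresh only the caller, read the rest. */
uint64_t group_runtime_refresh_self(struct model_thread *th, size_t n,
				    size_t self, uint64_t now)
{
	uint64_t sum = 0;

	assert(self < n);
	model_update_curr(&th[self], now);
	for (size_t i = 0; i < n; i++)
		sum += th[i].sum_exec_runtime;
	return sum;
}
```

In the model the second variant does one update per call instead of one per thread, which mirrors where the draft moves the task_sched_runtime() call.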

The patch is a draft version for early review. task_sched_runtime() will be
simplified (since it's called only on the current thread) and possibly split
into two functions: one that calls update_curr() and another that returns
sum_exec_runtime (ensuring it's consistent on 32-bit arches).
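
The 32-bit concern is that a 64-bit counter cannot be loaded in a single instruction there, so a reader could combine the low half of one value with the high half of another. One common way to get a consistent read is a sequence counter with retry. The sketch below shows that technique in userspace C11 under the assumption of a single writer. It is only an illustration: this mail does not say which mechanism the final patch will use, and struct split_runtime, runtime_write() and runtime_read() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit counter stored as two 32-bit halves plus a sequence. */
struct split_runtime {
	_Atomic uint32_t seq;	/* odd while a write is in progress */
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Writer side (single writer assumed): bump seq around the update. */
void runtime_write(struct split_runtime *r, uint64_t v)
{
	uint32_t s = atomic_load_explicit(&r->seq, memory_order_relaxed);

	assert((s & 1u) == 0);	/* no other write may be in progress */
	atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&r->hi, (uint32_t)(v >> 32), memory_order_relaxed);
	atomic_store_explicit(&r->seq, s + 2, memory_order_release);
}

/* Reader side: retry until both halves come from the same write. */
uint64_t runtime_read(struct split_runtime *r)
{
	uint32_t s1, s2, lo, hi;

	do {
		s1 = atomic_load_explicit(&r->seq, memory_order_acquire);
		lo = atomic_load_explicit(&r->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&r->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&r->seq, memory_order_relaxed);
	} while ((s1 & 1u) || s1 != s2);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer; it only pays for a retry when a write overlaps the read.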

Stanislaw

poundtime
vanilla-4.7 revert prefetch mask
Min real-pound_clock_gettime-2 4.38 ( 0.00%) 2.73 ( 37.67%) 3.62 ( 17.35%) 1.19 ( 72.83%)
Min real-pound_clock_gettime-5 5.40 ( 0.00%) 4.76 ( 11.85%) 4.49 ( 16.85%) 0.99 ( 81.67%)
Min real-pound_clock_gettime-8 5.83 ( 0.00%) 4.88 ( 16.30%) 5.91 ( -1.37%) 0.88 ( 84.91%)
Min real-pound_clock_gettime-12 6.55 ( 0.00%) 4.87 ( 25.65%) 5.98 ( 8.70%) 0.84 ( 87.18%)
Min real-pound_clock_gettime-21 7.11 ( 0.00%) 5.10 ( 28.27%) 6.63 ( 6.75%) 0.85 ( 88.05%)
Min real-pound_clock_gettime-30 7.56 ( 0.00%) 5.20 ( 31.22%) 7.08 ( 6.35%) 0.87 ( 88.49%)
Min real-pound_clock_gettime-48 7.78 ( 0.00%) 5.24 ( 32.65%) 7.20 ( 7.46%) 0.92 ( 88.17%)
Min real-pound_clock_gettime-79 7.89 ( 0.00%) 5.23 ( 33.71%) 7.20 ( 8.75%) 1.00 ( 87.33%)
Min real-pound_clock_gettime-96 7.88 ( 0.00%) 5.24 ( 33.50%) 7.29 ( 7.49%) 1.09 ( 86.17%)
Min real-pound_times-2 4.87 ( 0.00%) 3.19 ( 34.50%) 4.00 ( 17.86%) 2.06 ( 57.70%)
Min real-pound_times-5 5.59 ( 0.00%) 3.91 ( 30.05%) 4.61 ( 17.53%) 3.61 ( 35.42%)
Min real-pound_times-8 5.74 ( 0.00%) 4.88 ( 14.98%) 5.80 ( -1.05%) 3.56 ( 37.98%)
Min real-pound_times-12 6.44 ( 0.00%) 4.90 ( 23.91%) 6.00 ( 6.83%) 3.52 ( 45.34%)
Min real-pound_times-21 7.11 ( 0.00%) 5.11 ( 28.13%) 6.61 ( 7.03%) 3.59 ( 49.51%)
Min real-pound_times-30 7.60 ( 0.00%) 5.20 ( 31.58%) 7.03 ( 7.50%) 3.54 ( 53.42%)
Min real-pound_times-48 7.80 ( 0.00%) 5.24 ( 32.82%) 7.20 ( 7.69%) 3.61 ( 53.72%)
Min real-pound_times-79 7.92 ( 0.00%) 5.24 ( 33.84%) 7.31 ( 7.70%) 3.61 ( 54.42%)
Min real-pound_times-96 7.94 ( 0.00%) 5.24 ( 34.01%) 7.29 ( 8.19%) 3.58 ( 54.91%)
Min syst-pound_clock_gettime-2 8.54 ( 0.00%) 4.89 ( 42.74%) 6.98 ( 18.27%) 2.16 ( 74.71%)
Min syst-pound_clock_gettime-5 26.57 ( 0.00%) 23.29 ( 12.34%) 22.09 ( 16.86%) 4.47 ( 83.18%)
Min syst-pound_clock_gettime-8 45.82 ( 0.00%) 38.02 ( 17.02%) 46.61 ( -1.72%) 6.44 ( 85.95%)
Min syst-pound_clock_gettime-12 77.23 ( 0.00%) 56.61 ( 26.70%) 69.25 ( 10.33%) 9.34 ( 87.91%)
Min syst-pound_clock_gettime-21 147.44 ( 0.00%) 103.97 ( 29.48%) 134.76 ( 8.60%) 15.12 ( 89.74%)
Min syst-pound_clock_gettime-30 176.07 ( 0.00%) 117.81 ( 33.09%) 162.77 ( 7.55%) 15.95 ( 90.94%)
Min syst-pound_clock_gettime-48 182.93 ( 0.00%) 119.92 ( 34.44%) 168.06 ( 8.13%) 19.82 ( 89.17%)
Min syst-pound_clock_gettime-79 186.13 ( 0.00%) 123.31 ( 33.75%) 170.34 ( 8.48%) 22.90 ( 87.70%)
Min syst-pound_clock_gettime-96 187.05 ( 0.00%) 124.22 ( 33.59%) 172.67 ( 7.69%) 25.19 ( 86.53%)
Min syst-pound_times-2 9.55 ( 0.00%) 6.22 ( 34.87%) 7.80 ( 18.32%) 3.90 ( 59.16%)
Min syst-pound_times-5 27.68 ( 0.00%) 19.24 ( 30.49%) 22.76 ( 17.77%) 17.56 ( 36.56%)
Min syst-pound_times-8 45.11 ( 0.00%) 38.75 ( 14.10%) 45.15 ( -0.09%) 27.77 ( 38.44%)
Min syst-pound_times-12 76.60 ( 0.00%) 56.89 ( 25.73%) 71.06 ( 7.23%) 41.64 ( 45.64%)
Min syst-pound_times-21 145.25 ( 0.00%) 102.48 ( 29.45%) 136.15 ( 6.27%) 72.98 ( 49.76%)
Min syst-pound_times-30 175.03 ( 0.00%) 118.89 ( 32.07%) 161.32 ( 7.83%) 79.91 ( 54.34%)
Min syst-pound_times-48 183.61 ( 0.00%) 121.06 ( 34.07%) 167.26 ( 8.90%) 83.24 ( 54.66%)
Min syst-pound_times-79 187.18 ( 0.00%) 123.24 ( 34.16%) 173.22 ( 7.46%) 84.36 ( 54.93%)
Min syst-pound_times-96 188.88 ( 0.00%) 124.04 ( 34.33%) 173.52 ( 8.13%) 83.02 ( 56.05%)
Amean real-pound_clock_gettime-2 4.74 ( 0.00%) 3.04 ( 35.93%) 4.09 ( 13.81%) 1.30 ( 72.59%)
Amean real-pound_clock_gettime-5 5.49 ( 0.00%) 5.00 ( 8.97%) 5.34 ( 2.72%) 1.03 ( 81.16%)
Amean real-pound_clock_gettime-8 6.12 ( 0.00%) 4.91 ( 19.73%) 5.97 ( 2.40%) 0.90 ( 85.27%)
Amean real-pound_clock_gettime-12 6.68 ( 0.00%) 4.90 ( 26.66%) 6.02 ( 9.75%) 0.88 ( 86.89%)
Amean real-pound_clock_gettime-21 7.21 ( 0.00%) 5.13 ( 28.85%) 6.70 ( 7.09%) 0.87 ( 87.91%)
Amean real-pound_clock_gettime-30 7.66 ( 0.00%) 5.22 ( 31.80%) 7.17 ( 6.39%) 0.92 ( 88.01%)
Amean real-pound_clock_gettime-48 7.91 ( 0.00%) 5.36 ( 32.22%) 7.30 ( 7.72%) 0.95 ( 87.97%)
Amean real-pound_clock_gettime-79 7.95 ( 0.00%) 5.35 ( 32.73%) 7.34 ( 7.66%) 1.06 ( 86.66%)
Amean real-pound_clock_gettime-96 8.00 ( 0.00%) 5.33 ( 33.31%) 7.38 ( 7.73%) 1.13 ( 85.82%)
Amean real-pound_times-2 5.61 ( 0.00%) 3.55 ( 36.76%) 4.53 ( 19.23%) 2.29 ( 59.28%)
Amean real-pound_times-5 5.66 ( 0.00%) 4.32 ( 23.79%) 4.75 ( 16.18%) 3.65 ( 35.46%)
Amean real-pound_times-8 5.98 ( 0.00%) 4.97 ( 16.87%) 5.96 ( 0.35%) 3.62 ( 39.40%)
Amean real-pound_times-12 6.58 ( 0.00%) 4.94 ( 24.93%) 6.04 ( 8.32%) 3.63 ( 44.89%)
Amean real-pound_times-21 7.19 ( 0.00%) 5.18 ( 28.01%) 6.68 ( 7.13%) 3.65 ( 49.22%)
Amean real-pound_times-30 7.67 ( 0.00%) 5.27 ( 31.29%) 7.16 ( 6.63%) 3.62 ( 52.76%)
Amean real-pound_times-48 7.88 ( 0.00%) 5.36 ( 31.98%) 7.28 ( 7.58%) 3.65 ( 53.71%)
Amean real-pound_times-79 7.99 ( 0.00%) 5.39 ( 32.52%) 7.40 ( 7.42%) 3.65 ( 54.25%)
Amean real-pound_times-96 8.01 ( 0.00%) 5.35 ( 33.20%) 7.36 ( 8.09%) 3.64 ( 54.49%)
Amean syst-pound_clock_gettime-2 9.22 ( 0.00%) 5.45 ( 40.95%) 7.90 ( 14.32%) 2.34 ( 74.66%)
Amean syst-pound_clock_gettime-5 27.03 ( 0.00%) 24.21 ( 10.40%) 26.24 ( 2.90%) 4.73 ( 82.48%)
Amean syst-pound_clock_gettime-8 48.33 ( 0.00%) 38.40 ( 20.55%) 47.11 ( 2.52%) 6.64 ( 86.25%)
Amean syst-pound_clock_gettime-12 78.93 ( 0.00%) 57.30 ( 27.41%) 71.04 ( 10.00%) 9.69 ( 87.72%)
Amean syst-pound_clock_gettime-21 149.27 ( 0.00%) 105.34 ( 29.43%) 138.19 ( 7.42%) 16.50 ( 88.95%)
Amean syst-pound_clock_gettime-30 178.36 ( 0.00%) 119.83 ( 32.82%) 166.75 ( 6.51%) 18.67 ( 89.53%)
Amean syst-pound_clock_gettime-48 185.77 ( 0.00%) 124.80 ( 32.82%) 171.14 ( 7.88%) 21.12 ( 88.63%)
Amean syst-pound_clock_gettime-79 188.17 ( 0.00%) 126.34 ( 32.86%) 173.99 ( 7.53%) 24.07 ( 87.21%)
Amean syst-pound_clock_gettime-96 190.24 ( 0.00%) 126.63 ( 33.44%) 175.32 ( 7.84%) 26.12 ( 86.27%)
Amean syst-pound_times-2 11.02 ( 0.00%) 6.91 ( 37.27%) 8.85 ( 19.68%) 4.36 ( 60.45%)
Amean syst-pound_times-5 27.99 ( 0.00%) 21.31 ( 23.88%) 23.42 ( 16.32%) 17.95 ( 35.87%)
Amean syst-pound_times-8 47.33 ( 0.00%) 39.27 ( 17.04%) 47.16 ( 0.35%) 28.56 ( 39.66%)
Amean syst-pound_times-12 78.24 ( 0.00%) 58.26 ( 25.55%) 71.55 ( 8.55%) 42.78 ( 45.32%)
Amean syst-pound_times-21 148.75 ( 0.00%) 106.28 ( 28.55%) 138.22 ( 7.08%) 74.25 ( 50.09%)
Amean syst-pound_times-30 177.74 ( 0.00%) 121.16 ( 31.83%) 166.70 ( 6.21%) 81.82 ( 53.96%)
Amean syst-pound_times-48 184.85 ( 0.00%) 125.37 ( 32.18%) 170.87 ( 7.56%) 84.20 ( 54.45%)
Amean syst-pound_times-79 189.50 ( 0.00%) 127.45 ( 32.74%) 175.58 ( 7.34%) 86.01 ( 54.61%)
Amean syst-pound_times-96 190.56 ( 0.00%) 127.11 ( 33.30%) 175.08 ( 8.12%) 86.03 ( 54.85%)
Stddev real-pound_clock_gettime-2 0.25 ( 0.00%) 0.27 ( -7.76%) 0.41 (-65.62%) 0.10 ( 60.73%)
Stddev real-pound_clock_gettime-5 0.07 ( 0.00%) 0.09 (-35.10%) 0.51 (-674.46%) 0.05 ( 26.28%)
Stddev real-pound_clock_gettime-8 0.28 ( 0.00%) 0.02 ( 92.09%) 0.04 ( 86.10%) 0.02 ( 93.65%)
Stddev real-pound_clock_gettime-12 0.08 ( 0.00%) 0.02 ( 78.31%) 0.04 ( 52.02%) 0.02 ( 78.95%)
Stddev real-pound_clock_gettime-21 0.06 ( 0.00%) 0.02 ( 68.54%) 0.11 (-70.01%) 0.01 ( 78.27%)
Stddev real-pound_clock_gettime-30 0.05 ( 0.00%) 0.01 ( 75.00%) 0.10 (-98.93%) 0.04 ( 20.82%)
Stddev real-pound_clock_gettime-48 0.09 ( 0.00%) 0.19 (-106.51%) 0.08 ( 15.24%) 0.04 ( 58.70%)
Stddev real-pound_clock_gettime-79 0.03 ( 0.00%) 0.10 (-191.56%) 0.08 (-138.02%) 0.04 (-21.18%)
Stddev real-pound_clock_gettime-96 0.05 ( 0.00%) 0.08 (-56.69%) 0.07 (-21.04%) 0.04 ( 31.40%)
Stddev real-pound_times-2 0.55 ( 0.00%) 0.25 ( 53.82%) 0.38 ( 30.80%) 0.14 ( 74.19%)
Stddev real-pound_times-5 0.06 ( 0.00%) 0.28 (-358.77%) 0.13 (-108.26%) 0.03 ( 54.64%)
Stddev real-pound_times-8 0.25 ( 0.00%) 0.04 ( 83.52%) 0.06 ( 76.99%) 0.06 ( 76.94%)
Stddev real-pound_times-12 0.09 ( 0.00%) 0.05 ( 41.52%) 0.02 ( 77.55%) 0.04 ( 51.60%)
Stddev real-pound_times-21 0.06 ( 0.00%) 0.15 (-141.91%) 0.11 (-74.22%) 0.03 ( 48.73%)
Stddev real-pound_times-30 0.06 ( 0.00%) 0.14 (-129.04%) 0.10 (-66.59%) 0.04 ( 30.36%)
Stddev real-pound_times-48 0.05 ( 0.00%) 0.13 (-151.20%) 0.07 (-37.30%) 0.02 ( 54.64%)
Stddev real-pound_times-79 0.04 ( 0.00%) 0.11 (-205.48%) 0.07 (-97.82%) 0.03 ( 28.17%)
Stddev real-pound_times-96 0.05 ( 0.00%) 0.05 ( -1.83%) 0.04 ( 24.17%) 0.04 ( 20.00%)
Stddev syst-pound_clock_gettime-2 0.47 ( 0.00%) 0.45 ( 4.96%) 0.79 (-66.33%) 0.18 ( 61.36%)
Stddev syst-pound_clock_gettime-5 0.32 ( 0.00%) 0.39 (-20.09%) 2.49 (-666.63%) 0.25 ( 21.71%)
Stddev syst-pound_clock_gettime-8 2.25 ( 0.00%) 0.26 ( 88.54%) 0.40 ( 82.10%) 0.17 ( 92.55%)
Stddev syst-pound_clock_gettime-12 1.23 ( 0.00%) 0.43 ( 64.59%) 0.73 ( 40.82%) 0.19 ( 84.58%)
Stddev syst-pound_clock_gettime-21 1.15 ( 0.00%) 1.06 ( 7.62%) 2.64 (-129.56%) 0.66 ( 42.45%)
Stddev syst-pound_clock_gettime-30 1.34 ( 0.00%) 1.26 ( 6.25%) 2.69 (-99.81%) 1.58 (-17.86%)
Stddev syst-pound_clock_gettime-48 2.52 ( 0.00%) 4.85 (-92.44%) 2.12 ( 15.94%) 1.08 ( 57.23%)
Stddev syst-pound_clock_gettime-79 1.22 ( 0.00%) 2.51 (-105.82%) 1.99 (-62.56%) 0.96 ( 21.62%)
Stddev syst-pound_clock_gettime-96 1.54 ( 0.00%) 2.21 (-43.34%) 1.74 (-12.67%) 0.80 ( 48.24%)
Stddev syst-pound_times-2 1.09 ( 0.00%) 0.50 ( 53.61%) 0.76 ( 30.43%) 0.28 ( 74.11%)
Stddev syst-pound_times-5 0.30 ( 0.00%) 1.41 (-367.82%) 0.65 (-115.62%) 0.21 ( 29.66%)
Stddev syst-pound_times-8 2.12 ( 0.00%) 0.27 ( 87.24%) 0.71 ( 66.44%) 0.55 ( 73.94%)
Stddev syst-pound_times-12 1.03 ( 0.00%) 0.74 ( 27.70%) 0.37 ( 64.41%) 0.47 ( 54.81%)
Stddev syst-pound_times-21 1.60 ( 0.00%) 3.07 (-92.49%) 2.30 (-43.99%) 0.93 ( 41.93%)
Stddev syst-pound_times-30 1.75 ( 0.00%) 3.05 (-74.55%) 2.84 (-62.67%) 1.17 ( 32.95%)
Stddev syst-pound_times-48 0.79 ( 0.00%) 3.36 (-327.41%) 2.51 (-219.14%) 0.51 ( 34.63%)
Stddev syst-pound_times-79 1.08 ( 0.00%) 2.77 (-156.12%) 1.84 (-70.34%) 0.86 ( 20.82%)
Stddev syst-pound_times-96 1.19 ( 0.00%) 1.35 (-13.61%) 1.01 ( 15.16%) 1.29 ( -8.56%)
CoeffVar real-pound_clock_gettime-2 5.19 ( 0.00%) 8.73 (-68.19%) 9.97 (-92.16%) 7.43 (-43.23%)
CoeffVar real-pound_clock_gettime-5 1.19 ( 0.00%) 1.77 (-48.40%) 9.49 (-696.07%) 4.66 (-291.28%)
CoeffVar real-pound_clock_gettime-8 4.53 ( 0.00%) 0.45 ( 90.14%) 0.64 ( 85.76%) 1.95 ( 56.89%)
CoeffVar real-pound_clock_gettime-12 1.24 ( 0.00%) 0.37 ( 70.42%) 0.66 ( 46.83%) 2.00 (-60.60%)
CoeffVar real-pound_clock_gettime-21 0.88 ( 0.00%) 0.39 ( 55.78%) 1.61 (-82.98%) 1.58 (-79.84%)
CoeffVar real-pound_clock_gettime-30 0.68 ( 0.00%) 0.25 ( 63.35%) 1.44 (-112.50%) 4.49 (-560.29%)
CoeffVar real-pound_clock_gettime-48 1.18 ( 0.00%) 3.61 (-204.68%) 1.09 ( 8.14%) 4.06 (-243.23%)
CoeffVar real-pound_clock_gettime-79 0.43 ( 0.00%) 1.85 (-333.44%) 1.10 (-157.77%) 3.87 (-808.42%)
CoeffVar real-pound_clock_gettime-96 0.68 ( 0.00%) 1.59 (-134.97%) 0.89 (-31.18%) 3.28 (-383.77%)
CoeffVar real-pound_times-2 9.79 ( 0.00%) 7.15 ( 26.98%) 8.39 ( 14.33%) 6.21 ( 36.61%)
CoeffVar real-pound_times-5 1.06 ( 0.00%) 6.39 (-501.98%) 2.64 (-148.46%) 0.75 ( 29.71%)
CoeffVar real-pound_times-8 4.24 ( 0.00%) 0.84 ( 80.17%) 0.98 ( 76.91%) 1.61 ( 61.95%)
CoeffVar real-pound_times-12 1.29 ( 0.00%) 1.01 ( 22.11%) 0.32 ( 75.51%) 1.14 ( 12.18%)
CoeffVar real-pound_times-21 0.87 ( 0.00%) 2.91 (-236.03%) 1.63 (-87.60%) 0.87 ( -0.97%)
CoeffVar real-pound_times-30 0.78 ( 0.00%) 2.62 (-233.35%) 1.40 (-78.41%) 1.16 (-47.41%)
CoeffVar real-pound_times-48 0.65 ( 0.00%) 2.40 (-269.32%) 0.97 (-48.56%) 0.64 ( 2.00%)
CoeffVar real-pound_times-79 0.45 ( 0.00%) 2.03 (-352.70%) 0.96 (-113.68%) 0.71 (-57.00%)
CoeffVar real-pound_times-96 0.61 ( 0.00%) 0.93 (-52.43%) 0.50 ( 17.50%) 1.07 (-75.79%)
CoeffVar syst-pound_clock_gettime-2 5.12 ( 0.00%) 8.25 (-60.95%) 9.95 (-94.12%) 7.81 (-52.47%)
CoeffVar syst-pound_clock_gettime-5 1.20 ( 0.00%) 1.61 (-34.04%) 9.48 (-689.57%) 5.37 (-346.99%)
CoeffVar syst-pound_clock_gettime-8 4.66 ( 0.00%) 0.67 ( 85.58%) 0.86 ( 81.64%) 2.53 ( 45.79%)
CoeffVar syst-pound_clock_gettime-12 1.56 ( 0.00%) 0.76 ( 51.21%) 1.02 ( 34.25%) 1.95 (-25.60%)
CoeffVar syst-pound_clock_gettime-21 0.77 ( 0.00%) 1.01 (-30.89%) 1.91 (-147.96%) 4.01 (-420.63%)
CoeffVar syst-pound_clock_gettime-30 0.75 ( 0.00%) 1.05 (-39.54%) 1.61 (-113.72%) 8.48 (-1026.12%)
CoeffVar syst-pound_clock_gettime-48 1.36 ( 0.00%) 3.89 (-186.46%) 1.24 ( 8.75%) 5.11 (-276.18%)
CoeffVar syst-pound_clock_gettime-79 0.65 ( 0.00%) 1.99 (-206.55%) 1.14 (-75.81%) 3.98 (-512.73%)
CoeffVar syst-pound_clock_gettime-96 0.81 ( 0.00%) 1.74 (-115.35%) 0.99 (-22.26%) 3.05 (-277.01%)
CoeffVar syst-pound_times-2 9.86 ( 0.00%) 7.29 ( 26.04%) 8.54 ( 13.39%) 6.45 ( 34.55%)
CoeffVar syst-pound_times-5 1.08 ( 0.00%) 6.62 (-514.60%) 2.78 (-157.67%) 1.18 ( -9.68%)
CoeffVar syst-pound_times-8 4.48 ( 0.00%) 0.69 ( 84.62%) 1.51 ( 66.32%) 1.94 ( 56.81%)
CoeffVar syst-pound_times-12 1.32 ( 0.00%) 1.28 ( 2.89%) 0.51 ( 61.08%) 1.09 ( 17.35%)
CoeffVar syst-pound_times-21 1.07 ( 0.00%) 2.89 (-169.42%) 1.66 (-54.96%) 1.25 (-16.34%)
CoeffVar syst-pound_times-30 0.98 ( 0.00%) 2.52 (-156.06%) 1.71 (-73.43%) 1.43 (-45.64%)
CoeffVar syst-pound_times-48 0.43 ( 0.00%) 2.68 (-530.20%) 1.47 (-245.25%) 0.61 (-43.50%)
CoeffVar syst-pound_times-79 0.57 ( 0.00%) 2.17 (-280.81%) 1.05 (-83.84%) 1.00 (-74.46%)
CoeffVar syst-pound_times-96 0.63 ( 0.00%) 1.07 (-70.33%) 0.58 ( 7.66%) 1.50 (-140.44%)
Max real-pound_clock_gettime-2 5.10 ( 0.00%) 3.56 ( 30.20%) 4.98 ( 2.35%) 1.47 ( 71.18%)
Max real-pound_clock_gettime-5 5.59 ( 0.00%) 5.10 ( 8.77%) 6.00 ( -7.33%) 1.17 ( 79.07%)
Max real-pound_clock_gettime-8 6.82 ( 0.00%) 4.95 ( 27.42%) 6.02 ( 11.73%) 0.93 ( 86.36%)
Max real-pound_clock_gettime-12 6.82 ( 0.00%) 4.93 ( 27.71%) 6.13 ( 10.12%) 0.90 ( 86.80%)
Max real-pound_clock_gettime-21 7.33 ( 0.00%) 5.17 ( 29.47%) 7.01 ( 4.37%) 0.89 ( 87.86%)
Max real-pound_clock_gettime-30 7.71 ( 0.00%) 5.24 ( 32.04%) 7.38 ( 4.28%) 1.00 ( 87.03%)
Max real-pound_clock_gettime-48 8.11 ( 0.00%) 5.86 ( 27.74%) 7.47 ( 7.89%) 1.05 ( 87.05%)
Max real-pound_clock_gettime-79 8.03 ( 0.00%) 5.53 ( 31.13%) 7.48 ( 6.85%) 1.13 ( 85.93%)
Max real-pound_clock_gettime-96 8.05 ( 0.00%) 5.55 ( 31.06%) 7.51 ( 6.71%) 1.21 ( 84.97%)
Max real-pound_times-2 6.66 ( 0.00%) 3.89 ( 41.59%) 5.23 ( 21.47%) 2.56 ( 61.56%)
Max real-pound_times-5 5.77 ( 0.00%) 4.96 ( 14.04%) 5.01 ( 13.17%) 3.69 ( 36.05%)
Max real-pound_times-8 6.42 ( 0.00%) 5.04 ( 21.50%) 6.02 ( 6.23%) 3.72 ( 42.06%)
Max real-pound_times-12 6.69 ( 0.00%) 5.07 ( 24.22%) 6.07 ( 9.27%) 3.67 ( 45.14%)
Max real-pound_times-21 7.32 ( 0.00%) 5.63 ( 23.09%) 7.00 ( 4.37%) 3.68 ( 49.73%)
Max real-pound_times-30 7.78 ( 0.00%) 5.68 ( 26.99%) 7.36 ( 5.40%) 3.66 ( 52.96%)
Max real-pound_times-48 7.98 ( 0.00%) 5.58 ( 30.08%) 7.41 ( 7.14%) 3.68 ( 53.88%)
Max real-pound_times-79 8.05 ( 0.00%) 5.61 ( 30.31%) 7.53 ( 6.46%) 3.69 ( 54.16%)
Max real-pound_times-96 8.08 ( 0.00%) 5.42 ( 32.92%) 7.42 ( 8.17%) 3.71 ( 54.08%)
Max syst-pound_clock_gettime-2 9.91 ( 0.00%) 6.30 ( 36.43%) 9.64 ( 2.72%) 2.68 ( 72.96%)
Max syst-pound_clock_gettime-5 27.53 ( 0.00%) 24.74 ( 10.13%) 29.35 ( -6.61%) 5.43 ( 80.28%)
Max syst-pound_clock_gettime-8 53.96 ( 0.00%) 38.82 ( 28.06%) 47.75 ( 11.51%) 6.99 ( 87.05%)
Max syst-pound_clock_gettime-12 81.09 ( 0.00%) 57.99 ( 28.49%) 71.93 ( 11.30%) 10.04 ( 87.62%)
Max syst-pound_clock_gettime-21 151.50 ( 0.00%) 107.03 ( 29.35%) 145.33 ( 4.07%) 17.48 ( 88.46%)
Max syst-pound_clock_gettime-30 179.94 ( 0.00%) 121.68 ( 32.38%) 172.10 ( 4.36%) 21.29 ( 88.17%)
Max syst-pound_clock_gettime-48 191.29 ( 0.00%) 136.82 ( 28.48%) 174.84 ( 8.60%) 23.80 ( 87.56%)
Max syst-pound_clock_gettime-79 190.22 ( 0.00%) 130.28 ( 31.51%) 177.26 ( 6.81%) 25.71 ( 86.48%)
Max syst-pound_clock_gettime-96 192.02 ( 0.00%) 132.27 ( 31.12%) 178.26 ( 7.17%) 27.66 ( 85.60%)
Max syst-pound_times-2 13.10 ( 0.00%) 7.57 ( 42.21%) 10.21 ( 22.06%) 4.89 ( 62.67%)
Max syst-pound_times-5 28.56 ( 0.00%) 24.55 ( 14.04%) 24.80 ( 13.17%) 18.20 ( 36.27%)
Max syst-pound_times-8 50.89 ( 0.00%) 39.54 ( 22.30%) 47.78 ( 6.11%) 29.45 ( 42.13%)
Max syst-pound_times-12 79.85 ( 0.00%) 59.80 ( 25.11%) 72.21 ( 9.57%) 43.27 ( 45.81%)
Max syst-pound_times-21 151.33 ( 0.00%) 115.02 ( 23.99%) 144.60 ( 4.45%) 75.85 ( 49.88%)
Max syst-pound_times-30 180.79 ( 0.00%) 130.12 ( 28.03%) 171.98 ( 4.87%) 83.31 ( 53.92%)
Max syst-pound_times-48 186.61 ( 0.00%) 130.89 ( 29.86%) 174.40 ( 6.54%) 84.85 ( 54.53%)
Max syst-pound_times-79 190.96 ( 0.00%) 133.09 ( 30.30%) 179.58 ( 5.96%) 87.17 ( 54.35%)
Max syst-pound_times-96 192.42 ( 0.00%) 128.95 ( 32.99%) 177.09 ( 7.97%) 87.82 ( 54.36%)

vanilla-4.7 revert prefetch mask
User 54.91 73.30 56.08 47.56
System 21115.14 14616.16 19553.36 6360.52
Elapsed 1247.71 890.24 1149.26 409.20

vanilla-4.7 revert prefetch mask
Minor Faults 291321 267632 324632 274236
Major Faults 196 272 279 279
Swap Ins 0 0 0 0
Swap Outs 0 0 0 0
Allocation stalls 0 0 0 0
DMA allocs 0 0 0 0
DMA32 allocs 12836 11773 23439 21745
Normal allocs 252492 245667 302327 270404
Movable allocs 0 0 0 0
Direct pages scanned 0 0 0 0
Kswapd pages scanned 0 0 0 0
Kswapd pages reclaimed 0 0 0 0
Direct pages reclaimed 0 0 0 0
Kswapd efficiency 100% 100% 100% 100%
Kswapd velocity 0.000 0.000 0.000 0.000
Direct efficiency 100% 100% 100% 100%
Direct velocity 0.000 0.000 0.000 0.000
Percentage direct scans 0% 0% 0% 0%
Zone normal velocity 0.000 0.000 0.000 0.000
Zone dma32 velocity 0.000 0.000 0.000 0.000
Zone dma velocity 0.000 0.000 0.000 0.000
Page writes by reclaim 0.000 0.000 0.000 0.000
Page writes file 0 0 0 0
Page writes anon 0 0 0 0
Page reclaim immediate 0 0 0 0
Sector Reads 24440 38464 144944 143876
Sector Writes 569300 12712 16036 6956
Page rescued immediate 0 0 0 0
Slabs scanned 0 0 0 0
Direct inode steals 0 0 0 0
Kswapd inode steals 0 0 0 0
Kswapd skipped wait 0 0 0 0
THP fault alloc 0 0 0 0
THP collapse alloc 0 0 0 0
THP splits 0 0 0 0
THP fault fallback 0 0 0 0
THP collapse fail 0 0 0 0
Compaction stalls 0 0 0 0
Compaction success 0 0 0 0
Compaction failures 0 0 0 0
Page migrate success 11177 10858 14598 9857
Page migrate failure 0 2 1 1
Compaction pages isolated 0 0 0 0
Compaction migrate scanned 0 0 0 0
Compaction free scanned 0 0 0 0
Compaction cost 11 11 15 10
NUMA alloc hit 237281 229068 296261 263464
NUMA alloc miss 7 5 5 6
NUMA interleave hit 0 0 0 0
NUMA alloc local 237281 229068 296261 263464
NUMA base PTE updates 25433 20398 35883 22264
NUMA huge PMD updates 0 0 0 0
NUMA page range updates 25433 20398 35883 22264
NUMA hint faults 23242 18097 31026 17002
NUMA hint local faults 10012 6038 14657 6903
NUMA hint local percent 43 33 47 40
NUMA pages migrated 11177 10858 14598 9857
AutoNUMA cost 116% 90% 155% 85%