Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT

From: Ingo Molnar
Date: Tue Jan 20 2009 - 10:01:31 EST

* Nick Piggin <npiggin@xxxxxxx> wrote:

> On Tue, Jan 20, 2009 at 03:17:35PM +0100, Ingo Molnar wrote:
> >
> > * Nick Piggin <npiggin@xxxxxxx> wrote:
> > >
> > > BTW. the lmbench test I run directly (it's called lat_mmap.c, and gets
> > > compiled into a standalone lat_mmap exec by the standard lmbench build).
> >
> > doesn't that include an indeterminate number of gettimeofday() based
> > calibration calls? That would make it harder to measure its total costs in
> > a comparative way.
>
> Hmm... yes probably for really detailed profile comparisons or other
> external measurements it would need modification.


Btw., it's a trend to be aware of, I think: as our commit flux goes up and
the average commit size goes down, it becomes harder and harder to measure
the per-commit performance impact.

There are just 3 ways to handle it: to decrease commit flux (which is out
of the question), to increase commit size (which is out of the question as
well), or to improve the quality of our measurements.

We can improve performance measurement quality in a number of ways:

 - We can (and should) increase instrumentation precision

   /usr/bin/time's 10 msec measurement granularity might have been
   fine a decade ago but it is not fine today.

 - We can (and should) increase the number of 'dimensions' (metrics) we
   can instrument the kernel with.

   Right now we basically only measure along the time axis, in 99% of
   the cases. But 'elapsed time' is a tricky, compound and thus noisy
   unit: it is affected by all delays in a workload. We do profiles
   occasionally, but they are a lot more difficult to generate and a lot
   harder to compare and are hard to be plugged into regression

So if we see a statistically significant shift in one or more metrics of
something like:

| $ ./timec -e -5,-4,-3,0,1,2,3 make -j16 bzImage
| [...]
| Kernel: arch/x86/boot/bzImage is ready (#28)
|
| Performance counter stats for 'make':
|
|   628315.871980  task clock ticks  (msecs)
|
|           42330  CPU migrations    (events)
|          124980  context switches  (events)
|        18698292  pagefaults        (events)
|   1351875946010  CPU cycles        (events)
|   1121901478363  instructions      (events)
|     10654788968  cache references  (events)
|       633581867  cache misses      (events)
|
| Wall-clock time elapsed: 118348.109066 msecs

it becomes a _lot_ harder to ignore (and talk out of existence) than it is to
ignore a few minor digits changing in:

| $ time make -j16 bzImage
| real 0m12.146s
| user 1m30.050s
| sys 0m12.757s

( Especially as those minor digits tend to be rather noisy to begin with,
due to us sampling system/user time from the timer interrupt. )
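
As a rough illustration of what the extra dimensions buy us, the raw
counters from the timec listing can be reduced to a few ratios that are
easier to compare between two kernels than a single elapsed-time figure.
This is only a sketch: the dictionary keys and the derived_metrics() name
are made up for the example, the numbers are the ones printed above, and
the "(msecs)" unit of the task clock is taken at face value.

```python
# Counter values copied from the timec listing; the key names are
# invented for this sketch and are not part of any tool's output.
counters = {
    "task_clock_msecs": 628315.871980,
    "cpu_migrations": 42330,
    "context_switches": 124980,
    "pagefaults": 18698292,
    "cycles": 1351875946010,
    "instructions": 1121901478363,
    "cache_references": 10654788968,
    "cache_misses": 633581867,
    "wall_clock_msecs": 118348.109066,
}


def derived_metrics(c):
    """Reduce raw counters to ratios that can be compared across runs."""
    return {
        # Instructions retired per CPU cycle.
        "instructions_per_cycle": c["instructions"] / c["cycles"],
        # Fraction of cache references that missed.
        "cache_miss_ratio": c["cache_misses"] / c["cache_references"],
        # Task clock over wall clock: average number of CPUs kept busy.
        "cpus_kept_busy": c["task_clock_msecs"] / c["wall_clock_msecs"],
        # Pagefaults per msec of task clock.
        "pagefaults_per_msec": c["pagefaults"] / c["task_clock_msecs"],
    }


metrics = derived_metrics(counters)
for name, value in metrics.items():
    print(f"{name:24s} {value:12.4f}")
```

A shift in one of these ratios points at a specific kind of cost (cache
behaviour, scheduling, faults), which a changed 'real' or 'sys' figure
alone cannot do.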

It becomes even harder to ignore statistically significant regressions if
some of the metrics are hardware-generated hard physical facts - not
something as wishy-washy and statistical as stime/utime statistics.
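
One way to put a number on "statistically significant" is to repeat the
run a few times on each kernel and compare one metric with Welch's
t-test. This is a minimal sketch of that idea, not something timec does:
welch_t() is a hypothetical helper, and the two sample lists are made-up
placeholder values, not measurements.

```python
import math
import statistics


def welch_t(before, after):
    """Return Welch's t statistic and approximate degrees of freedom.

    Both samples need at least two values and must not both be constant,
    otherwise the standard error below is zero.
    """
    n1, n2 = len(before), len(after)
    m1, m2 = statistics.mean(before), statistics.mean(after)
    v1, v2 = statistics.variance(before), statistics.variance(after)
    se2 = v1 / n1 + v2 / n2  # squared standard error of the difference
    if se2 == 0:
        raise ValueError("both samples are constant; t is undefined")
    t = (m2 - m1) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Placeholder samples of one metric from five runs per kernel
# (illustrative numbers only).
before = [100.0, 101.0, 99.0, 100.5, 99.5]
after = [103.0, 102.5, 103.5, 102.0, 104.0]

t, df = welch_t(before, after)
print(f"t = {t:.3f}, df = {df:.1f}")
```

With 8 degrees of freedom the two-sided 5% critical value is about 2.306,
so a |t| well above that is hard to wave away as noise, provided the runs
are independent and the metric is roughly normally distributed.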

</plug> ;-)
