Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels

From: Ingo Molnar
Date: Mon Nov 07 2011 - 13:01:45 EST



* Vince Weaver <vince@xxxxxxxxxx> wrote:

> On Mon, 7 Nov 2011, Pekka Enberg wrote:
>
> > I've never heard ABI incompatibility used as an argument for
> > perf. Ingo?

Correct, the ABI has been designed in a way to make it really hard to
break the ABI via either directed backports or other mess-ups.

The ABI is both backwards *and* forwards ABI compatible, which is
very rare amongst Linux ABIs.

For frequently used tools, such as perf, there's no ABI compatibility
problem in practice: using newer perf on older kernels is pretty
common. Using older perf on new kernels is rarer, but that generally
works too.
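
As a rough illustration of how a size-versioned struct can stay
compatible in both directions, here is a toy model in Python. It is
loosely inspired by the 'size' field of perf_event_attr, but it is
not kernel code: the constants, the AbiError exception and the
copy_attr() helper are hypothetical names made up for this sketch.

```python
import struct

# Toy model only: these sizes are invented for illustration.
KERNEL_KNOWN_SIZE = 16  # bytes this hypothetical "kernel" understands
MIN_SIZE = 8            # size of the first ABI version it accepts


class AbiError(Exception):
    """The caller's struct cannot be interpreted safely."""


def copy_attr(user_buf: bytes) -> bytes:
    """Normalise a caller-supplied struct to the size this kernel knows."""
    size = len(user_buf)
    if size < MIN_SIZE:
        raise AbiError("struct smaller than the first ABI version")
    if size > KERNEL_KNOWN_SIZE:
        # Newer tool on an older kernel: fine as long as the fields the
        # kernel does not know about are all zero (i.e. unused).
        if any(user_buf[KERNEL_KNOWN_SIZE:]):
            raise AbiError("unknown fields are set")
        return user_buf[:KERNEL_KNOWN_SIZE]
    # Older tool on a newer kernel: missing fields default to zero.
    return user_buf + bytes(KERNEL_KNOWN_SIZE - size)


old_tool = struct.pack("<II", 1, 2)          # 8-byte, first-version struct
padded = copy_attr(old_tool)                 # zero-extended to 16 bytes
```

The point of the sketch is that neither side needs to know the
other's version number: the struct size plus "unknown fields must be
zero" is enough to decide whether a request can be honoured.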

In hindsight being in the kernel repo made it *easier* for perf to
implement a good, stable ABI while also keeping a very high rate of
change of the subsystem: changes are more 'concentrated' and people
can stay focused on the ball to extend the ABI in sensible ways
instead of struggling with project boundary artifacts.

I think we needed to do only one revert along the way in the past two
years, to fix an unintended ABI breakage in PowerTop. Considering the
total complexity of the perf ABI our compatibility track record is
*very* good.

> Never overtly. They're too clever for that.

Pekka, Vince has meanwhile become the resident perf critic on lkml,
always in on it when it comes to some perf-bashing:

> In any case, as a primary developer of a library (PAPI) that uses
> the perf_events ABI I have to say that having perf in the kernel
> has been a *major* pain for us.

... and you have argued against perf from the very first day on, when
you were one of the perfmon developers - and IMO in hindsight you've
been repeatedly wrong about most of your design arguments.

> Unlike the perf developers, we *do* have to maintain backwards
> compatibility. [...]

We do too, i use new perf on older distro kernels all the time. If
you see a breakage of functionality that tools use, then please
report it in a timely fashion.

> [...] And we have a lot of nasty code in PAPI to handle this.
> Entirely because the perf_events ABI is not stable. It's mostly
> stable, but there are enough regressions to be a pain.

You are blaming the wrong guys really.

The PAPI project has the (fundamental) problem that you are still
doing it in the old-style sw design fashion, with many months long
delays in testing, and then you are blaming the problems you
inevitably meet with that model on *us*.

There was one PAPI incident i remember where it took you several
*months* to report a regression in a regular PAPI test-case (no
actual app affected as far as i know). No other tester ever ran the
PAPI testcases so nobody else reported it.

Moving perf out of the kernel would make that particular situation
*worse*, by further increasing the latency of fixes and by further
increasing the risk of breakages.

Sorry, but you are trying to "fix" perf by dragging it down to your
bad level of design and we will understandably resist that ...

> It's problem enough that there's no way to know what version of the
> perf_event abi you are running against and we have to guess based
> on kernel version. This gets "fun" because all of the vendors have
> backported seemingly random chunks of perf_event code to their
> older kernels.

The ABI design allows for that kind of flexible extensibility, and
it's one of its major advantages.

What we *cannot* protect against is you relying on obscure details of
the ABI without adding it to 'perf test' and then not testing the
upstream kernel in a timely enough fashion either ...

Nobody but you tests PAPI so you need to become *part* of the
upstream development process, which releases a new upstream kernel
every 3 months.

> And it often does seem as if the perf developers don't care when
> something breaks in perf_events if it doesn't affect perf users.

I have to reject your slander: Peter, Arnaldo and i all care deeply
about fixing regressions and i've personally applied fixes out of
order that addressed some sort of PAPI problem - whenever you chose
to report them.

Vince, you are wrong and you have also become somewhat malicious in
your arguments - please stop it.

> For example, the new NMI watchdog severely breaks perf_event event
> allocation if you are using FORMAT_GROUP. perf doesn't use this
> though, so none of the kernel developers seem to care. And unless
> I can quickly come up with a patch as an outsider, a few kernel
> versions will go by and the kernel devs will declare "well it was
> broken so long, now we don't have to fix it". Fun.

Face it, the *real* problem is that beyond yourself very few people
who use a new kernel use PAPI and your long latency of testing
exposes you to breakages in a much more agile subsystem such as perf.
Please fix that instead of blaming it on others.

Also, as i mentioned several times before, you are free to add an
arbitrary number of ABI test-cases to 'perf test' and we promise to
run them. Right now it consists of a few tests:

$ perf test
1: vmlinux symtab matches kallsyms: Ok
2: detect open syscall event: Ok
3: detect open syscall event on all cpus: Ok
4: read samples using the mmap interface: Ok

... but we do not object to adding testcases for functionality used
by PAPI.
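
To give a feel for what such an ABI test-case could pin down, here is
a small sketch in Python (not an actual 'perf test' entry) that checks
the buffer layout returned by read() on a group leader opened with
PERF_FORMAT_GROUP. The flag values and layout follow the
perf_event_open(2) man page; parse_group_read() is a hypothetical
helper name, and the sample buffer is hand-built rather than read from
a real event.

```python
import struct

# read_format bits as documented in perf_event_open(2).
PERF_FORMAT_TOTAL_TIME_ENABLED = 1 << 0
PERF_FORMAT_TOTAL_TIME_RUNNING = 1 << 1
PERF_FORMAT_ID = 1 << 2
PERF_FORMAT_GROUP = 1 << 3


def parse_group_read(buf: bytes, read_format: int) -> dict:
    """Decode a PERF_FORMAT_GROUP read() buffer into a dict."""
    if not read_format & PERF_FORMAT_GROUP:
        raise ValueError("read_format lacks PERF_FORMAT_GROUP")
    if len(buf) % 8 or len(buf) < 8:
        raise ValueError("buffer is not a whole number of u64 words")
    words = list(struct.unpack("=%dQ" % (len(buf) // 8), buf))

    result = {"nr": words.pop(0)}
    if read_format & PERF_FORMAT_TOTAL_TIME_ENABLED:
        result["time_enabled"] = words.pop(0)
    if read_format & PERF_FORMAT_TOTAL_TIME_RUNNING:
        result["time_running"] = words.pop(0)

    # Each group member contributes a value, plus an id if requested.
    per_event = 2 if read_format & PERF_FORMAT_ID else 1
    if len(words) != result["nr"] * per_event:
        raise ValueError("buffer length does not match nr")
    result["values"] = [
        tuple(words[i:i + per_event])
        for i in range(0, len(words), per_event)
    ]
    return result


# Hand-built sample: two events, both time fields, ids present.
flags = (PERF_FORMAT_GROUP | PERF_FORMAT_ID |
         PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING)
sample = struct.pack("=7Q", 2, 100, 90, 10, 1, 20, 2)
decoded = parse_group_read(sample, flags)
```

A real test-case would obtain the buffer from an actual event group
and fail loudly if the layout ever stopped matching.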

The usual ABI rules also apply: we'll revert everything that breaks
the ABI - but for that you need to report it *in time*, not one day
before the next -stable release like you did last time around ...

So there are several ways you could help push your own interests
into the kernel project.

Thanks,

Ingo