Re: kvmtool tree (Was: Re: [patch] config: fix make kvmconfig)

From: Ingo Molnar
Date: Mon Feb 11 2013 - 07:27:11 EST

Next message: tip-bot for Sasha Levin: "[tip:core/locking] liblockdep: Add rbtree support"
Previous message: tip-bot for Sasha Levin: "[tip:core/locking] liblockdep: Correct the ABCDBCDA test"
In reply to: Ingo Molnar: "Re: kvmtool tree (Was: Re: [patch] config: fix make kvmconfig)"
Next in thread: Ingo Molnar: "Re: kvmtool tree (Was: Re: [patch] config: fix make kvmconfig)"

* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Sun, Feb 10, 2013 at 6:39 AM, Pekka Enberg <penberg@xxxxxxxxxx> wrote:
> >
> > The main argument for merging into the main kernel
> > repository has always been that (we think) it improves the
> > kernel because significant amount of development is directly
> > linked to kernel code (think KVM ARM port here, for
> > example). The secondary argument has been to make it easy
> > for kernel developers to work on both userspace and kernel
> > in tandem (like has happened with vhost drivers). In short:
> > it speeds up development of Linux virtualization code.
>
> Why? You've made this statement over and over and over again,
> and I've dismissed it over and over and over again because I
> simply don't think it's true.
>
> It's simply a statement with nothing to back it up. Why repeat
> it?
>
> THAT is my main contention. I told you why I think it's
> actually actively untrue. You claim it helps, but what is it
> about kvmtool that makes it so magically helpful to be inside
> the kernel repository? What is it about this that makes it so
> critical that you get the kernel and kvmtool with a single
> pull, and they have to be in sync? [...]

If you are asking whether it is critical for the kernel project
to have tools/kvm/ integrated then it isn't. The kernel will
live just fine without it, even if that decision is a mistake.

[ In hindsight not taking the GGI code 15+ years ago was IMO a
(bad) mistake - yet we lived. ]

I think it's actively *useful* to the kernel project to have
tools/kvm/ - because we already reaped some benefits and have
the commit IDs to prove it.

If you are asking why it is helpful to the tools/kvm project to
be part of the kernel repository then there's plenty of (good)
reasons as well. (And because it's the much smaller project, the
benefits are much more significant to it than benefits are to
the Linux kernel project, relatively. You'll find that to be
true with just about any code.)

Is any one of those reasons why it's good for tools/kvm/ to be in
the kernel repo critical? I think the *combination* is
definitely critical. It's very much possible for each factor to
seem 'small' in isolation but for the combination to be
significant - denying that would be a fallacy of composition.

Let me list them in case there's anything new that was not said
before. Some of the advantages are social, some are technical:

1) 'tooling and kernel side support goes hand in hand'

I can best describe this from the tools/perf/ perspective:
reviewing new kernel side features that have tooling impact is a
*LOT* easier and a lot faster if it comes with readable,
functional tooling patches.

There are no ifs and buts about it, and that alone makes
tools/perf/ worth it to such a degree that we imposed a
maintenance rule so that kernel side features always need to
come with enabling tooling support.

With tools/kvm/ I saw similar effects as well - on a smaller
scale, because due to not being upstream tools/kvm/ cannot
realistically improve upon ABIs nearly as well as tools/perf/
can. Those effects will strengthen as the project grows.

For tools/kvm/ this property is optional, so unlike tools/perf/
you don't see it for every activity there - but there were
several examples of that despite its optionality.

2) 'code reuse'

We utilize useful kernel code directly in user-space. It starts
out ad-hoc and messy (and I still like Al Viro's description of
that process back from the tools/perf/ flamewars).

We have a tools/kvm/ example of that process in action: for
example an upcoming v3.9 feature, the user-space lockdep utility
enabled via tools/lib/lockdep/. (Although now you might NAK
that, I don't really understand your underlying position here.)

I am pretty confident in saying that the new liblockdep and the
'lockdep' utility (which checks pthread_mutex and pthread_rwlock
locking in user-space - on existing binaries, using LD_PRELOAD),
despite having been talked about for years, would simply not
have happened without tools/kvm/ present in a kernel repo, full
stop.

Not this year, not next year, probably not this decade. The
reason is that the code needed several unlikely constellations
to coincide:

- tools/kvm attracted a capable contributor who never wrote
kernel code before but who was interested in user-space
coding and in virtualization code.

- this person, over the past 2 years, learned the ropes and
gradually started writing kernel code as well.

- he also learned how to make tooling interact with the kernel
proper. First the messy way, then in gradually less messy
ways.

- tools/kvm/ uses a user-space equivalent of kernel locking
primitives, such as mutex_lock()/mutex_unlock(), so his
experience with tools/kvm/ locking helped him kick-start
into looking at kernel-side locking.

- he got to the level where he would understand lockdep.c,
a pretty non-trivial piece of kernel code.

- he ended up gradually validating whether lockdep could be
ported to user-space. He first used 'messy' integration:
kernel/lockdep.c hacked up badly and linked directly into
a user-space app. Then he did 'clean' integration: some
modifications to kernel/lockdep.c enabled it to be
librarified, and then the remaining work was done in
user-space - here too in successive steps.

- tools/kvm/ happened to be hosted in the same kernel repo
that the locking tree is hosted in.

The end result is something good that I never saw happen to
kernel code before, in the last 20 years of the Linux kernel.
Maybe it could have happened with an outside tools/kvm repo, but
I very strongly suspect that it would not.

In theory this could have been done in the cold, fragmented,
isolated and desolate landscape of Linux user-space utilities,
by copying kernel/lockdep.c and a handful of kernel headers to
user-space, and making it work there somehow.

Just like a blue rose could in theory grow in Antarctica as
well, given the right set of circumstances. It just so happens
that blue roses best grow in Holland, where there's good support
infrastructure for growing green stuff, while you'd have to look
hard to find any green stuff at all in Antarctica.

Now is user-space lockdep something fundamental and important?

I think it's not critical in terms of technology (any of us can
only do small code changes really), but having a new breed of
contributors who are good at both kernel and user-space coding,
and who do that as part of a single contribution community, is
both refreshing and potentially important.

[ Obviously I'm seeing similar goodness in tools/perf/ as well,
and forcing it to split off from the kernel repo would be a
sad step backwards. ]
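
To make concrete what a lock-order validator of this kind checks, here
is a minimal sketch in Python. It is not the liblockdep implementation
and shares no code with kernel/lockdep.c; the class and function names
(LockOrderChecker, run_pairs) are made up for illustration. It only
records "lock B was taken while lock A was held" edges and reports an
acquisition that would close a cycle, which is enough to flag an ABBA
inversion without any deadlock actually occurring:

```python
from collections import defaultdict


class LockOrderChecker:
    """Toy lock-order validator (hypothetical, for illustration only)."""

    def __init__(self):
        self.held = []                 # locks currently held, in acquisition order
        self.after = defaultdict(set)  # after[a]: locks ever taken while a was held
        self.violations = []           # (held_lock, new_lock) pairs that close a cycle

    def _reachable(self, src, dst):
        """Return True if dst can be reached from src along recorded edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after[node])
        return False

    def acquire(self, name):
        for h in self.held:
            # Taking `name` while holding `h` records the edge h -> name.
            # If `name` already leads back to `h`, the ordering is inconsistent.
            if self._reachable(name, h):
                self.violations.append((h, name))
            self.after[h].add(name)
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


def run_pairs(pairs):
    """Take each (outer, inner) pair nested, one pair after another."""
    checker = LockOrderChecker()
    for outer, inner in pairs:
        checker.acquire(outer)
        checker.acquire(inner)
        checker.release(inner)
        checker.release(outer)
    return checker.violations


# A then B, later B then A: the classic ABBA inversion.
print(run_pairs([("A", "B"), ("B", "A")]))  # [('B', 'A')]
```

The same sketch also flags longer chains, for example the sequence
AB, CD, BC, DA, where no two pairs conflict directly but the four
together form a cycle.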

3) 'trust, distribution, testing, ease of use'

I personally tend to install a single Git tree on a test machine
when testing the kernel: a single kernel repo. I keep that one
updated, it's the only variable factor on that box - I don't
change /etc/ if I can avoid it and I don't install packages and
don't build utilities from source.

Any utility I rely on either comes with the kernel proper, or is
already installed on the box (potentially 5 years old) - or does
not get updated (or used much). Yes, I could clone utility Git
repositories - but there's a barrier of usage due to several
factors:

- I'd have to figure out which Git repo to pull and whether to
trust it. I know I can generally trust the kernel repo so I
don't mind about doing a 'make install' there as root.

- I'd have to make sure that the Git repo is really the latest
and current one of that utility. If I really only need that
utility marginally, why should I bother?

- With an in-tree utility I know how to build and install it,
because it follows similar principles. An outside one I'd
have to figure out first.

- With an in-tree utility I know how to fix and enhance it,
should I feel the need, by using the established kernel
community contribution infrastructure.

- Several of my test boxes have old distros for compatibility
testing, where package updates and install don't work anymore
because all the URIs broke already, years ago. So installing
from source is the only option to get a recent utility.

The kernel repo gives me a single reference of 'trusted and up
to date' stuff I need for kernel development. I only have to
update it once and I know it's all up to date and relevant.

If you look at any of these factors in isolation it feels small
and borderline. In combination it's compelling to me.

Could I install a utility via distro packaging or via pulling
another Git tree? Possibly, but see the barriers above.

4) 'We get maintenance culture imposed'

The kernel project basically offers a template and an
enforcement mechanism. It is a very capable incubator for
smaller projects, and I think that's a very good and useful
thing.

I'm not aware of any similar incubators - the utility landscape
is sadly very fragmented, with no meta project that holds it
together, and we are hurting from that.

Could an outside project enforce the same maintenance culture?
Only if the maintainer is very good and is doing it for the
whole life-time of the project - and even then it would be done
at an increased cost - right now we can just piggyback on the
existing kernel project contribution quality rules.

In practice I've seen plenty of projects that started out good
and then years down the road entropy ate their quality.

Too much freedom to mess up and all that - sharing
infrastructure by related projects is good in most cases, why do
we have to *insist* that projects live separately and in isolation?

5) 'We get to be a (minor) part of a larger, already established
community.'

Barriers of entry and barriers of progress are much lower within
a single project.

Furthermore, if you are a contributor who *disagrees* with the
concept of a cold, fragmented, inefficient and unproductive
Linux utilities landscape that lacks a meta project framework to
insert sanity then it's only natural to desire to be part of a
sane project and not create yet another new, isolated project.

[ As the leader of the larger project you are obviously fully
within your rights to reject community membership, if you feel
the code is harmful or just not useful enough. ]

> [...] When you then at the same time claim that you make very
> sure that they don't have to be in sync at all. See your
> earlier emails about how you claim to have worked very hard to
> make sure they work across different versions.

I don't think there's any contradiction, the two concepts are
not exclusive, it's similar to tools/perf/:

It's *very* useful to have integration, in terms of improving
the various conditions for contribution and in terms of enabling
code to flow efficiently both into the kernel and into tooling.

But it's not *required*, we obviously want ABI compatibility,
want older versions to still work, etc.

So suggesting that there's a contradiction is a false dichotomy.

> So you make these unsubstantiated claims about how much easier
> it is, and they make no sense. You never explain *why* it's so
> magically easier. Is git so hard to use that you can't do "git
> pull" twice? And why would you normally even *want* to do git
> pull twice? 99% of the work in the kernel has nothing
> what-so-ever to do with kvmtool, and hopefully the reverse is
> equally true.

The target user base of tools/kvm/ is developers. If my personal
experience as a tester/user of utilities in a heterogeneous test
environment matters to you:

I think the only non-kernel Git repo I ever pulled to a test box
was the Git repo - and that was not voluntary, a 5-year-old Git
binary broke on the test box so I had to rebuild it.

I don't pull them because I have had bad experiences with most of
them: they create /etc footprint that might interact with the
validity of my ongoing testing (I try to keep installations
pristine), quite a few of them simply don't compile on older
systems, and they are also rather dissimilar in terms of how to
build, install & run them. (I also find it a bit sisyphean to
put effort into a utilities model that I don't think works very
well.)

> And tying into the kernel just creates this myopic world of
> only looking at the current kernel. What if somebody decides
> that they actually want to try to boot Windows with kvmtool?

IIRC Windows support for kvmtool is work in progress - some
patches already got applied.

Is Windows support a no-no for the Linux kernel repo?

> What if somebody tells you that they are really tired of Xen,
> and actually want to turn kvmtool into *replacement* for Xen
> instead? [...]

Actually, this was raised by some people - and I think some
generalization patches were applied already but Pekka might know
more about that ...

> [...] What if somebody wants to branch off their own work,
> concentrating on some other issue entirely, and wants to merge
> with upstream kvmtool but not worry about the kernel, because
> they aren't working on the Linux kernel at all, and their work
> is about something else?

I'm not sure I understand this question - tools/kvm/ only runs
on a Linux kernel host, rather fundamentally, by using the (very
Linux specific) KVM syscalls.
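
As a small illustration of that dependency: the KVM interface is
reached through ioctl() calls on /dev/kvm. The sketch below, in Python
for brevity, asks the host for its KVM API version. The request number
is my reading of KVM_GET_API_VERSION in <linux/kvm.h> (_IO(KVMIO, 0x00)
with KVMIO == 0xAE), and the helper name kvm_api_version is made up
for this example; it is not taken from tools/kvm/.

```python
import fcntl
import os

# Assumed value of KVM_GET_API_VERSION: _IO(KVMIO, 0x00), KVMIO == 0xAE.
KVM_GET_API_VERSION = 0xAE00


def kvm_api_version(path="/dev/kvm"):
    """Return the KVM API version reported by the host, or None if unavailable."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None  # no KVM device on this host, or no permission to open it
    try:
        # With an integer argument, fcntl.ioctl() returns the ioctl's return value.
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None  # the file exists but does not speak the KVM ioctl interface
    finally:
        os.close(fd)


print(kvm_api_version())
```

On a host without a usable /dev/kvm the helper returns None instead of
raising, which is the case any non-Linux host would fall into.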

Hypothetically, if some other OS offered full KVM syscall
compatibility and started driving KVM development, then
tools/kvm/ could accept patches related to that.

As long as the code is clean I see no problems, it would even be
good because it might help put new features into KVM, should
that 'other OS' improve upon the KVM syscalls. In terms of
tools/kvm/ development we'd still think of that other OS as some
Linux fork in essence.

So I'm not sure I fully understood this particular concern of
yours.

Are you thinking about what happens if Linux itself dies down
and gets replaced by some other OS, dragging down 'hosted' code
with it? That would be very disruptive to a whole lot of other
code as well, such as more obscure drivers, filesystems and
kernel features that are currently only present in Linux - all
of which would eventually find a new home with the new king OS,
with different levels of costs of porting.

Thanks,

Ingo
