RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default

From: Dexuan Cui
Date: Fri Aug 18 2017 - 19:08:05 EST


> From: Stefan Hajnoczi [mailto:stefanha@xxxxxxxxxx]
> > CID is not really used by us, because we only support guest<->host
> > communication, and don't support guest<->guest communication. The
> > Hyper-V host references every VM by VmID (which is invisible to the
> > VM), and a VM can only talk to the host via this feature.
>
> Applications running inside the guest should use VMADDR_CID_HOST (2) to
> connect to the host, even on Hyper-V.
I have no objection, and this patch does support this usage by
user-space applications.
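
For reference, a guest application connecting to the host looks roughly
like the sketch below (the port number is an arbitrary example):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <linux/vm_sockets.h>

    int main(void)
    {
            struct sockaddr_vm addr = {
                    .svm_family = AF_VSOCK,
                    .svm_cid    = VMADDR_CID_HOST, /* 2: always the host */
                    .svm_port   = 1234,            /* example port */
            };
            int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

            if (fd < 0 || connect(fd, (struct sockaddr *)&addr,
                                  sizeof(addr)) < 0) {
                    perror("vsock connect");
                    return 1;
            }
            return 0;
    }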

> By the way, we should collaborate on a test suite and a vsock(7) man
> page that documents the semantics of AF_VSOCK sockets. This way our
> transports will have the same behavior and AF_VSOCK applications will
> work on all 3 hypervisors.
I couldn't agree more. :-)
BTW, I have been using Rolf's test suite to test my patch:
https://github.com/rn/virtsock/tree/master/c
Maybe this can be a good starting point.

> Not all features need to be supported. For example, VMCI supports
> SOCK_DGRAM while Hyper-V and virtio do not. But features that are
> available should behave identically.
I totally agree, though I'm afraid Hyper-V may have a few more
limitations compared to VMware/KVM, due to the
<VM_ID, ServiceID> <--> <cid, port> mapping.
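
To illustrate what I mean by the mapping: on the Hyper-V side, a vsock
port is carried in the low 32 bits of the service GUID, and the rest of
the GUID is a fixed template, so the host-visible address is effectively
<VM_ID, port>. A rough sketch of the idea (the struct layout and helper
below are illustrative, not the patch's actual code):

    /* Sketch: deriving a Hyper-V service ID from a vsock port.
     * The port occupies the first 32 bits of the GUID; the rest
     * is a fixed, well-known template. */
    struct hv_guid {
            unsigned int   data1;           /* <- vsock port goes here */
            unsigned short data2, data3;
            unsigned char  data4[8];
    };

    static const struct hv_guid srv_id_template = {
            0x00000000, 0xfacb, 0x11e6,
            { 0xbd, 0x58, 0x64, 0x00, 0x6a, 0x79, 0x86, 0xd3 }
    };

    static void port_to_service_id(unsigned int port, struct hv_guid *id)
    {
            *id = srv_id_template;
            id->data1 = port;               /* service ID encodes the port */
    }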

> > Can we use the 'protocol' parameter in the socket() function:
> > int socket(int domain, int type, int protocol)
> >
> > IMO currently the 'protocol' is not really used.
> > I think we can modify __vsock_core_init() to allow multiple transport
> > layers to be registered, and we can define different 'protocol' numbers
> > for VMware/KVM/Hyper-V, and ask the application to explicitly specify
> > what should be used. Considering compatibility, we can use the default
> > transport in a given VM depending on the underlying hypervisor.
>
> I think AF_VSOCK should hide the transport from users/applications.
Ideally yes, but let's consider the KVM-on-KVM nested scenario: when
an application in the Level-1 VM creates an AF_VSOCK socket and calls
connect() on it, how can we know whether the app is trying to connect
to the Level-0 host or to a Level-2 VM? We can't. That's why I propose
we use the 'protocol' parameter to distinguish between "to guest" and
"to host".

With my proposal, in the above scenario, by default (the 'protocol' is 0)
we choose the "to host" transport when socket() is called; if the
userspace app explicitly specifies "to guest", we choose the "to guest"
transport instead. This way, connect(), bind(), etc. can work
automatically.
(Of course, the default transport for a given VM could be chosen better
if we detected at which nesting level the app is running.)
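
To make this concrete, application code under my proposal would look
something like the sketch below (the VSOCK_PROTO_* names are invented
for illustration; no such constants exist yet):

    /* Hypothetical protocol numbers -- nothing like this is defined today. */
    #define VSOCK_PROTO_TO_HOST   0   /* default: talk to our host */
    #define VSOCK_PROTO_TO_GUEST  1   /* talk to our own nested guests */

    /* Level-1 VM, default case: the connection goes to the Level-0 host. */
    int to_host  = socket(AF_VSOCK, SOCK_STREAM, VSOCK_PROTO_TO_HOST);

    /* Level-1 VM acting as a host: reach a Level-2 guest. */
    int to_guest = socket(AF_VSOCK, SOCK_STREAM, VSOCK_PROTO_TO_GUEST);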

> Think of same-on-same nested virtualization: VMware-on-VMware or
> KVM-on-KVM. In that case specifying VMCI or virtio doesn't help.
>
> We'd still need to distinguish between "to guest" and "to host"
> (currently VMCI has code to do this but virtio does not).
>
> The natural place to distinguish the destination is when dealing with
> the sockaddr in connect(), bind(), etc.
>
> Stefan

Thanks,
-- Dexuan