Re: [PATCH] kvm: x86: disable KVM_FAST_MMIO_BUS

From: Paolo Bonzini
Date: Wed Aug 16 2017 - 13:25:15 EST


On 16/08/2017 18:50, Michael S. Tsirkin wrote:
> On Wed, Aug 16, 2017 at 03:30:31PM +0200, Paolo Bonzini wrote:
>> While you can filter out instruction fetches, that's not enough. A data
>> read could happen because someone pointed the IDT to an MMIO area, and who
>> knows what the VM-exit instruction length points to in that case.
>
> Thinking more about it, I don't really see how anything a legal
> guest might be doing with virtio would trigger anything but a
> fault after decoding the instruction. How does skipping the
> instruction even make sense in the example you give?

There's no such thing as a legal guest. Anything that the hypervisor
does, that differs from real hardware, is a possible escalation path.

This in fact makes me doubt the EMULTYPE_SKIP patch too.

>>>> Plus of course it wouldn't be guaranteed to work on nested.
>>>
>>> Not sure I got this one.
>>
>> Not all nested hypervisors are setting the VM-exit instruction length
>> field on EPT violations, since it's documented not to be set.
>
> So that's probably the real issue - nested virt which has to do it
> in software at extra cost. We already limit this to Intel processors,
> how about we blacklist nested virt for this optimization?
>
> I agree it's skating a bit close to the dangerous edge,
> but so are other tricks we play with PTEs to speed up MMIO.

Not at all. Everything else we do is perfectly fine according to the
spec; this one isn't.

Paolo

>>>>>> Adding a hypercall or MSR write that does a fast MMIO write to a physical
>>>>>> address would do it, but it adds hypervisor knowledge in virtio, including
>>>>>> CPUID handling.
>>>>>
>>>>> Another issue is that it will break DPDK on virtio.
>>>>
>>>> Not break, just make it slower.
>>>
>>> I thought hypercalls can only be triggered from ring 0, userspace can't call them.
>>> Did I get it wrong?
>>
>> That's just a limitation that KVM makes on currently-defined hypercalls.
>>
>> VMCALL causes a vmexit if executed from ring 3.
>>
>> Paolo
>
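
A minimal sketch of the last point in the quoted exchange: the VMCALL exit
reaches the hypervisor whatever the guest's privilege level, so refusing
ring 3 callers is a policy choice made in the handler, not a hardware
restriction. Everything below is hypothetical and is not KVM's code: the
struct, the error values, the hypercall number and the per-call allow-list
are invented for illustration only.

```c
#include <stdint.h>

/* Placeholder error values, chosen only for this sketch. */
#define SKETCH_EPERM  1L
#define SKETCH_ENOSYS 2L

/* Invented hypercall number for a "fast MMIO write" style notification. */
enum { SKETCH_HC_FAST_MMIO_WRITE = 100 };

/* Guest state the handler is assumed to see after a VMCALL exit. */
struct sketch_vcpu {
    int cpl;        /* guest privilege level when VMCALL ran: 0..3 */
    uint64_t nr;    /* hypercall number */
    uint64_t arg0;  /* first argument, e.g. a guest-physical address */
};

/* Policy hook: which hypercalls may be issued from guest userspace. */
static int sketch_allowed_from_user(uint64_t nr)
{
    return nr == SKETCH_HC_FAST_MMIO_WRITE;
}

/*
 * The exit has already happened by the time this runs; the CPL check
 * is a software decision applied per hypercall number.
 */
long sketch_handle_vmcall(const struct sketch_vcpu *v)
{
    if (v->cpl != 0 && !sketch_allowed_from_user(v->nr))
        return -SKETCH_EPERM;

    switch (v->nr) {
    case SKETCH_HC_FAST_MMIO_WRITE:
        /* A real handler would signal the device for arg0 here. */
        return 0;
    default:
        return -SKETCH_ENOSYS;
    }
}
```

With a policy like this, a userspace driver in the guest could issue the
one allowed call, while any other hypercall from ring 3 is still rejected.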