Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory

From: Vitaly Kuznetsov
Date: Wed Sep 02 2020 - 04:59:34 EST


Peter Xu <peterx@xxxxxxxxxx> writes:

> On Tue, Sep 01, 2020 at 04:43:25PM +0200, Vitaly Kuznetsov wrote:
>> Peter Xu <peterx@xxxxxxxxxx> writes:
>>
>> > On Fri, Aug 07, 2020 at 04:12:29PM +0200, Vitaly Kuznetsov wrote:
>> >> When testing Linux kernel boot with QEMU q35 VM and direct kernel boot
>> >> I observed 8193 accesses to PCI hole memory. When such exit is handled
>> >> in KVM without exiting to userspace, it takes roughly 0.000001 sec.
>> >> Handling the same exit in userspace is six times slower (0.000006 sec) so
>> >> the overall difference is 0.04 sec. This may be significant for 'microvm'
>> >> ideas.
>> >
>> > Sorry to comment so late, but just curious... have you looked at what those
>> > 8000+ accesses to PCI holes are and what they're used for? What I can think of are
>> > some port IO reads (e.g. upon vendor ID field) during BIOS to scan the devices
>> > attached. Though those should be far less than 8000+, and those should also be
>> > pio rather than mmio.
>>
>> And sorry for replying late)
>>
>> We explicitly want MMIO instead of PIO to speed things up, afaiu PIO
>> requires two exits per device (and we exit all the way to
>> QEMU). Julia/Michael know better about the size of the space.
>>
>> >
>> > If this is only an overhead for virt (since baremetal mmios should be fast),
>> > I'm also thinking whether we can make it even better to skip those pci hole
>> > reads. Because we know we're virt, it also gives us the possibility that we may
>> > provide that information in a better way than reading PCI holes in the guest?
>>
>> This means let's invent a PV interface and if we decide to go down this
>> road, I'd even argue for abandoning PCI completely. E.g. we can do
>> something similar to Hyper-V's Vmbus.
>
> My whole point was more about trying to understand the problem behind.
> Providing a fast path for reading pci holes seems to be reasonable as is,
> however it's just that I'm confused on why there're so many reads on the pci
> holes after all. Another important question is I'm wondering how this series
> will finally help the use case of microvm. I'm not sure I get the whole point
> of it, but... if microvm is the major use case of this, it would be good to
> provide some quick numbers on those if possible.
>
> For example, IIUC microvm uses qboot (as a better alternative to seabios) for
> fast boot, and qboot has:
>
> https://github.com/bonzini/qboot/blob/master/pci.c#L20
>
> I'm kind of curious whether qboot will still be used when this series is used
> with microvm VMs? Since those are still at least PIO based.

I'm afraid there is no 'grand plan' for everything at this moment :-(
For traditional VMs, 0.04 sec per boot is negligible and definitely not
worth adding a feature for; memory requirements are also very
different. When it comes to microvm-style usage, things change.
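
To make the 0.04 sec figure concrete, here is a back-of-the-envelope
sketch of the arithmetic from the cover letter. The per-exit costs are
the rounded numbers quoted above (roughly 1 us in KVM, 6 us in
userspace); the constant and function names are mine, for illustration
only.

```python
# Rough numbers from the cover letter; names are illustrative only.
ACCESSES = 8193          # PCI hole accesses seen during direct kernel boot
IN_KERNEL_SEC = 1e-6     # approx. cost when the exit is handled in KVM
USERSPACE_SEC = 6e-6     # approx. cost when the exit goes to userspace


def boot_overhead(accesses, per_exit_sec):
    """Total time spent on PCI hole accesses for one boot."""
    return accesses * per_exit_sec


saved = (boot_overhead(ACCESSES, USERSPACE_SEC)
         - boot_overhead(ACCESSES, IN_KERNEL_SEC))
print(f"saved per boot: {saved:.3f} sec")
```

That is about 41 ms per boot, which rounds to the 0.04 sec quoted above.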

The '8193' PCI hole accesses I mention in the PATCH0 blurb come from
Linux alone, as I was doing direct kernel boot; we can't get better than
that (if PCI is in the game, of course). Firmware (qboot, seabios, ...)
can only add more. I *think* the plan is to eventually switch them all to
MMCFG by default, at least for KVM guests, but we need something to
advertise it with.
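
For readers less familiar with the two config access methods compared in
this thread, the sketch below shows how each one addresses a register.
It assumes the standard PCI layouts: the legacy 0xCF8/0xCFC mechanism
(one port write to select the register, then one port access for the
data, hence two exits) and the ECAM layout used by MMCFG (a single MMIO
access at base + offset). The function names are hypothetical, and the
MMCFG base is platform-specific, so only the offset is computed.

```python
def cf8_address(bus, dev, fn, reg):
    """Value written to port 0xCF8 to select a config register (PIO).

    The data is then transferred with a second access to port 0xCFC.
    """
    return 0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xFC)


def mmcfg_offset(bus, dev, fn, reg):
    """Offset of a config register inside the MMCFG (ECAM) window.

    One MMIO load at base + offset reads the register directly.
    """
    return (bus << 20) | (dev << 15) | (fn << 12) | reg


# Vendor ID (register 0) of device 1, function 0 on bus 0:
print(hex(cf8_address(0, 1, 0, 0)))   # PIO: select, then read 0xCFC
print(hex(mmcfg_offset(0, 1, 0, 0)))  # MMCFG: single read at base + offset
```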

We can, in theory, short-circuit PIO in KVM instead, but:
- We will need a completely different API.
- We will never be able to reach the speed of the exit-less 'single 0xff
  page' solution (see my RFC).
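
As a toy model of why the 'single 0xff page' idea needs no exits: assume
every guest page in the PCI hole is backed by one shared read-only page
filled with 0xff. Any read then returns all-ones, which the guest
interprets as "no device here". This is only an illustration of the
concept, not the code from the RFC; all names below are hypothetical.

```python
PAGE_SIZE = 4096
FF_PAGE = b"\xff" * PAGE_SIZE   # the one page shared by the whole hole


def read_hole(offset, size):
    """Model a naturally aligned 1/2/4-byte read anywhere in the hole."""
    if size not in (1, 2, 4) or offset % size:
        raise ValueError("expected a naturally aligned 1/2/4-byte read")
    start = offset % PAGE_SIZE  # every hole page maps to the same 0xff page
    return int.from_bytes(FF_PAGE[start:start + size], "little")


def device_present(config_offset):
    """A vendor ID of 0xFFFF means nothing is behind this config space."""
    return read_hole(config_offset, 2) != 0xFFFF


print(hex(read_hole(0x8000, 2)))
print(device_present(0x8000))
```

Since the read is satisfied from ordinary mapped memory, nothing has to
be emulated, which is what a PIO fast path in KVM cannot match.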

--
Vitaly