Re: [RFC PATCH] KVM: arm/arm64: Enable direct irqfd MSI injection

From: Marc Zyngier
Date: Tue Mar 19 2019 - 12:58:05 EST


On Tue, 19 Mar 2019 15:59:00 +0000,
Zenghui Yu <yuzenghui@xxxxxxxxxx> wrote:
>
> Hi Marc,
>
> On 2019/3/19 18:01, Marc Zyngier wrote:
> > On Tue, 19 Mar 2019 09:09:43 +0800
> > Zenghui Yu <yuzenghui@xxxxxxxxxx> wrote:
> >
> >> Hi all,
> >>
> >> On 2019/3/18 3:35, Marc Zyngier wrote:
> >>> A first approach would be to keep a small cache of the last few
> >>> successful translations for this ITS, cache that could be looked-up by
> >>> holding a spinlock instead. A hit in this cache could directly be
> >>> injected. Any command that invalidates or changes anything (DISCARD,
> >>> INV, INVALL, MAPC with V=0, MAPD with V=0, MOVALL, MOVI) should nuke
> >>> the cache altogether.
> >>>
> >>> Of course, all of that needs to be quantified.
> >>
> >> Thanks for all of your explanations, especially for Marc's suggestions!
> >> It took me a long time to figure out my mistakes, since I am not very
> >> familiar with the locking stuff. Now I have to apologize for my noise.
> >
> > No need to apologize. The whole point of this list is to have
> > discussions. Although your approach wasn't working, you did
> > identify potential room for improvement.
> >
> >> As for the its-translation-cache code (really good news for us), we
> >> have had a rough look at it and are starting to test it now!
> >
> > Please let me know about your findings. My initial test doesn't show
> > any improvement, but that could easily be attributed to the system I'm
> > running this on (a tiny and slightly broken dual A53 system). The sizing
> > of the cache is also important: too small, and you have the overhead of
> > the lookup for no benefit; too big, and you waste memory.
>
> It did not go as smoothly as expected. With the config below (in the form of XML):

The good news is that nothing was expected at all.

> ---8<---
> <interface type='vhostuser'>
> <source type='unix' path='/var/run/vhost-user/tap_0' mode='client'/>
> <model type='virtio'/>
> <driver name='vhost' queues='32' vringbuf='4096'/>
> </interface>
> ---8<---

Sorry, I don't read XML, and I have zero idea what this represents.

>
> VM can't even get to boot successfully!
>
>
> Kernel version is -stable 4.19.28. And *dmesg* on host shows:

Please don't test on anything other than mainline. The only thing I'm
interested in at the moment is 5.1-rc1.

>
> ---8<---
> [ 507.908330] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 507.908338] rcu: 35-...0: (0 ticks this GP) idle=d06/1/0x4000000000000000 softirq=72150/72150 fqs=6269
> [ 507.908341] rcu: 41-...0: (0 ticks this GP) idle=dee/1/0x4000000000000000 softirq=68144/68144 fqs=6269
> [ 507.908342] rcu: (detected by 23, t=15002 jiffies, g=68929, q=408641)
> [ 507.908350] Task dump for CPU 35:
> [ 507.908351] qemu-kvm R running task 0 66789 1 0x00000002
> [ 507.908354] Call trace:
> [ 507.908360] __switch_to+0x94/0xe8
> [ 507.908363] _cond_resched+0x24/0x68
> [ 507.908366] __flush_work+0x58/0x280
> [ 507.908369] free_unref_page_commit+0xc4/0x198
> [ 507.908370] free_unref_page+0x84/0xa0
> [ 507.908371] __free_pages+0x58/0x68
> [ 507.908372] free_pages.part.21+0x34/0x40
> [ 507.908373] free_pages+0x2c/0x38
> [ 507.908375] poll_freewait+0xa8/0xd0
> [ 507.908377] do_sys_poll+0x3d0/0x560
> [ 507.908378] __arm64_sys_ppoll+0x180/0x1e8
> [ 507.908380] 0xa48990
> [ 507.908381] Task dump for CPU 41:
> [ 507.908382] kworker/41:1 R running task 0 647 2 0x0000002a
> [ 507.908387] Workqueue: events irqfd_inject
> [ 507.908389] Call trace:
> [ 507.908391] __switch_to+0x94/0xe8
> [ 507.908392] 0x200000131
> [... ...]
> [ 687.928330] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 687.928339] rcu: 35-...0: (0 ticks this GP) idle=d06/1/0x4000000000000000 softirq=72150/72150 fqs=25034
> [ 687.928341] rcu: 41-...0: (0 ticks this GP) idle=dee/1/0x4000000000000000 softirq=68144/68144 fqs=25034
> [ 687.928343] rcu: (detected by 16, t=60007 jiffies, g=68929, q=1601093)
> [ 687.928351] Task dump for CPU 35:
> [ 687.928352] qemu-kvm R running task 0 66789 1 0x00000002
> [ 687.928355] Call trace:
> [ 687.928360] __switch_to+0x94/0xe8
> [ 687.928364] _cond_resched+0x24/0x68
> [ 687.928367] __flush_work+0x58/0x280
> [ 687.928369] free_unref_page_commit+0xc4/0x198
> [ 687.928370] free_unref_page+0x84/0xa0
> [ 687.928372] __free_pages+0x58/0x68
> [ 687.928373] free_pages.part.21+0x34/0x40
> [ 687.928374] free_pages+0x2c/0x38
> [ 687.928376] poll_freewait+0xa8/0xd0
> [ 687.928378] do_sys_poll+0x3d0/0x560
> [ 687.928379] __arm64_sys_ppoll+0x180/0x1e8
> [ 687.928381] 0xa48990
> [ 687.928382] Task dump for CPU 41:
> [ 687.928383] kworker/41:1 R running task 0 647 2 0x0000002a
> [ 687.928389] Workqueue: events irqfd_inject
> [ 687.928391] Call trace:
> [ 687.928392] __switch_to+0x94/0xe8
> [ 687.928394] 0x200000131
> [...]
> ---8<--- endlessly ...
>
> It seems that we've run into some locking-related issues. Any
> suggestions for debugging?

None at the moment. And this doesn't seem quite related to the problem
at hand, does it?

> And could you please provide your test steps? Then I can run
> some tests on my HW and hopefully see an improvement.

Here you go:

qemu-system-aarch64 -m 512M -smp 2 -cpu host,aarch64=on \
    -machine virt,accel=kvm,gic_version=3,its -nographic \
    -drive if=pflash,format=raw,readonly,file=/usr/share/AAVMF/AAVMF_CODE.fd \
    -drive if=pflash,format=raw,file=buster/GXnkZdHqG4e7o4pC.fd \
    -netdev tap,fds=128:129,id=hostnet0,vhost=on,vhostfds=130:131 \
    -device virtio-net-pci,mac=5a:fe:00:e5:b1:30,netdev=hostnet0,mq=on,vectors=6 \
    -drive if=none,format=raw,file=buster/GXnkZdHqG4e7o4pC.img,id=disk0 \
    -device virtio-blk-pci,drive=disk0 \
    -drive file=debian-testing-arm64-DVD-1-preseed.iso,id=cdrom,if=none,media=cdrom \
    -device virtio-scsi-pci -device scsi-cd,drive=cdrom \
    128<>/dev/tap7 129<>/dev/tap7 130<>/dev/vhost-net 131<>/dev/vhost-net

> > Having thought about it a bit more, I think we can drop the
> > invalidation on MOVI/MOVALL, as the LPI is still perfectly valid, and
> > we don't cache the target vcpu. On the other hand, the cache must be
> > nuked when the ITS is turned off.
>
> All of these are valuable. But it might be too early for me to consider
> them (I have to get the above problem solved first ...)

I'm not asking you to consider them. I jumped into this thread
explaining what could be done instead. These are ideas on top of what
I've already offered.
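
To make the cache idea from earlier in the thread a bit more concrete,
here is a rough userspace sketch. It is only an illustration under
stated assumptions, not the actual its-translation-cache code: every
name in it (xlate_cache, XLATE_CACHE_SIZE, xlate_cache_on_cmd() and
friends) is made up, a C11 atomic_flag stands in for a kernel spinlock,
and the size of 4 is an arbitrary placeholder that would have to be
measured. It caches DevID/EventID -> LPI only, never the target vcpu,
which is why MOVI/MOVALL leave it alone while DISCARD, INV, INVALL,
MAPC/MAPD with V=0 and turning the ITS off drop everything.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; this is not the vgic-its code. */
#define XLATE_CACHE_SIZE 4	/* "last few" translations; the size is a guess */

struct xlate_entry {
	bool     valid;
	uint32_t devid;
	uint32_t eventid;
	uint32_t lpi;		/* translated INTID; the target vcpu is NOT cached */
};

struct xlate_cache {
	atomic_flag        lock;	/* stand-in for a kernel spinlock */
	unsigned int       next;	/* round-robin replacement slot */
	struct xlate_entry e[XLATE_CACHE_SIZE];
};

enum its_cmd { CMD_DISCARD, CMD_INV, CMD_INVALL, CMD_MAPC, CMD_MAPD,
	       CMD_MOVI, CMD_MOVALL };

static void cache_lock(struct xlate_cache *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin */
}

static void cache_unlock(struct xlate_cache *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

void xlate_cache_init(struct xlate_cache *c)
{
	memset(c->e, 0, sizeof(c->e));
	c->next = 0;
	atomic_flag_clear(&c->lock);
}

/* Fast path: on a hit, fill *lpi and return true so it can be injected. */
bool xlate_cache_lookup(struct xlate_cache *c, uint32_t devid,
			uint32_t eventid, uint32_t *lpi)
{
	bool hit = false;

	cache_lock(c);
	for (unsigned int i = 0; i < XLATE_CACHE_SIZE; i++) {
		struct xlate_entry *e = &c->e[i];

		if (e->valid && e->devid == devid && e->eventid == eventid) {
			*lpi = e->lpi;
			hit = true;
			break;
		}
	}
	cache_unlock(c);
	return hit;
}

/* Record a translation that was resolved on the slow path. */
void xlate_cache_insert(struct xlate_cache *c, uint32_t devid,
			uint32_t eventid, uint32_t lpi)
{
	cache_lock(c);
	c->e[c->next] = (struct xlate_entry){ true, devid, eventid, lpi };
	c->next = (c->next + 1) % XLATE_CACHE_SIZE;
	cache_unlock(c);
}

/* Drop every cached translation. */
void xlate_cache_nuke(struct xlate_cache *c)
{
	cache_lock(c);
	memset(c->e, 0, sizeof(c->e));
	c->next = 0;
	cache_unlock(c);
}

/* Command hook; v_bit is the V field and only matters for MAPC/MAPD. */
void xlate_cache_on_cmd(struct xlate_cache *c, enum its_cmd cmd, bool v_bit)
{
	switch (cmd) {
	case CMD_DISCARD:
	case CMD_INV:
	case CMD_INVALL:
		xlate_cache_nuke(c);
		break;
	case CMD_MAPC:
	case CMD_MAPD:
		if (!v_bit)
			xlate_cache_nuke(c);
		break;
	case CMD_MOVI:
	case CMD_MOVALL:
		/* The LPI stays valid and no vcpu is cached: keep the cache. */
		break;
	}
}

/* Turning the ITS off must also drop the cache. */
void xlate_cache_on_its_disable(struct xlate_cache *c)
{
	xlate_cache_nuke(c);
}
```

The injection path would try xlate_cache_lookup() first and only fall
back to the full translation (followed by xlate_cache_insert()) on a
miss. Nuking the whole cache on any invalidating command keeps the
logic trivial at the cost of some extra misses; whether that trade-off
pays off is exactly what needs to be quantified.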

M.

--
Jazz is not dead, it just smell funny.