RE: [PATCH 5.8 298/464] PCI: hv: Fix a timing issue which causes kdump to fail occasionally

From: Michael Kelley
Date: Mon Aug 17 2020 - 19:27:58 EST


From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Sent: Monday, August 17, 2020 8:14 AM
>
> From: Wei Hu <weh@xxxxxxxxxxxxx>
>
> [ Upstream commit d6af2ed29c7c1c311b96dac989dcb991e90ee195 ]
>
> Kdump can occasionally fail on a Hyper-V guest because the retry in
> hv_pci_enter_d0() releases child device structures in hv_pci_bus_exit().
>
> Although the host sends a second, asynchronous device relations message,
> the retry fails if that message reaches the guest after
> hv_send_resource_allocated() is called.
>
> Fix the problem by moving the retry to hv_pci_probe() and restarting it
> from the hv_pci_query_relations() call. This causes a device relations
> message to arrive at the guest synchronously, so the guest can rebuild
> the child device structures before calling
> hv_send_resource_allocated().
>
> Link: https://lore.kernel.org/linux-hyperv/20200727071731.18516-1-weh@xxxxxxxxxxxxx/
> Fixes: c81992e7f4aa ("PCI: hv: Retry PCI bus D0 entry on invalid device state")
> Signed-off-by: Wei Hu <weh@xxxxxxxxxxxxx>
> [lorenzo.pieralisi@xxxxxxx: fixed a comment and commit log]
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> ---
> drivers/pci/controller/pci-hyperv.c | 71 +++++++++++++++--------------
> 1 file changed, 37 insertions(+), 34 deletions(-)
>

Greg --

Don't backport this patch to 5.8 and earlier. It doesn't break anything,
but it doesn't fully accomplish what was intended either. As such it will
probably need a revision in 5.9. Wei Hu is unavailable for a few days
for personal reasons, so I'm jumping in here on his behalf.

Michael
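
For readers following the quoted commit log, the retry ordering it
describes can be sketched roughly as below. This is only an illustrative
user-space model, not the code in drivers/pci/controller/pci-hyperv.c:
every sim_* name, the struct sim_bus fields, and the SIM_* return codes
are hypothetical stand-ins chosen for this sketch. The point it tries to
show is that the retry restarts from the relations query, so the child
device structures released by the bus exit are rebuilt before resources
are reported as allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes; the real driver uses kernel errno values. */
enum { SIM_OK = 0, SIM_INVALID_STATE = -1 };

/* Hypothetical model of the bus state that matters for the retry. */
struct sim_bus {
	int children;         /* child device structures currently built */
	int d0_failures_left; /* times the simulated host rejects D0 entry */
	int query_calls;      /* how often relations were queried */
};

/* Models the synchronous relations query: children are (re)built. */
static int sim_query_relations(struct sim_bus *bus)
{
	bus->query_calls++;
	bus->children = 2;
	return SIM_OK;
}

/* Models D0 entry, which the host may reject after an unclean shutdown. */
static int sim_enter_d0(struct sim_bus *bus)
{
	if (bus->d0_failures_left > 0) {
		bus->d0_failures_left--;
		return SIM_INVALID_STATE;
	}
	return SIM_OK;
}

/* Models the bus exit: it releases the child device structures. */
static void sim_bus_exit(struct sim_bus *bus)
{
	bus->children = 0;
}

/* Models the final step, which needs the child structures to exist. */
static int sim_send_resource_allocated(const struct sim_bus *bus)
{
	return bus->children > 0 ? SIM_OK : SIM_INVALID_STATE;
}

/* Probe with a single retry that restarts from the relations query. */
int sim_probe(struct sim_bus *bus)
{
	bool retried = false;
	int ret;

	assert(bus != NULL);
retry:
	ret = sim_query_relations(bus);
	if (ret)
		return ret;

	ret = sim_enter_d0(bus);
	if (ret == SIM_INVALID_STATE && !retried) {
		sim_bus_exit(bus); /* children are gone after this */
		retried = true;
		goto retry;        /* the query rebuilds them first */
	}
	if (ret)
		return ret;

	return sim_send_resource_allocated(bus);
}
```

In this model, a bus whose first D0 entry is rejected ends up with two
relations queries and a successful probe, while a bus that is rejected
twice fails after the single retry.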