Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline

From: Qian Cai
Date: Fri Apr 26 2019 - 11:55:17 EST


On Fri, 2019-04-26 at 17:26 +0200, Joerg Roedel wrote:
> On Fri, Apr 26, 2019 at 10:52:28AM -0400, Qian Cai wrote:
> > Applying some memory pressure causes smartpqi to go offline, even in today's
> > linux-next. This can always be reproduced with an LTP test case [1], or
> > sometimes just by compiling kernels.
> >
> > Reverting the commit "iommu/amd: Set exclusion range correctly" fixed the
> > issue.
> >
> > [  213.437112] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1000 flags=0x0000]
> > [  213.447659] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1800 flags=0x0000]
> > [  233.362013] smartpqi 0000:23:00.0: controller is offline: status code 0x14803
> > [  233.369359] smartpqi 0000:23:00.0: controller offline
> > [  233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 2000001
> > [  233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
> > [  233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
> > [  233.389003] Write-error on swap-device (254:1:4474640)
> > [  233.389015] Write-error on swap-device (254:1:2190776)
> > [  233.389023] Write-error on swap-device (254:1:8351936)
> >
> > [1] /opt/ltp/testcases/bin/mtest01 -p80 -w
>
> I can't explain that, can you please boot with 'amd_iommu_dump' on the
> kernel command line and send me dmesg after boot?

https://git.sr.ht/~cai/linux-debug/blob/master/dmesg
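
For anyone who wants to pull the faulting addresses out of a longer dmesg,
here is a minimal sketch that extracts the IO_PAGE_FAULT events in the format
quoted above. The regex only covers the line shape seen in this report (other
kernels may print the event differently), and the names FAULT_RE,
strip_quotes, parse_faults and SAMPLE are made up for illustration.

```python
import re

# Matches only the AMD-Vi IO_PAGE_FAULT line shape quoted in this thread.
# This is an assumption about the format, not a general dmesg parser.
FAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+"
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+"
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT\s+"
    r"domain=(?P<domain>0x[0-9a-f]+)\s+"
    r"address=(?P<address>0x[0-9a-f]+)\s+"
    r"flags=(?P<flags>0x[0-9a-f]+)\]"
)


def strip_quotes(text):
    """Remove leading '> ' mail quote markers from every line."""
    return re.sub(r"^(?:> ?)+", "", text, flags=re.MULTILINE)


def parse_faults(text):
    """Return one dict per IO_PAGE_FAULT event found in text.

    Works on raw dmesg output as well as on mail-quoted, line-wrapped
    copies, because whitespace in the pattern also matches newlines.
    """
    faults = []
    for m in FAULT_RE.finditer(strip_quotes(text)):
        faults.append({
            "timestamp": float(m.group("ts")),
            "driver": m.group("driver"),
            "device": m.group("bdf"),
            "domain": int(m.group("domain"), 16),
            "address": int(m.group("address"), 16),
            "flags": int(m.group("flags"), 16),
        })
    return faults


# The two events from the report, in their mail-wrapped form.
SAMPLE = """\
> > [  213.437112] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
> > domain=0x0000 address=0x1000 flags=0x0000]
> > [  213.447659] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
> > domain=0x0000 address=0x1800 flags=0x0000]
"""

if __name__ == "__main__":
    for fault in parse_faults(SAMPLE):
        print(fault["device"], hex(fault["address"]))
```

Running it on the full dmesg instead of SAMPLE should show whether all the
faults land at the same low addresses as the two quoted here.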