Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access

From: Benjamin Herrenschmidt
Date: Tue Aug 09 2016 - 20:06:24 EST


On Tue, 2016-08-09 at 11:12 -0700, Alexander Duyck wrote:
>
> The PCI spec is what essentially assumes that there is only one block.
> If I am not mistaken in the case of this device the second block here
> actually contains device configuration data, not actual VPD data. The
> issue here is that the second block is being accessed as VPD when it
> isn't.

Devices do funny things with config space, film at 11. VFIO trying to
be the middle man and intercept/interpret things is broken, cannot work,
will never work, will just results in lots and lots of useless code, but
I've been singing that song for too long and nobody seems to care...

> > > #0000 Large item 42 bytes; name 0x2 Identifier String
> > #002d Large item 74 bytes; name 0x10
> > #007a Small item 1 bytes; name 0xf End Tag
> > ---
> > #0c00 Large item 16 bytes; name 0x2 Identifier String
> > #0c13 Large item 234 bytes; name 0x10
> > #0d00 Large item 252 bytes; name 0x11
> > #0dff Small item 0 bytes; name 0xf End Tag
>
> The second block here is driver proprietary setup bits.

Right. They happen to be in VPD on this device. They an be elsewhere on
other devices. In between capabilities on some, in vendor caps on others...

> > > The cxgb3 driver is reading the second bit starting from 0xc00 but since
> > the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
> > guest driver fails to probe.
> >
> > I also cannot find a clause in the PCI 3.0 spec saying that there must be
> > just a single block, is it there?
>
> > The problem is we need to be able to parse it.

We can parse the standard part for generic stuff like inventory tools
or lsvpd, but we shouldn't get in the way of the driver poking at its
device.

> Â The spec defines a
> series of tags that can be used starting at offset 0. That is how we
> are supposed to get around through the VPD data. The problem is we
> can't have more than one end tag and what appears to be happening here
> is that we are defining a second block of data which uses the same
> formatting as VPD but is not VPD.
>
> > What would the correct fix be? Scanning all 32k of VPD is not an option I
> > suppose as this is what this patch is trying to avoid. Thanks.
>
> I adding the current cxgb3 maintainer and netdev list to the Cc. This
> is something that can probably be addressed via a PCI quirk as what
> needs to happen is that we need to extend the VPD in the case of this
> part in order to include this second block. As long as we can read
> the VPD data all the way out to 0xdff odds are we could probably just
> have the size arbitrarily increased to 0xe00 via the quirk and then
> you would be able to access all of the VPD for the device. We already
> have code making other modifications to drivers/pci/quirks.c for
> several Broadcom devices and probably just need something similar to
> allow extended access in the case of these devices.


> > >
> >
> >
> > This is the device:
> >
> > > [aik@p81-p9 ~]$ sudo lspci -vvnns 0001:03:00.0
> > 0001:03:00.0 Ethernet controller [0200]: Chelsio Communications Inc T310
> > 10GbE Single Port Adapter [1425:0030]
> >ÂÂÂÂÂÂÂÂ Subsystem: IBM Device [1014:038c]
> >ÂÂÂÂÂÂÂÂ Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > Stepping- SERR- FastB2B- DisINTx+
> >ÂÂÂÂÂÂÂÂ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> > <MAbort- >SERR- <PERR- INTx-
> >ÂÂÂÂÂÂÂÂ Latency: 0
> >ÂÂÂÂÂÂÂÂ Interrupt: pin A routed to IRQ 494
> >ÂÂÂÂÂÂÂÂ Region 0: Memory at 3fe080880000 (64-bit, non-prefetchable) [size=4K]
> >ÂÂÂÂÂÂÂÂ Region 2: Memory at 3fe080000000 (64-bit, non-prefetchable) [size=8M]
> >ÂÂÂÂÂÂÂÂ Region 4: Memory at 3fe080881000 (64-bit, non-prefetchable) [size=4K]
> >ÂÂÂÂÂÂÂÂ [virtual] Expansion ROM at 3fe080800000 [disabled] [size=512K]
> >ÂÂÂÂÂÂÂÂ Capabilities: [40] Power Management version 3
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> >ÂÂÂÂÂÂÂÂ Capabilities: [48] MSI: Enable- Count=1/32 Maskable- 64bit+
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Address: 0000000000000000Â Data: 0000
> >ÂÂÂÂÂÂÂÂ Capabilities: [58] Express (v2) Endpoint, MSI 00
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ MaxPayload 256 bytes, MaxReadReq 512 bytes
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s
> > unlimited, L1 unlimited
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt-
> > ABWMgmt-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ DevCap2: Completion Timeout: Range ABC, TimeoutDis-, LTR-, OBFF Not Supported
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Transmit Margin: Normal Operating Range, EnterModifiedCompliance-
> > ComplianceSOS-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Compliance De-emphasis: -6dB
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-,
> > EqualizationPhase1-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> >ÂÂÂÂÂÂÂÂ Capabilities: [94] Vital Product Data
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Product Name: 10 Gigabit Ethernet-SR PCI Express Adapter
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Read-only fields:
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ [EC] Engineering changes: D76809
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ [FN] Unknown: 34 36 4b 37 38 39 37
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ [PN] Part number: 46K7897
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ [MN] Manufacture ID: 31 30 33 37
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ [FC] Unknown: 35 37 36 39
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ [SN] Serial number: YL102035603V
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ [NA] Unknown: 30 30 31 34 35 45 39 39 32 45 44 31
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ End
> >ÂÂÂÂÂÂÂÂ Capabilities: [9c] MSI-X: Enable+ Count=32 Masked-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ Vector table: BAR=4 offset=00000000
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ PBA: BAR=4 offset=00000800
> >ÂÂÂÂÂÂÂÂ Capabilities: [100 v1] Device Serial Number 00-00-00-01-00-00-00-01
> >ÂÂÂÂÂÂÂÂ Capabilities: [300 v1] Advanced Error Reporting
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ UESta:Â DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> > ECRC- UnsupReq- ACSViol-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ UEMsk:Â DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> > ECRC- UnsupReq- ACSViol-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
> > ECRC- UnsupReq- ACSViol-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ CESta:Â RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ CEMsk:Â RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> >ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> >ÂÂÂÂÂÂÂÂ Kernel driver in use: cxgb3
> >ÂÂÂÂÂÂÂÂ Kernel modules: cxgb3
> >
> >