Re: [PATCH v2] SR-IOV: correct broken resource alignment calculations

From: Jesse Barnes
Date: Sun Aug 30 2009 - 11:39:26 EST


On Fri, 28 Aug 2009 13:00:06 -0700
Chris Wright <chrisw@xxxxxxxxxxxx> wrote:

> * Matthew Wilcox (matthew@xxxxxx) wrote:
> > On Fri, Aug 28, 2009 at 12:17:14PM -0700, Chris Wright wrote:
> > > This patch adds support for a new resource alignment type,
> > > IORESOURCE_VSIZEALIGN, and allows struct resource to keep track
> > > of the size requirements of a VF BAR which are smaller than the
> > > full resource size. This could also be done all within the PCI
> > > layer w/out bloating struct resource or using the last available
> > > bit for alignment types.
> >
> > Yes, I think that would be preferable. We have a *LOT* of
> > resources in the kernel, and the embedded folks would not find it
> > funny if they all grew in size suddenly.
>
> An SR-IOV capable device includes an SR-IOV PCIe capability which
> describes the Virtual Function (VF) BAR requirements. A typical
> SR-IOV device can support multiple VFs whose BARs must be in a
> contiguous region, effectively an array of VF BARs. The BAR reports
> the size requirement for a single VF. We calculate the full range
> needed by simply multiplying the VF BAR size by the number of
> possible VFs and create a resource spanning the full range.
>
> This all seems sane enough except it artificially inflates the
> alignment requirement for the VF BAR. The VF BAR need only be
> aligned to the size of a single BAR, not the contiguous range of VF
> BARs. This can cause us to fail to allocate resources for the BAR
> despite the fact that we actually have enough space.
>
> This patch adds a thin PCI-specific layer over the generic
> resource_alignment() function which is aware of the special nature of
> VF BARs and does sorting and allocation based on the smaller alignment
> requirement.
>
> I recognize that while resource_alignment() is generic, it's basically a
> PCI helper. An alternative to this patch is to add PCI VF BAR-specific
> information to struct resource. I opted for the extra layer rather than
> adding such PCI-specific information to struct resource. This does have
> the slight downside that we don't cache the BAR size and instead re-read
> it for each alignment query (this happens a small handful of times during
> boot for each VF BAR).
>
> Signed-off-by: Chris Wright <chrisw@xxxxxxxxxxxx>
> Cc: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> Cc: Ivan Kokshaysky <ink@xxxxxxxxxxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Matthew Wilcox <matthew@xxxxxx>
> Cc: Yu Zhao <yu.zhao@xxxxxxxxx>
> Cc: stable@xxxxxxxxxx

Yeah, I like this one better. I've applied it to my for-linus branch;
would be nice to have a Tested-by for it before I send it to Linus...
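
For anyone following along, here is a rough userspace sketch of the
alignment problem the changelog describes. It is not the patch itself:
struct sketch_resource, the is_vf_bar flag, and every sketch_* function
are hypothetical names made up for illustration, and the window
addresses are arbitrary. It assumes 8 VFs with a 16 KiB BAR each, so
the VF BAR array spans 128 KiB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct resource (illustration only). */
struct sketch_resource {
	uint64_t start;
	uint64_t end;		/* inclusive, as in the kernel */
	int is_vf_bar;		/* hypothetical: resource spans an array of VF BARs */
};

/* Size-based alignment: the whole resource is aligned to its own size. */
static uint64_t sketch_generic_alignment(const struct sketch_resource *res)
{
	return res->end - res->start + 1;
}

/*
 * Thin PCI-aware layer: a VF BAR array only needs the alignment of a
 * single VF BAR; everything else falls through to the generic rule.
 */
static uint64_t sketch_pci_alignment(const struct sketch_resource *res,
				     uint64_t single_vf_bar_size)
{
	if (res->is_vf_bar)
		return single_vf_bar_size;
	return sketch_generic_alignment(res);
}

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t sketch_align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Nonzero if `size` bytes aligned to `align` fit in [win_start, win_end]. */
static int sketch_fits(uint64_t win_start, uint64_t win_end,
		       uint64_t size, uint64_t align)
{
	uint64_t base = sketch_align_up(win_start, align);

	return base >= win_start && base <= win_end &&
	       size - 1 <= win_end - base;
}
```

With a 160 KiB window at 0xf0010000..0xf0037fff, the 128 KiB array does
not fit when aligned to its full size (the first 128 KiB boundary is
0xf0020000, leaving only 96 KiB), but it does fit when aligned to the
16 KiB single-VF BAR size. That is the "fail to allocate despite having
enough space" case.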

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center