Re: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a document describing Alveo XRT drivers

From: Moritz Fischer
Date: Thu Dec 03 2020 - 23:18:57 EST


On Fri, Dec 04, 2020 at 01:17:37AM +0000, Max Zhen wrote:
> Hi Moritz,
>
> I manually fixed some line breaks. Not sure why outlook is not doing it properly.
> Let me know if it still looks bad to you.

That might just be outlook :)
>
> Please see my reply below.
>
> >
> >
> > Max,
> >
> > On Thu, Dec 03, 2020 at 03:38:26AM +0000, Max Zhen wrote:
> > > [...cut...]
> > >
> > > > > > > +xclbin over the User partition as part of DFX. When a user
> > > > > > > +requests loading of a specific xclbin the xmgmt management
> > > > > > > +driver reads the parent interface UUID specified in the xclbin
> > > > > > > +and matches it with child interface UUID exported by Shell to
> > > > > > > +determine if xclbin is compatible with the Shell. If match fails loading of xclbin is denied.
> > > > > > > +
> > > > > > > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl command.
> > > > > > > +When loading xclbin xmgmt driver performs the following operations:
> > > > > > > +
> > > > > > > +1. Sanity check the xclbin contents 2. Isolate the User
> > > > > > > +partition 3. Download the bitstream using the FPGA config engine (ICAP) 4.
> > > > > > > +De-isolate the User partition
> > > > > > Is this modelled as bridges and regions?
> > > > >
> > > > > Alveo drivers as written today do not use fpga bridge and region
> > > > > framework. It seems that if we add support for that framework, it’s
> > > > > possible to receive PR program request from kernel outside of xmgmt driver?
> > > > > Currently, we can’t support this and PR program can only be initiated
> > > > > using XRT’s runtime API in user space.
> > > >
> > > > I'm not 100% sure I understand the concern here, let me reply to what
> > > > I think I understand:
> > > >
> > > > You're worried that if you use FPGA region as interface to accept PR
> > > > requests something else could attempt to reconfigure the region from
> > > > within the kernel using the FPGA Region API?
> > > >
> > > > Assuming I got this right, I don't think this is a big deal. When you
> > > > create the regions you control who gets the references to it.
> > >
> > > Thanks for explaining. Yes, I think you got my point :-).
> >
> > We can add code to make a region 'static' or 'one-time' or 'fixed'.
> > >
> > > >
> > > > From what I've seen so far Regions seem to be roughly equivalent to
> > > > Partitions, hence my surprise to see a new structure bypassing them.
> > >
> > > I see where the gap is.
> > >
> > > Regions in Linux is very different than "partitions" we have defined in xmgmt. Regions seem to be a software data structure
> > > representing an area on the FPGA that can be reprogrammed. This area is protected by the concept of "bridge" which can be disabled
> > > before program and reenabled after it. And you go through region when you need to reprogram this area.
> >
> > Your central management driver can create / destroy regions at will. It
> > can keep them in a list, array or tree.
> >
> > Regions can but don't have to have bridges.
> >
> > If you need to go through the central driver to reprogram a region,
> > you can use that to figure out which region to program.
>
> That sounds fine. I can create a region and call into it from xmgmt for
> PR programing. The region will, then, call the xmgmt's fpga manager
> to program it.

It sounds closer than what I'd expect.
>
> > >
> > > The "partition" is part of the main infrastructure of xmgmt driver, which represents a group of subdev drivers for each individual IP
> > > (HW subcomponents). Basically, xmgmt root driver is parent of several partitions who is, in turn, the parent of several subdev drivers.
> > > The parent manages the life cycle of its children here.
> >
> > I don't see how this is conceptually different from what DFL does, and
> > they managed to use Regions and Bridges.
> >
> > If things are missing in the framework, please add them instead of
> > rewriting an entire parallel framework.
> >
> > >
> > > We do have a partition to represent the group of subdevs/IPs in the reprogrammable area. And we also have partitions
> > > representing other areas which cannot be reprogrammed. So, it is difficult to use "Region" to implement "partition".
> >
> > You implement your regions callbacks, you can return -EINVAL / -ENOTTY
> > if you want to fail a reprogramming request to a static partion /
> > region.
> >
> > > From what you have explained, it seems that even if I use region / bridge in xmgmt, we can still keep it private to xmgmt instead of
> > > exposing the interface to outside world, which we can't support anyway? This means that region will be used as an internal data
> > > structure for xmgmt. Since we can't simply replace partition with region, we might as well just use partition throughout the driver,
> > > instead of introducing two data structures and use them both in different places.
> >
> > Think about your partition as an extension to a region that implements
> > what you need to do for your case of enumerating and reprogramming that
> > particular piece of your chip.
>
> Yes, we can add region / bridges to represent the PR area and use it in our
> code path for reprogramming the PR area. I think what we will do is to
> instantiate a region instance for the PR area and associate it with the
> FPGA manager in xmgmt for reprogramming it. We can also instantiate
> bridges and map the "ULP gate" subdev driver to it in xmgmt. Thus, we
> could incorporate region and bridge data structures in xmgmt for PR
> reprogramming.

I'd need to take another look, but the ULP gate sounds like a bridge (or
close to it).

> This will be a non-trivial change for us. I'd like to confirm that this is what
> you are looking for before we start working on the change. Let us know :-).

I understand. It looks like the right direction. Let's discuss code when
we have code to look at.

It may take a couple of iterations to get it all sorted.

That's normal when you show show up with that much code all at once :)

Cheers,
Moritz