RE: [EXTERNAL] Re: [PATCH 05/12] net: mana: Set the DMA device max page size

From: Ajay Sharma
Date: Wed May 18 2022 - 17:06:45 EST


Sorry, I am not able to follow. Below is the reference efa driver implementation:

static int efa_device_init(struct efa_com_dev *edev, struct pci_dev *pdev)
{
	int dma_width;
	int err;

	err = efa_com_dev_reset(edev, EFA_REGS_RESET_NORMAL);
	if (err)
		return err;

	err = efa_com_validate_version(edev);
	if (err)
		return err;

	dma_width = efa_com_get_dma_width(edev);
	if (dma_width < 0) {
		err = dma_width;
		return err;
	}

	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(dma_width));
	if (err) {
		dev_err(&pdev->dev, "dma_set_mask_and_coherent failed %d\n", err);
		return err;
	}

	dma_set_max_seg_size(&pdev->dev, UINT_MAX);
	return 0;
}

static int efa_register_mr(struct ib_pd *ibpd, struct efa_mr *mr, u64 start,
			   u64 length, u64 virt_addr, int access_flags)
{
	struct efa_dev *dev = to_edev(ibpd->device);
	struct efa_com_reg_mr_params params = {};
	struct efa_com_reg_mr_result result = {};
	struct pbl_context pbl;
	unsigned int pg_sz;
	int inline_size;
	int err;

	params.pd = to_epd(ibpd)->pdn;
	params.iova = virt_addr;
	params.mr_length_in_bytes = length;
	params.permissions = access_flags;

	pg_sz = ib_umem_find_best_pgsz(mr->umem,
				       dev->dev_attr.page_size_cap,
				       virt_addr);
	....
}
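
For comparison, the equivalent probe-time setup on our side boils down to a couple of calls. A minimal sketch mirroring the EFA code above (the mana_dma_setup() helper name and the 64-bit mask are illustrative assumptions, not our actual code):

#include <linux/dma-mapping.h>
#include <linux/pci.h>

/*
 * Illustrative sketch only. dma_set_max_seg_size() bounds the
 * scatterlist segment length used when the umem SG table is built,
 * i.e. what a single ib_sge can describe.
 */
static int mana_dma_setup(struct pci_dev *pdev)
{
	int err;

	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
	if (err)
		return err;

	dma_set_max_seg_size(&pdev->dev, UINT_MAX);
	return 0;
}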

Ideally we would like to read the supported page sizes from HW, but currently we are hardcoding the bitmap. I can change the commit message if you feel the current one is misleading.
Something along the lines of:
RDMA/mana: Use API to get contiguous memory blocks aligned to device supported page size

Use the ib_umem_find_best_pgsz() and rdma_for_each_block() APIs when
registering an MR instead of open-coding the logic in the driver.

ib_umem_find_best_pgsz() is used to find the best suitable page size,
replacing the existing efa_cont_pages() implementation.
rdma_for_each_block() is used to iterate the umem in aligned,
contiguous memory blocks.
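
To make that concrete, a rough sketch of the resulting MR path (the mana_ib_build_page_list() name, the hardcoded 4K|2M bitmap, and the page-table write are placeholders for illustration, not the final driver code):

#include <linux/sizes.h>
#include <rdma/ib_umem.h>
#include <rdma/ib_verbs.h>

/*
 * Illustrative sketch only: pick the best supported page size from a
 * hardcoded bitmap (4K and 2M here), then walk the umem in aligned,
 * contiguous blocks of that size.
 */
static int mana_ib_build_page_list(struct ib_umem *umem, u64 virt_addr)
{
	struct ib_block_iter biter;
	unsigned long pg_sz;

	pg_sz = ib_umem_find_best_pgsz(umem, SZ_4K | SZ_2M, virt_addr);
	if (!pg_sz)
		return -EINVAL;

	/* rdma_umem_for_each_dma_block() is the umem wrapper around
	 * rdma_for_each_block() */
	rdma_umem_for_each_dma_block(umem, &biter, pg_sz) {
		u64 dma_addr = rdma_block_iter_dma_address(&biter);

		/* write dma_addr into the HW page table / PBL here */
	}

	return 0;
}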


Ajay


-----Original Message-----
From: Jason Gunthorpe <jgg@xxxxxxxx>
Sent: Wednesday, May 18, 2022 9:05 AM
To: Ajay Sharma <sharmaajay@xxxxxxxxxxxxx>
Cc: Long Li <longli@xxxxxxxxxxxxx>; KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>; Wei Liu <wei.liu@xxxxxxxxxx>; Dexuan Cui <decui@xxxxxxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Paolo Abeni <pabeni@xxxxxxxxxx>; Leon Romanovsky <leon@xxxxxxxxxx>; linux-hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
Subject: Re: [EXTERNAL] Re: [PATCH 05/12] net: mana: Set the DMA device max page size

On Wed, May 18, 2022 at 05:59:00AM +0000, Ajay Sharma wrote:
> Thanks Long.
> Hello Jason,
> I am the author of the patch.
> To your comment below :
> " As I've already said, you are supposed to set the value that limits to ib_sge and *NOT* the value that is related to ib_umem_find_best_pgsz. It is usually 2G because the ib_sge's typically work on a 32 bit length."
>
> The ib_sge is limited by __sg_alloc_table_from_pages(), which uses
> ib_dma_max_seg_size(), which in turn is what the eth driver sets via
> dma_set_max_seg_size(). Currently our HW does not support PTEs larger
> than 2M.

*sigh* again it has nothing to do with *PTEs* in the HW.

Jason