Re: [PATCH RFC 1/2] RDMA/rxe: don't allow registering !PAGE_SIZE mr

From: Greg Sword
Date: Tue Oct 31 2023 - 20:58:43 EST


On Tue, Oct 31, 2023 at 9:19 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Tue, Oct 31, 2023 at 04:52:23PM +0800, Zhu Yanjun wrote:
> > 在 2023/10/30 20:40, Jason Gunthorpe 写道:
> > > On Mon, Oct 30, 2023 at 07:51:41AM +0000, Zhijian Li (Fujitsu) wrote:
> > > >
> > > >
> > > > On 27/10/2023 13:41, Li Zhijian wrote:
> > > > > mr->page_list only encodes *page without page offset, when
> > > > > page_size != PAGE_SIZE, we cannot restore the address with a wrong
> > > > > page_offset.
> > > > >
> > > > > Note that this patch will break some ULPs that try to register 4K
> > > > > MR when PAGE_SIZE is not 4K.
> > > > > SRP and nvme over RXE is known to be impacted.
> > > > >
> > > > > Signed-off-by: Li Zhijian <lizhijian@xxxxxxxxxxx>
> > > > > ---
> > > > > drivers/infiniband/sw/rxe/rxe_mr.c | 6 ++++++
> > > > > 1 file changed, 6 insertions(+)
> > > > >
> > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > > index f54042e9aeb2..61a136ea1d91 100644
> > > > > --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > > @@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
> > > > > struct rxe_mr *mr = to_rmr(ibmr);
> > > > > unsigned int page_size = mr_page_size(mr);
> > > > > + if (page_size != PAGE_SIZE) {
> > > >
> > > > It seems this condition is too strict, it should be:
> > > > if (!IS_ALIGNED(page_size, PAGE_SIZE))
> > > >
> > > > So that, page_size with (N * PAGE_SIZE) can work as previously.
> > > > Because the offset(mr.iova & page_mask) will get lost only when !IS_ALIGNED(page_size, PAGE_SIZE)
> > >
> > > That makes sense
> >
> > I read all the discussions very carefully.
> >
> > Thanks, Greg.
> >
> > Because RXE only supports PAGE_SIZE, when CONFIG_ARM64_64K_PAGES is enabled,
> > the PAGE_SIZE is 64K, when CONFIG_ARM64_64K_PAGES is disabled, PAGE_SIZE is
> > 4K.
> >
> > But NVMe calls ib_map_mr_sg with a fixed size SZ_4K. When
> > CONFIG_ARM64_64K_PAGES is enabled, it is still 4K. This is not a problem in
> > RXE. This problem is in NVMe.
>
> Maybe, but no real RDMA devices don't support 4K.
>
> The xarray conversion may need revision to use physical addresses
> instead of storing struct pages so it can handle this kind of
> segmentation.

This problem can not be fixed until rxe supports multiple page sizes
including 4K page size.
Now it is not fixed. It is an intermediate way.

>
> Certainly in the mean time it should be rejected.
>
> Jason