Re: [PATCH 1/2] mm/dmapool: replace linked list with xarray

From: Matthew Wilcox
Date: Thu Apr 28 2022 - 18:00:00 EST


On Thu, Apr 28, 2022 at 02:27:13PM -0600, kbusch@xxxxxxxxxx wrote:
> Store the cached dma pool pages in an xarray instead of a linked list.
> This allows constant lookup time to free the page, lockless iteration
> while displaying the attributes, and frees up space in 'struct dma_page'.

Hey Keith, this looks great, especially since there's more performance
you can squeeze out of it.

> struct dma_pool { /* the pool */
> - struct list_head page_list;
> + struct xarray pages;
> spinlock_t lock;

A further optimisation you could make is to use the xa_lock to protect
the rest of the data structure. But that would be a subsequent patch.

> @@ -282,7 +281,8 @@ void dma_pool_destroy(struct dma_pool *pool)
> device_remove_file(pool->dev, &dev_attr_pools);
> mutex_unlock(&pools_reg_lock);
>
> - list_for_each_entry_safe(page, tmp, &pool->page_list, page_list) {
> + xa_for_each(&pool->pages, i, page) {
> + xa_erase(&pool->pages, i);
> if (is_page_busy(page)) {
> if (pool->dev)
> dev_err(pool->dev, "%s %s, %p busy\n", __func__,
> @@ -291,12 +291,12 @@ void dma_pool_destroy(struct dma_pool *pool)
> pr_err("%s %s, %p busy\n", __func__,
> pool->name, page->vaddr);
> /* leak the still-in-use consistent memory */
> - list_del(&page->page_list);
> kfree(page);
> } else
> pool_free_page(pool, page);
> }
>
> + xa_destroy(&pool->pages);

If you're erasing the entries as you go, you don't need to call
xa_destroy(). Contrariwise, if you call xa_destroy(), you don't need to
call xa_erase(). I'd probably just call xa_destroy() at the end as it's
less work.

> @@ -316,13 +316,14 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
> {
> unsigned long flags;
> struct dma_page *page;
> + unsigned long i;
> size_t offset;
> void *retval;
>
> might_alloc(mem_flags);
>
> spin_lock_irqsave(&pool->lock, flags);
> - list_for_each_entry(page, &pool->page_list, page_list) {
> + xa_for_each(&pool->pages, i, page) {
> if (page->offset < pool->allocation)
> goto ready;
> }

A further optimisation you could do is use xarray search marks to
search for only pages which have free entries.

> + xa_store(&pool->pages, page->vaddr, page, mem_flags);

Oof. The XArray isn't good at such sparse allocations. You can improve
it (by a significant amount) by shifting the vaddr by PAGE_SHIFT bits.
Should save you two nodes of tree height and thus two cache lines per
lookup.