Re: [PATCH 2/3] mm/vmalloc: occupy newly allocated vmap block just after allocation

From: Joonsoo Kim
Date: Wed Mar 18 2015 - 01:51:42 EST


2015-03-13 21:12 GMT+09:00 Roman Pen <r.peniaev@xxxxxxxxx>:
> Previous implementation allocates new vmap block and repeats search of a free
> block from the very beginning, iterating over the CPU free list.
>
> Why it can be better??
>
> 1. Allocation can happen on one CPU, but search can be done on another CPU.
> In worst case we preallocate amount of vmap blocks which is equal to
> CPU number on the system.
>
> 2. In previous patch I added newly allocated block to the tail of free list
> to avoid soon exhaustion of virtual space and give a chance to occupy
> blocks which were allocated long time ago. Thus to find newly allocated
> block all the search sequence should be repeated, seems it is not efficient.
>
> In this patch newly allocated block is occupied right away, address of virtual
> space is returned to the caller, so there is no any need to repeat the search
> sequence, allocation job is done.
>
> Signed-off-by: Roman Pen <r.peniaev@xxxxxxxxx>
> Cc: Nick Piggin <npiggin@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: WANG Chao <chaowang@xxxxxxxxxx>
> Cc: Fabian Frederick <fabf@xxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Gioh Kim <gioh.kim@xxxxxxx>
> Cc: Rob Jones <rob.jones@xxxxxxxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> ---
> mm/vmalloc.c | 57 ++++++++++++++++++++++++++++++++++++---------------------
> 1 file changed, 36 insertions(+), 21 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index db6bffb..9759c92 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -791,13 +791,30 @@ static unsigned long addr_to_vb_idx(unsigned long addr)
> return addr;
> }
>
> -static struct vmap_block *new_vmap_block(gfp_t gfp_mask)
> +static void *vmap_block_vaddr(unsigned long va_start, unsigned long pages_off)
> +{
> + unsigned long addr = va_start + (pages_off << PAGE_SHIFT);
> + BUG_ON(addr_to_vb_idx(addr) != addr_to_vb_idx(va_start));

Need one blank line between above two lines.
Please run script/checkpatch.pl.

> + return (void *)addr;
> +}
> +
> +/**
> + * new_vmap_block - allocates new vmap_block and occupies 2^order pages in this
> + * block. Of course pages number can't exceed VMAP_BBMAP_BITS
> + * @order: how many 2^order pages should be occupied in newly allocated block
> + * @gfp_mask: flags for the page level allocator
> + * @addr: output virtual address of a newly allocator block
> + *
> + * Returns: address of virtual space in a block or ERR_PTR
> + */
> +static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
> {
> struct vmap_block_queue *vbq;
> struct vmap_block *vb;
> struct vmap_area *va;
> unsigned long vb_idx;
> int node, err;
> + void *vaddr;
>
> node = numa_node_id();
>
> @@ -821,9 +838,12 @@ static struct vmap_block *new_vmap_block(gfp_t gfp_mask)
> return ERR_PTR(err);
> }
>
> + vaddr = vmap_block_vaddr(va->va_start, 0);
> spin_lock_init(&vb->lock);
> vb->va = va;
> - vb->free = VMAP_BBMAP_BITS;
> + /* At least something should be left free */
> + BUG_ON(VMAP_BBMAP_BITS <= (1UL << order));
> + vb->free = VMAP_BBMAP_BITS - (1UL << order);
> vb->dirty = 0;
> bitmap_zero(vb->dirty_map, VMAP_BBMAP_BITS);
> INIT_LIST_HEAD(&vb->free_list);
> @@ -841,7 +861,7 @@ static struct vmap_block *new_vmap_block(gfp_t gfp_mask)
> spin_unlock(&vbq->lock);
> put_cpu_var(vmap_block_queue);
>
> - return vb;
> + return vaddr;
> }
>
> static void free_vmap_block(struct vmap_block *vb)
> @@ -905,7 +925,7 @@ static void *vb_alloc(unsigned long size, gfp_t gfp_mask)
> {
> struct vmap_block_queue *vbq;
> struct vmap_block *vb;
> - unsigned long addr = 0;
> + void *vaddr = NULL;
> unsigned int order;
>
> BUG_ON(size & ~PAGE_MASK);
> @@ -920,43 +940,38 @@ static void *vb_alloc(unsigned long size, gfp_t gfp_mask)
> }
> order = get_order(size);
>
> -again:
> rcu_read_lock();
> vbq = &get_cpu_var(vmap_block_queue);
> list_for_each_entry_rcu(vb, &vbq->free, free_list) {
> - int i;
> + unsigned long pages_nr;

I think that pages_off is better. Is there any reason to use this naming?

Anyway, patch looks okay to me.

Acked-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/