Re: [RFC][PATCH 1/7][memcg] virtually indexed array library.

From: Andrew Morton
Date: Wed Jul 28 2010 - 15:45:48 EST


On Tue, 27 Jul 2010 16:53:03 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>
> This virt-array allocates a virtually contiguous array via get_vm_area()
> and allows objects to be allocated per array element.
> Physical pages are used only for the elements that are in use.
>
> - First, the user has to create an array with create_virt_array().
> - Before using an element, virt_array_alloc_index(index) should be called.
> - When freeing an element, virt_array_free_index(index) should be called.
> - To destroy the array, destroy_virt_array() should be called.
>
> Item used/unused status is tracked by a bitmap, and back-end physical
> pages are automatically allocated/freed. This is useful when you
> want cheap access to objects by index. For example,
>
> create_virt_array(va);
> struct your_struct *objmap = va->address;
> Then, you can access your objects by objmap[i].
>
> In the usual case, holding a reference by index rather than by pointer can save memory,
> but the index -> object lookup cost is not negligible. In such cases,
> this virt-array may be helpful. Of course, if lookup performance is not important,
> using a radix-tree will be better (from a TLB point of view). This virt-array
> may consume too much of the VMALLOC area, and the alloc/free routines are very slow.
>
> Changelog:
> - fixed bugs in bitmap ops.
> - add offset for find_free_index.
>
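
For anyone skimming, the scheme described above can be modelled roughly
in userspace. This is an analogy only, not the patch's code: mmap() and
mprotect() stand in for get_vm_area() and page mapping, every name is
made up, and one element is assumed to occupy one page to keep it short.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical userspace model of a sparsely backed, index-addressed array. */
struct uvarray {
	char *base;		/* start of the reserved address range */
	size_t nr_pages;	/* one element per page in this model */
	size_t page_size;
};

/* Reserve address space only; nothing is accessible or backed yet. */
int uvarray_create(struct uvarray *va, size_t nr_pages)
{
	long ps = sysconf(_SC_PAGESIZE);
	void *p;

	if (ps <= 0 || nr_pages == 0)
		return -EINVAL;
	p = mmap(NULL, nr_pages * (size_t)ps, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (p == MAP_FAILED)
		return -ENOMEM;
	va->base = p;
	va->nr_pages = nr_pages;
	va->page_size = (size_t)ps;
	return 0;
}

/* Make element idx usable; memory is committed on first touch. */
int uvarray_alloc_index(struct uvarray *va, size_t idx)
{
	if (idx >= va->nr_pages)
		return -EINVAL;
	if (mprotect(va->base + idx * va->page_size, va->page_size,
		     PROT_READ | PROT_WRITE))
		return -errno;
	return 0;
}

/* Give the backing page back and make the slot inaccessible again. */
int uvarray_free_index(struct uvarray *va, size_t idx)
{
	char *p;

	if (idx >= va->nr_pages)
		return -EINVAL;
	p = va->base + idx * va->page_size;
	if (madvise(p, va->page_size, MADV_DONTNEED))
		return -errno;
	if (mprotect(p, va->page_size, PROT_NONE))
		return -errno;
	return 0;
}

/* Release the whole reserved range. */
void uvarray_destroy(struct uvarray *va)
{
	munmap(va->base, va->nr_pages * va->page_size);
	va->base = NULL;
}
```

The lookup is then plain pointer arithmetic on va->base, which is the
attraction, and the address-space reservation is the cost.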

My gut reaction to this sort of thing is "run away in terror". It
encourages kernel developers to operate like lackadaisical userspace
developers and to assume that underlying code can perform heroic and
immortal feats. But it can't. This is the kernel and the kernel is a
tough and hostile place and callers should be careful and defensive and
take great efforts to minimise the strain they put upon other systems.

IOW, can we avoid doing this?

>
> ...
>
> +void free_varray_item(struct virt_array *v, int idx)
> +{
> + mutex_lock(&v->mutex);
> + __free_unmap_entry(v, idx);
> + mutex_unlock(&v->mutex);
> +}

It's generally a bad idea for library code to perform its own locking.
In this case we've just made this whole facility inaccessible to code
which runs from interrupt or atomic contexts.
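
Something along these lines is what's meant, as an untested userspace
model rather than a proposal for the actual code. All names are
hypothetical. The library part takes no lock and just documents that the
caller must serialise; the caller (here one which cannot sleep, so it
spins) chooses the lock type.

```c
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>

#define VA_MODEL_NR		128U	/* hypothetical capacity */
#define VA_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define VA_LONGS	((VA_MODEL_NR + VA_BITS_PER_LONG - 1) / VA_BITS_PER_LONG)

/* Library side (hypothetical): note there is no lock in here. */
struct va_model {
	unsigned long used[VA_LONGS];
};

/* Caller must serialise all access to @v. */
int va_model_alloc_index(struct va_model *v, unsigned int idx)
{
	unsigned long bit;

	if (idx >= VA_MODEL_NR)
		return -EINVAL;
	bit = 1UL << (idx % VA_BITS_PER_LONG);
	if (v->used[idx / VA_BITS_PER_LONG] & bit)
		return -EBUSY;
	v->used[idx / VA_BITS_PER_LONG] |= bit;
	return 0;
}

/* Caller must serialise all access to @v. */
int va_model_free_index(struct va_model *v, unsigned int idx)
{
	unsigned long bit;

	if (idx >= VA_MODEL_NR)
		return -EINVAL;
	bit = 1UL << (idx % VA_BITS_PER_LONG);
	if (!(v->used[idx / VA_BITS_PER_LONG] & bit))
		return -ENOENT;
	v->used[idx / VA_BITS_PER_LONG] &= ~bit;
	return 0;
}

/* Caller side (hypothetical): this user picks a spinning lock. */
struct my_table {
	atomic_flag lock;
	struct va_model ids;
};

int my_table_get(struct my_table *t, unsigned int idx)
{
	int ret;

	while (atomic_flag_test_and_set_explicit(&t->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
	ret = va_model_alloc_index(&t->ids, idx);
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
	return ret;
}
```

A different caller could wrap the same two functions in a sleeping lock,
or in none at all if it is already single-threaded.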

> + pg[0] = alloc_page(GFP_KERNEL);

And hard-wiring GFP_KERNEL makes this facility inaccessible to GFP_NOIO
and GFP_NOFS contexts as well.
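
The usual way out is to have the caller pass the allocation mask down.
A small userspace model of that, with the caveat that gfp_model_t, the
GFPM_* bits and the stub allocator are all invented for illustration and
only loosely mirror the kernel's gfp_t:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gfp_t and a few of its masks. */
typedef unsigned int gfp_model_t;
#define GFPM_WAIT	0x1u	/* allocation may sleep */
#define GFPM_IO		0x2u	/* allocation may start I/O */
#define GFPM_FS		0x4u	/* allocation may call into the filesystem */
#define GFPM_KERNEL	(GFPM_WAIT | GFPM_IO | GFPM_FS)
#define GFPM_NOFS	(GFPM_WAIT | GFPM_IO)
#define GFPM_NOIO	(GFPM_WAIT)

#define VA_GFP_MODEL_PAGES	8U	/* hypothetical capacity */

/* Records the mask the stub allocator was last called with. */
gfp_model_t last_mask;

/* Stub page allocator: a real one would honour the mask. */
void *page_alloc_stub(gfp_model_t gfp)
{
	last_mask = gfp;
	return calloc(1, 4096);
}

struct va_gfp_model {
	void *pages[VA_GFP_MODEL_PAGES];
};

/*
 * Back page @pgidx if it is not backed yet. The mask comes from the
 * caller, which is the only party that knows its own context.
 */
int va_gfp_model_populate(struct va_gfp_model *v, unsigned int pgidx,
			  gfp_model_t gfp)
{
	if (pgidx >= VA_GFP_MODEL_PAGES)
		return -EINVAL;
	if (v->pages[pgidx])
		return 0;
	v->pages[pgidx] = page_alloc_stub(gfp);
	return v->pages[pgidx] ? 0 : -ENOMEM;
}
```

A filesystem caller would then pass its NOFS-style mask and the library
never has to guess.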
