Re: [PATCH 17/21] Percpu tag allocator

From: Oleg Nesterov
Date: Wed May 15 2013 - 11:44:50 EST


On 05/15, Kent Overstreet wrote:
>
> On Tue, May 14, 2013 at 03:48:59PM +0200, Oleg Nesterov wrote:
> > tag_free() does
> >
> > list_del_init(wait->list);
> > /* WINDOW */
> > wake_up_process(wait->task);
> >
> > in theory the caller of tag_alloc() can notice list_empty_careful(),
> > return without taking pool->lock, exit, and free this task_struct.
> >
> > But the main problem is that it is not clear why this code reimplements
> > add_wait_queue/wake_up_all, for what?
>
> To save on locking... there's really no point in another lock for the
> wait queue. Could just use the wait queue lock instead I suppose, like
> wait_event_interruptible_locked()

Yes. Or perhaps you can reuse wait_queue_head_t->lock for move_tags().

And,

> (the extra spin_lock()/unlock() might not really cost anything but
> nested irqsave()/restore() is ridiculously expensive, IME).

But this is the slow path anyway. Even if you do not use _locked, how
much this extra locking (save/restore) can make the things worse?

In any case, I believe it would be much better to reuse the code we
already have, to avoid the races and make the code more understandable.
And to not bloat the code.

Do you really think that, say,

unsigned tag_alloc(struct tag_pool *pool, bool wait)
{
struct tag_cpu_freelist *tags;
unsigned ret = 0;
retry:
tags = get_cpu_ptr(pool->tag_cpu);
local_irq_disable();
if (!tags->nr_free && pool->nr_free) {
spin_lock(&pool->wq.lock);
if (pool->nr_free)
move_tags(...);
spin_unlock(&pool->wq.lock);
}

if (tags->nr_free)
ret = tags->free[--tags->nr_free];
local_irq_enable();
put_cpu_var(pool->tag_cpu);

if (ret || !wait)
return ret;

__wait_event(&pool->wq, pool->nr_free);
goto retry;
}

will be much slower?

> > I must admit, I do not understand what this code actually does ;)
> > I didn't try to read it carefully though, but perhaps at least the
> > changelog could explain more?
>
> The changelog is admittedly terse, but that's basically all there is to
> it -
> [...snip...]

Yes, thanks for your explanation, I already realized what it does...

Question. tag_free() does move_tags+wakeup if nr_free = pool->watermark * 2.
Perhaps it should should also take waitqueue_active() into account ?
tag_alloc() can sleep more than necessary, it seems.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/