Re: [PATCH v2] tcp: splice as many packets as possible at once

From: Nick Piggin
Date: Wed Feb 04 2009 - 04:24:39 EST


On Wednesday 04 February 2009 19:08:51 Evgeniy Polyakov wrote:
> On Tue, Feb 03, 2009 at 04:46:09PM -0800, David Miller (davem@xxxxxxxxxxxxx) wrote:
> > > NTA tried to solve this by not allowing to free the data allocated on
> > > the different CPU, contrary to what SLAB does. Modulo cache coherency
> > > improvements,
> >
> > This could kill performance on NUMA systems if we are not careful.
> >
> > If we ever consider NTA seriously, these issues would need to
> > be performance tested.
>
> Quite contrary I think. Memory is allocated and freed on the same CPU,
> which means on the same memory domain, closest to the CPU in question.
>
> I did not test NUMA though, but NTA performance on the usual CPU (it is
> 2.5 years old already :) was noticeably good.

I had a quick look at NTA... I didn't understand much of it yet, but
the remote freeing scheme is kind of like what I did for slqb. The
freeing CPU queues objects back to the CPU that allocated them, which
eventually checks the queue and frees them itself.
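
To make the scheme concrete, here is a minimal userspace sketch of that
kind of remote-free queue. It is not taken from SLQB or NTA: the names
(struct cpu_cache, obj_alloc, obj_free, drain_remote), the explicit "cpu"
argument and the use of C11 atomics in place of kernel primitives are all
assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 2   /* hypothetical: tiny simulated machine */

struct object {
    struct object *next;   /* link while the object sits on a free list */
    int home_cpu;          /* CPU whose cache handed this object out */
};

struct cpu_cache {
    struct object *local_free;             /* touched only by the owning CPU */
    _Atomic(struct object *) remote_free;  /* other CPUs push objects here */
};

static struct cpu_cache caches[NR_CPUS];

/* Owner side: take the whole remote queue in one atomic exchange and
 * move its objects onto the private free list. Returns the number of
 * objects moved. */
static int drain_remote(int cpu)
{
    struct object *obj = atomic_exchange(&caches[cpu].remote_free, NULL);
    int moved = 0;

    while (obj) {
        struct object *next = obj->next;

        obj->next = caches[cpu].local_free;
        caches[cpu].local_free = obj;
        obj = next;
        moved++;
    }
    return moved;
}

/* Allocate on behalf of "cpu": private list first, then whatever remote
 * CPUs have queued back, then fall back to malloc(). */
static struct object *obj_alloc(int cpu)
{
    struct cpu_cache *c = &caches[cpu];
    struct object *obj;

    if (!c->local_free)
        drain_remote(cpu);

    obj = c->local_free;
    if (obj) {
        c->local_free = obj->next;
    } else {
        obj = malloc(sizeof(*obj));
        if (!obj)
            return NULL;
    }
    obj->home_cpu = cpu;
    obj->next = NULL;
    return obj;
}

/* Free from "cpu". A local free needs no atomics at all; a remote free
 * is a lock-free push onto the home CPU's remote queue. */
static void obj_free(int cpu, struct object *obj)
{
    struct cpu_cache *home = &caches[obj->home_cpu];
    struct object *head;

    if (obj->home_cpu == cpu) {
        obj->next = home->local_free;
        home->local_free = obj;
        return;
    }

    head = atomic_load(&home->remote_free);
    do {
        obj->next = head;   /* head is refreshed when the CAS fails */
    } while (!atomic_compare_exchange_weak(&home->remote_free, &head, obj));
}
```

For example, an object obtained with obj_alloc(0) and released with
obj_free(1, obj) lands on CPU 0's remote queue, and the next obj_alloc(0)
that finds the private list empty drains the queue and hands the same
object out again. Because remote CPUs only push and the owner takes the
whole list at once, the private list itself never needs a lock.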

I don't know how much cache coherency gain you get from this -- in
most slab allocations, I think the object tends to be cache hot on the
CPU that frees it. I'm doing it mainly to try to avoid locking... I guess
that makes for a cache coherency benefit in itself.

If NTA does significantly better than the slab allocator, I would be quite
interested. It might be something that we can learn from and use in
the general slab allocator (or maybe it is something more network specific
that NTA does).
