Re: [RFC] NUMA : could we introduce virt_to_nid() ?

From: Eric Dumazet
Date: Fri Mar 23 2007 - 12:22:30 EST


On Fri, 23 Mar 2007 07:50:28 -0700 (PDT)
Christoph Lameter <christoph@xxxxxxxxxxx> wrote:

> On Fri, 23 Mar 2007, Eric Dumazet wrote:
>
> > Checking Christoph's quicklist implementation, I found the same cache miss in free() that SLAB has.
> >
> > /* common implementation */
> > int virt_to_nid(const void *addr)
> > {
> > 	struct page *page = virt_to_page(addr);
> > 	return page_to_nid(page);
> > }
> >
> > On some platforms (x86_64 for example), could we have a better
> > implementation, not accessing struct page, but using phys_to_nid() ?
>
> This is going to pollute the caches since we then use multiple ways to
> determine the node of a page. It's better to stay with the same approach
> for all pages. The page struct is used for many other purposes as well.
> It's likely to be in the cpu cache.
>

Sorry? Page structs are not in the cpu cache at all.

phys_to_nid() on my 16 GB x86_64 machine uses one single cache line, shared by all pages.
That single cache line is cache hot, yes; the 'struct page's are not.

And on this machine, that's about 224 Mbytes of 'struct page's.
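
For reference, the 224 Mbytes figure can be reproduced with a small calculation. This is only a sketch: the 4 KB page size and the 56-byte 'struct page' size are assumptions that are not stated in this message, and the names below are hypothetical.

```c
#include <stdint.h>

/* Assumptions (not stated in the message): 4 KB pages, 56-byte struct page. */
#define ASSUMED_PAGE_SIZE        4096ULL
#define ASSUMED_STRUCT_PAGE_SIZE 56ULL

/* Bytes of struct page metadata needed to describe ram_bytes of memory. */
static uint64_t struct_page_footprint(uint64_t ram_bytes)
{
	uint64_t nr_pages = ram_bytes / ASSUMED_PAGE_SIZE;

	return nr_pages * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under those assumptions, struct_page_footprint(16ULL << 30) gives 224 << 20 bytes, far more than any cpu cache can keep hot.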

You carefully commented your alloc() function, saying it touches two cache lines.
But you omitted to say that the free() function needs 3 cache lines if CONFIG_NUMA is set.

For SLAB use, the page struct is needed because we use lru.{next|prev} to store slab/cachep pointers. But for a pure page allocator, unless I misread your patch, we don't need it if virt_to_nid() can do its job without it.
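
To illustrate the idea, here is a minimal userspace model of a virt_to_nid() that never touches 'struct page'. It is not the kernel's phys_to_nid(): the table contents, the shift, the direct-map offset and all model_* names are hypothetical, chosen only so that the whole lookup table fits in one cache line.

```c
#include <stdint.h>

/* Hypothetical model values, for illustration only. */
#define MODEL_NODE_SHIFT  32	/* each table entry covers 4 GB */
#define MODEL_MAP_SIZE    4	/* 4 entries cover 16 GB of physical memory */
#define MODEL_PAGE_OFFSET 0xffff810000000000ULL	/* assumed direct-map base */

/* One byte per 4 GB region: the whole table fits in a single cache line. */
static const uint8_t model_node_map[MODEL_MAP_SIZE] = { 0, 0, 1, 1 };

/* Physical address to node id with one small table lookup; -1 if unmapped. */
static int model_phys_to_nid(uint64_t phys)
{
	uint64_t idx = phys >> MODEL_NODE_SHIFT;

	if (idx >= MODEL_MAP_SIZE)
		return -1;
	return model_node_map[idx];
}

/* Direct-mapped virtual address to node id, without reading any struct page. */
static int model_virt_to_nid(uint64_t vaddr)
{
	return model_phys_to_nid(vaddr - MODEL_PAGE_OFFSET);
}
```

Every call reads the same few bytes of model_node_map, whatever page the address belongs to, which is why that line stays cache hot while the per-page structures do not.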
