Re: [RFC PATCH V5] mm readahead: Fix readahead fail for no local memory and limit readahead pages

From: Nishanth Aravamudan
Date: Fri Feb 14 2014 - 00:47:43 EST


On 13.02.2014 [14:41:04 -0800], David Rientjes wrote:
> On Thu, 13 Feb 2014, Raghavendra K T wrote:
>
> > Thanks David, unfortunately even after applying that patch, I do not see
> > the improvement.
> >
> > Interestingly, numa_mem_id() still seems to return the value of a
> > memoryless node.
> > Maybe the per-cpu _numa_mem_ values are not set properly. I need to dig
> > into this further.
> >
>
> I believe ppc will be relying on __build_all_zonelists() to set
> numa_mem_id() to be the proper node, and that relies on the ordering of
> the zonelist built for the memoryless node. It would be very strange if
> local_memory_node() is returning a memoryless node because it is the first
> zone for node_zonelist(GFP_KERNEL) (why would a memoryless node be on the
> zonelist at all?).
>
> I think the real problem is that build_all_zonelists() is only called at
> init when the boot cpu is online so it's only setting numa_mem_id()
> properly for the boot cpu. Does it return a node with memory if you
> toggle /proc/sys/vm/numa_zonelist_order? Do
>
> echo node > /proc/sys/vm/numa_zonelist_order
> echo zone > /proc/sys/vm/numa_zonelist_order
> echo default > /proc/sys/vm/numa_zonelist_order
>
> and check if it returns the proper value at either point. This will force
> build_all_zonelists() and numa_mem_id() to point to the proper node since
> all cpus are now online.

Yep, after massaging the code to allow CONFIG_USE_PERCPU_NUMA_NODE_ID,
you're right that the memory node is wrong: it is only right for the
boot cpu. The cpu node looks right (they are all on node 0), but that
could just be luck. I did notice that some CPUs now think the cpu node
is 1, which is wrong.
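
To make the symptom concrete, here is a small userspace model of it. This
is only a sketch, not kernel code: every model_* name, the two-node
topology (node 0 memoryless, node 1 with memory), and the default cached
value are assumptions made up for illustration. It only mimics the
behaviour described above, where the rebuild refreshes the cached memory
node for whichever cpus are online at that moment.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS  4
#define MODEL_NR_NODES 2

/* Assumed topology: node 0 has the cpus but no memory, node 1 has memory. */
static const bool model_node_has_memory[MODEL_NR_NODES] = { false, true };
static const int  model_cpu_node[MODEL_NR_CPUS] = { 0, 0, 0, 0 };

static bool model_cpu_online[MODEL_NR_CPUS];
static int  model_numa_mem[MODEL_NR_CPUS]; /* stands in for per-cpu _numa_mem_ */

/* Stand-in for local_memory_node(): first node with memory, starting at 'node'. */
static int model_local_memory_node(int node)
{
    assert(node >= 0 && node < MODEL_NR_NODES);
    for (int i = 0; i < MODEL_NR_NODES; i++) {
        int candidate = (node + i) % MODEL_NR_NODES;
        if (model_node_has_memory[candidate])
            return candidate;
    }
    return node; /* no memory anywhere: keep the cpu's own node */
}

/* Stand-in for the zonelist rebuild: only cpus online right now are updated. */
static void model_rebuild(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (model_cpu_online[cpu])
            model_numa_mem[cpu] = model_local_memory_node(model_cpu_node[cpu]);
}

/* Reset: all cpus offline, cached memory node defaults to the cpu's node. */
static void model_reset(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        model_cpu_online[cpu] = false;
        model_numa_mem[cpu] = model_cpu_node[cpu];
    }
}
```

Calling model_reset(), marking only cpu 0 online, and then calling
model_rebuild() leaves cpu 0 pointing at node 1 while the other cpus
still cache the memoryless node 0. Marking every cpu online and calling
model_rebuild() again corrects all of them, which is the effect the
numa_zonelist_order toggle is expected to have.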

> So the prerequisite for CONFIG_HAVE_MEMORYLESS_NODES is that there is an
> arch-specific set_numa_mem() that makes this mapping correct like ia64
> does. If that's the case, then it's (1) completely undocumented and (2)
> Nishanth's patch is incomplete because anything that adds
> CONFIG_HAVE_MEMORYLESS_NODES needs to do the proper set_numa_mem() for it
> to be any different than numa_node_id().

I'll work on getting the set_numa_mem() and set_numa_node() correct for
powerpc.
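
As a rough illustration of what "getting both correct" could mean, here
is a userspace sketch. It is not the eventual powerpc change, and it
does not call the real set_numa_node()/set_numa_mem(): the sketch_*
names, the firmware cpu-to-node table, the page counts, and the distance
table are all hypothetical. The idea shown is only that each cpu records
its own node and, separately, the nearest node that actually has memory,
at the time the cpu is brought up.

```c
/* Userspace sketch only; all sketch_* names and tables are hypothetical. */
#include <assert.h>

#define SKETCH_NR_CPUS  4
#define SKETCH_NR_NODES 3
#define SKETCH_NO_NODE  (-1)

/* Assumed firmware view: which node each cpu sits on, and pages per node. */
static const int sketch_fw_cpu_node[SKETCH_NR_CPUS] = { 0, 0, 2, 2 };
static const unsigned long sketch_node_pages[SKETCH_NR_NODES] = { 0, 4096, 0 };

/* Assumed node distance table: smaller means closer. */
static const int sketch_distance[SKETCH_NR_NODES][SKETCH_NR_NODES] = {
    { 10, 20, 40 },
    { 20, 10, 20 },
    { 40, 20, 10 },
};

static int sketch_numa_node[SKETCH_NR_CPUS]; /* models the per-cpu node id */
static int sketch_numa_mem[SKETCH_NR_CPUS];  /* models the per-cpu memory node */

/* Return the closest node to 'node' that has memory, or SKETCH_NO_NODE. */
static int sketch_nearest_memory_node(int node)
{
    int best = SKETCH_NO_NODE;

    assert(node >= 0 && node < SKETCH_NR_NODES);
    for (int n = 0; n < SKETCH_NR_NODES; n++) {
        if (sketch_node_pages[n] == 0)
            continue; /* memoryless node: never a valid memory node */
        if (best == SKETCH_NO_NODE ||
            sketch_distance[node][n] < sketch_distance[node][best])
            best = n;
    }
    return best;
}

/* Meant to run once per cpu as it comes online, not only for the boot cpu. */
static void sketch_cpu_bringup(int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

    int node = sketch_fw_cpu_node[cpu];
    int mem = sketch_nearest_memory_node(node);

    sketch_numa_node[cpu] = node;
    /* If no node has memory at all, fall back to the cpu's own node. */
    sketch_numa_mem[cpu] = (mem == SKETCH_NO_NODE) ? node : mem;
}
```

With these tables, sketch_cpu_bringup(0) records node 0 as the cpu node
and node 1 as the memory node, and sketch_cpu_bringup(2) records node 2
and node 1 respectively. Keeping the two values separate is the point:
the cpu node stays truthful even when the node is memoryless.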

Thanks,
Nish
