Pat,
Whilst looking at the Summit NUMA support I believe I have found a bug
in the memory hole handling. Specifically, there appears to be a type
mismatch between get_zholes_size() returning a single long and
free_area_init_core() requiring a long array. What I cannot adequately
explain is why this does not lead to a panic during boot.
Attached is a patch against 2.5.59-mjb6 which I believe should correct
this. It has been tested in isolation and compile tested, but as I
don't have access to a test machine I cannot be sure it works. I
believe some investigation is needed to understand why this bug does not
prevent booting, or lead to a large disparity in the zone free page
counts; perhaps the e820 map is helping here.
[gory details for the interested]
Under NUMA support, when constructing the memory map, we call
free_area_init_node() to initialise the pglist_data and allocate the
memory map structures. As part of this we supply a per node, per memory
zone page count and a per node, per memory zone missing page count.
These are used in free_area_init_core() to determine the true number of
pages per node, per zone. In the existing summit code we parse the SRAT
in order to locate and size the inter-chunk gaps, on a per node basis.
Later this is queried via get_zholes_size() from zone_init_sizes().
Unfortunately, get_zholes_size() is returning a single long representing
the per node total holes, whilst zone_init_sizes() requires an array of
longs, one per zone (long[MAX_NR_ZONES]). In the zero holes case this
will be safe: if there are zero pages of hole then the value returned by
get_zholes_size() is passed on as an apparently null pointer, which is
interpreted as having no holes. In the presence of any such holes the
count would be passed on as a low-memory reference, potentially leading
to an oops.
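To make the mismatch concrete, here is a minimal user-space sketch. It is not the kernel code: zone_real_pages() and count_would_be_dereferenced() are hypothetical stand-ins, and MAX_NR_ZONES is assumed to be 3. It only shows why a zero count looks like "no holes" while any non-zero count would be followed as a pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NR_ZONES 3  /* assumed zone count for this sketch */

/*
 * Hypothetical stand-in for the consumer: it expects either NULL
 * (no holes) or an array with one hole count per zone.
 */
static unsigned long zone_real_pages(unsigned long size,
                                     const unsigned long *zholes, int zone)
{
    if (zholes)
        size -= zholes[zone];
    return size;
}

/*
 * Hypothetical helper: reinterpret a per node hole total as the pointer
 * the consumer would see, and report whether it would be dereferenced.
 * Nothing is dereferenced here.
 */
static int count_would_be_dereferenced(unsigned long node_hole_pages)
{
    const unsigned long *as_pointer =
        (const unsigned long *)(uintptr_t)node_hole_pages;

    return as_pointer != NULL;
}
```

A count of zero yields a null pointer and is skipped by the consumer; a count of, say, 4096 yields a non-null pointer to a low address which the consumer would then index per zone.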
The attached patch modifies the memory chunk hole scan such that each
hole is allocated to one or more zones using the calculated zone
boundaries, converting zholes_size[] from a per node count to a per node,
per zone count in a similar form to the associated zones[] array.
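The idea of allocating a hole to one or more zones can be sketched roughly as below. This is an illustration and not the attached patch: account_hole(), its parameters and the zone_end[] layout (first page frame past each zone, in ascending order) are assumptions made for the example, as is MAX_NR_ZONES being 3.

```c
#define MAX_NR_ZONES 3  /* assumed zone count for this sketch */

/*
 * Hypothetical helper: add the pages of the hole [hole_start, hole_end)
 * to the per zone counts in zholes[]. Zone z is assumed to cover
 * [zone_end[z - 1], zone_end[z]), with zone 0 starting at node_start.
 * All values are page frame numbers.
 */
static void account_hole(unsigned long hole_start, unsigned long hole_end,
                         unsigned long node_start,
                         const unsigned long zone_end[MAX_NR_ZONES],
                         unsigned long zholes[MAX_NR_ZONES])
{
    unsigned long zone_start = node_start;

    for (int z = 0; z < MAX_NR_ZONES; z++) {
        /* Clamp the hole to this zone's range. */
        unsigned long lo = hole_start > zone_start ? hole_start : zone_start;
        unsigned long hi = hole_end < zone_end[z] ? hole_end : zone_end[z];

        /* Only count the part of the hole that overlaps the zone. */
        if (lo < hi)
            zholes[z] += hi - lo;

        zone_start = zone_end[z];
    }
}
```

A hole which straddles a zone boundary is split between the two zones, so each entry of zholes[] can be subtracted from the matching entry of the zone size array.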
Cheers.
-apw