Re: KASLR causes intermittent boot failures on some systems

From: Dan Williams
Date: Mon Apr 24 2017 - 19:18:57 EST


On Mon, Apr 24, 2017 at 4:07 PM, Baoquan He <bhe@xxxxxxxxxx> wrote:
> On 04/24/17 at 01:52pm, Dan Williams wrote:
[..]
>> When using the memmap= parameter we're using this call by default:
>>
>> 	} else if (pmem_should_map_pages(dev)) {
>> 		addr = devm_memremap_pages(dev, &nsio->res,
>> 				&q->q_usage_counter, NULL);
>> 		pmem->pfn_flags |= PFN_MAP;
>> 	} else
>>
>> ...where we are assuming that the memmap= parameter does not specify a
>> range-size that will exhaust all of system-memory just to hold the
>> struct page array.
>
> Yeah, according to my debugging, it goes as Dan said: is_ram is
> REGION_DISJOINT, and by the time we reach arch_add_memory the
> parameters passed to it are
> "arch_add_memory, align_start:0x10000000000, align_size:0x3000000000",
> so it seems to be going well up to that point.
>
> Hi Dan,
>
> What has always confused me is that in devm_memremap_pages the passed-in
> altmap parameter is NULL, yet it uses devres_alloc_node to allocate a
> page_map, and that page_map contains an altmap instance, not a pointer.
> The addr range is then inserted into pgmap_radix with the page_map as
> its value. So why does to_vmem_altmap() return NULL later in
> __add_pages, according to my debugging code?

We expect altmap to always be NULL in this case. The only time it is
not NULL is when the namespace is configured to allocate the struct
page array from capacity on the namespace itself. I.e., instead of
allocating struct page from page allocator pages, the driver creates an
altmap and vmemmap_populate_hugepages() uses that to allocate the
array from "alternate" capacity.

You can force this by running:

ndctl create-namespace -f -e namespace0.0 -m memory -M dev

...which says "put the struct page map on the device".
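
For a sense of scale, here is a rough sketch of how much memory the
struct page array needs for the range in the trace above
(align_size:0x3000000000). It assumes 4 KiB base pages and a 64-byte
struct page; both are typical for x86_64 but are assumptions here, not
values taken from this thread, and the sketch_* names are hypothetical
helpers, not kernel functions.

```c
#include <stdint.h>

/* Assumptions, not taken from the thread: 4 KiB base pages and a
 * 64-byte struct page, both typical for x86_64. */
#define SKETCH_PAGE_SIZE        4096ULL
#define SKETCH_STRUCT_PAGE_SIZE 64ULL

/* Number of base pages needed to cover range_bytes, rounding up. */
static uint64_t sketch_nr_pages(uint64_t range_bytes)
{
	return (range_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Bytes needed for one struct page per base page in the range. */
static uint64_t sketch_memmap_bytes(uint64_t range_bytes)
{
	return sketch_nr_pages(range_bytes) * SKETCH_STRUCT_PAGE_SIZE;
}
```

Under those assumptions, sketch_memmap_bytes(0x3000000000ULL) comes to
0xC0000000, i.e. 3 GiB of struct page for a 192 GiB range. Without an
altmap that comes out of regular system memory; with "-M dev" it comes
out of the namespace's own capacity.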