Re: [GIT PATCH] x86,percpu: fix pageattr handling with remap allocator

From: Tejun Heo
Date: Sat May 16 2009 - 21:24:48 EST


Hello, Suresh.

Suresh Siddha wrote:
> On Sat, 2009-05-16 at 08:16 -0700, Tejun Heo wrote:
>> Hello, Suresh.
>>
>> Suresh Siddha wrote:
>>> Tejun, Can you please educate me why we need to map this first
>>> percpu chunk (which is pre-allocated during boot and is physically
>>> contiguous) into vmalloc area?
>> To make areas for each cpu congruent such that the address offset of a
>> percpu symbol for CPU N is always the same from the address for CPU 0.
>
> But for the first percpu chunk, isn't it the case that the physical
> address allocations for a particular cpu is contiguous (as you are using
> one bootmem allocation for whole PMD_SIZE for any given cpu)? So both
> the kernel direct mapping as well as the vmalloc mappings are contiguous
> for the first chunk, on any given cpu. Right?

Hmmm... okay. Percpu areas are composed of multiple chunks. A single
chunk is composed of multiple units, one unit for each CPU. Units in
a single chunk should be contiguous and of the same size such that
unit_addr_for_cpu_N == chunk_addr + N * unit_size whereas chunks don't
need to have any special address relation to other chunks. Combined,
this means percpu addresses for CPU N are always offset by N *
unit_size from the percpu addresses for CPU 0, and that offset can be
applied efficiently using some extra resource in the processor (segment
register on x86 for example).
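
To make the address relation concrete, here is a rough userspace sketch. The names (demo_chunk, demo_unit_addr, demo_per_cpu_addr) are made up for illustration and are not the kernel's actual structures or helpers; it only assumes what is stated above, that every unit has the same size and units within a chunk are laid out back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one chunk; NOT the kernel's real layout. */
struct demo_chunk {
	uintptr_t base;       /* address of the unit for CPU 0 */
	size_t unit_size;     /* same for every unit in every chunk */
	unsigned int nr_cpus; /* number of units in the chunk */
};

/* unit_addr_for_cpu_N == chunk_addr + N * unit_size */
static uintptr_t demo_unit_addr(const struct demo_chunk *c, unsigned int cpu)
{
	assert(cpu < c->nr_cpus);
	return c->base + (uintptr_t)cpu * c->unit_size;
}

/*
 * Translate the CPU 0 address of a percpu variable into the address of
 * the same variable for @cpu.  No chunk lookup is needed: the offset is
 * cpu * unit_size regardless of which chunk the variable lives in.
 */
static uintptr_t demo_per_cpu_addr(uintptr_t cpu0_addr, size_t unit_size,
				   unsigned int cpu)
{
	return cpu0_addr + (uintptr_t)cpu * unit_size;
}
```

Note that demo_per_cpu_addr() never looks at the chunk. That is the point of keeping units congruent: two chunks can sit at unrelated base addresses and the same per-CPU offset still works for both.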

For the remap first chunk allocator, each unit for each CPU is
allocated separately using the bootmem allocator. Each unit is
physically contiguous on its own but the units still need to be
assembled into a single contiguous area to be used as the first chunk,
which is where the remapping comes in. So, the requirement is that
units in the same chunk be contiguous, while NUMA allocation spreads
the units according to the NUMA configuration, so they need to be put
together by remapping them.
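
As a toy model of the above (userspace only, with hypothetical names like demo_remap; this is not how the kernel's page tables or the actual allocator code look), the remap can be thought of as a small table that places scattered per-CPU physical units at consecutive virtual slots:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CPUS 8

/* Hypothetical remap table: one separately allocated unit per CPU. */
struct demo_remap {
	uintptr_t virt_base;            /* start of the contiguous virtual area */
	size_t unit_size;               /* size of each unit */
	unsigned int nr_cpus;           /* number of units, <= DEMO_MAX_CPUS */
	uintptr_t phys[DEMO_MAX_CPUS];  /* where each unit really lives */
};

/*
 * Returns 1 if the physical units already happen to satisfy
 * phys[N] == phys[0] + N * unit_size, 0 otherwise.  With NUMA-local
 * allocation this generally does not hold.
 */
static int demo_phys_is_congruent(const struct demo_remap *r)
{
	for (unsigned int cpu = 1; cpu < r->nr_cpus; cpu++)
		if (r->phys[cpu] != r->phys[0] + (uintptr_t)cpu * r->unit_size)
			return 0;
	return 1;
}

/*
 * Resolve a virtual address inside the remapped area to the physical
 * address backing it.  Returns 0 if @virt is outside the area.
 */
static uintptr_t demo_virt_to_phys(const struct demo_remap *r, uintptr_t virt)
{
	uintptr_t off;

	if (virt < r->virt_base)
		return 0;
	off = virt - r->virt_base;
	if (off / r->unit_size >= r->nr_cpus)
		return 0;
	return r->phys[off / r->unit_size] + off % r->unit_size;
}
```

On the virtual side unit N always starts at virt_base + N * unit_size, no matter where the bootmem allocator placed the backing memory.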

>>> Perhaps even for the other dynamically allocated secondary chunks?
>>> (as far as I can see, all the chunk allocations seems to be
>>> physically contiguous and later mapped into vmalloc area)..
>>>
>>> That should simplify these things quite a bit (at least for first
>>> percpu chunk). I am missing something obvious I guess.
>> Hmm... Sorry I don't really follow. Can you please elaborate the
>> question?
>
> For the first percpu chunk, we can use the kernel direct mapping and
> avoid the vmalloc mapping of PMD_SIZE. And avoid the vmap address
> aliasing problem (wrt to free pages that we have given back to -mm) that
> we are trying to avoid with this patchset (as the existing cpa code
> already takes care of the kernel direct mappings).

Hmmm.... If you can show me how to use the linear mapping directly,
I'll be happy as a clam.

Thanks.

--
tejun