RE: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order

From: PINTU KUMAR
Date: Tue Oct 07 2014 - 12:07:54 EST


----- Original Message -----
> From: Colin Cross <ccross@xxxxxxxxxxx>
> To: pintu.k@xxxxxxxxxxx
> Cc: Laura Abbott <lauraa@xxxxxxxxxxxxxx>; Heesub Shin <heesub.shin@xxxxxxxxxxx>; "akpm@xxxxxxxxxxxxxxxxxxxx" <akpm@xxxxxxxxxxxxxxxxxxxx>; "gregkh@xxxxxxxxxxxxxxxxxxx" <gregkh@xxxxxxxxxxxxxxxxxxx>; "john.stultz@xxxxxxxxxx" <john.stultz@xxxxxxxxxx>; "rebecca@xxxxxxxxxxx" <rebecca@xxxxxxxxxxx>; "devel@xxxxxxxxxxxxxxxxxxxx" <devel@xxxxxxxxxxxxxxxxxxxx>; "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>; IQBAL SHAREEF <iqbal.ams@xxxxxxxxxxx>; "pintu_agarwal@xxxxxxxxx" <pintu_agarwal@xxxxxxxxx>; Vishnu Pratap Singh <vishnu.ps@xxxxxxxxxxx>; "cpgs@xxxxxxxxxxx" <cpgs@xxxxxxxxxxx>
> Sent: Monday, 6 October 2014 11:01 PM
> Subject: Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order
>
> On Mon, Oct 6, 2014 at 9:26 AM, PINTU KUMAR <pintu.k@xxxxxxxxxxx> wrote:
>
>>
>> Hi,
>> >________________________________
>> > From: Laura Abbott <lauraa@xxxxxxxxxxxxxx>
>> >To: Heesub Shin <heesub.shin@xxxxxxxxxxx>; Pintu Kumar
> <pintu.k@xxxxxxxxxxx>; akpm@xxxxxxxxxxxxxxxxxxxx;
> gregkh@xxxxxxxxxxxxxxxxxxx; john.stultz@xxxxxxxxxx; rebecca@xxxxxxxxxxx;
> ccross@xxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
>> >Cc: iqbal.ams@xxxxxxxxxxx; pintu_agarwal@xxxxxxxxx;
> vishnu.ps@xxxxxxxxxxx
>> >Sent: Monday, 6 October 2014 7:37 PM
>> >Subject: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER
> for high order
>> >
>> >
>> >On 10/6/2014 3:27 AM, Heesub Shin wrote:
>> >
>> >
>> >
>> >
>> >> Hello Kumar,
>> >>
>> >> On 10/06/2014 05:31 PM, Pintu Kumar wrote:
>> >>> The Android ion_system_heap uses allocation fallback mechanism
>> >>> based on 8,4,0 order pages available in the system.
>> >>> It changes gfp flags based on higher order allocation request.
>> >>> This higher order value is hard-coded as 4, instead of using
>> >>> the system defined higher order value.
>> >>> Thus replacing this hard-coded value with
> PAGE_ALLOC_COSTLY_ORDER
>> >>> which is defined as 3.
>> >>> This will help mapping the higher order request in system heap
> with
>> >>> the actual allocation request.
>> >>
>> >> Quite reasonable.
>> >>
>> >> Reviewed-by: Heesub Shin <heesub.shin@xxxxxxxxxxx>
>> >>
>> >> BTW, Anyone knows how the allocation order (8,4 and 0) was
> decided? I
>> >> think only Google guys might know the answer.
>> >>
>> >> regards,
>> >> heesub
>> >>
>> >
>> >My understanding was this was completely unrelated to the costly order
>> >and was related to the page sizes corresponding to IOMMU page sizes
>> >(1MB, 64K, 4K). This won't make a difference for the uncached page
>> >pool case but for the not page pool case, I'm not sure if there
> would
>> >be a benefit for trying to get 32K pages with some effort vs. just
>> >going back to 4K pages.
>>
>> No, it is not just related to IOMMU case. It comes into picture also for
>> normal system-heap allocation (without iommu cases).
>> Also, it is applicable for both uncached and page_pool cases.
>> Please also check the changes under ion_system_heap_create.
>> Here the gfp_flags are set under the pool structure.
>> This value is used in ion_page_pool_alloc_pages.
>> In both the cases, it internally calls alloc_pages, with this gfp_flags.
>> Now, during memory pressure scenario, when alloc_pages moves to slowpath
>> this gfp_flags will be used to decide allocation retry.
>> In the current code, the higher-order flag is set only when order is
> greater than 4.
>> But, in MM, the order 4 is also considered as higher-order request.
>> This higher-order is decided based on PAGE_ALLOC_COSTLY_ORDER (3) value.
>> Hence, I think this value should be in sync with the MM code.
>> >
>> >Do you have any data/metrics that show a benefit from this patch?
>> I think it is not related to any data or metrics.
>> It is about replacing the hard-coded higher-order check to be in sync with
>> the MM code.
>>
>
> The selection of the orders used for allocation (8, then 4, then 0) is
> designed to match with the sizes often found in IOMMUs, but this isn't
> changing the order of the allocation, it is changing the GFP flags
> used for the order 4 allocation. Right now we are using the
> low_order_gfp_flags for order 4, this patch would change it to use
> high_order_gfp_flags. We originally used low_order_gfp_flags here
> because the MM subsystem can usually satisfy these allocations, and
> the additional load placed on the MM subsystem to kick off kswapd to
> free up more order 4 chunks is generally worth it. Using order 4
> pages instead of order 0 pages can significantly improve the
> performance of many IOMMUs by reducing TLB pressure and time spent
> updating page tables. Unless you have data showing that this improves
> something, and doesn't just cause all allocations to be order 0 when
> under memory pressure, I don't suggest merging this.
>

Ok agree. It is worth retrying the allocation with order-4 pages.
But, since 4 is considered higher order for MM and is greater than PAGE_ALLOC_COSTLY_ORDER.
I guess the retrying will not happen, because of the following check in page_alloc:
if (order > PAGE_ALLOC_COSTLY_ORDER)
goto nopage;



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/