Re: [PATCH] irqchip: gicv3-its: Use NUMA aware memory allocation for ITS tables

From: Marc Zyngier
Date: Mon Jul 10 2017 - 05:06:21 EST


On 10/07/17 09:48, Ganapatrao Kulkarni wrote:
> Hi Marc,
>
> On Mon, Jul 3, 2017 at 8:23 PM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>> Hi Shanker,
>>
>> On 03/07/17 15:24, Shanker Donthineni wrote:
>>> Hi Marc,
>>>
>>> On 06/30/2017 03:51 AM, Marc Zyngier wrote:
>>>> On 30/06/17 04:01, Ganapatrao Kulkarni wrote:
>>>>> On Fri, Jun 30, 2017 at 8:04 AM, Ganapatrao Kulkarni
>>>>> <gpkulkarni@xxxxxxxxx> wrote:
>>>>>> Hi Shanker,
>>>>>>
>>>>>> On Sun, Jun 25, 2017 at 9:16 PM, Shanker Donthineni
>>>>>> <shankerd@xxxxxxxxxxxxxx> wrote:
>>>>>>> The NUMA node information is visible to ITS driver but not being used
>>>>>>> other than handling errata. This patch allocates the memory for ITS
>>>>>>> tables from the corresponding NUMA node using the appropriate NUMA
>>>>>>> aware functions.
>>>>>
>>>>> IMHO, the description could have been more constructive:
>>>>>
>>>>> "All ITS tables are mapped by default to NODE 0 memory.
>>>>> Add changes to allocate memory from the respective NUMA nodes of ITS devices.
>>>>> This will optimize table access and avoid unnecessary inter-node traffic."
>>>>
>>>> But more importantly, I'd like to see figures showing the actual benefit
>>>> of this per-node allocation. Given that both of you guys have access to
>>>> such platforms, please show me the numbers!
>>>>
>>>
>>> I'll share the actual results showing the improvement whenever they are
>>> available on our next chips. The current version of Qualcomm qdf2400
>>> doesn't support multi-socket configurations, so I can't capture results
>>> to share with you.
>>>
>>> Do you see any other issues with this patch apart from the performance
>>> improvements? I strongly believe this brings a noticeable improvement
>>> in numbers on systems that have a multi-node memory/CPU configuration.
>>
>> I agree that it *could* show an improvement, but it very much depends on
>> how often the ITS misses in its caches. For this kind of patch, I want
>> to see two things:
>>
>> 1) It brings a measurable benefit on NUMA platforms
>
> Did some measurement of interrupt response time for LPIs and we don't
> see any major improvement due to caching of Tables. However, we have
> seen improvements of around 5%.

An improvement of what exactly?

M.
--
Jazz is not dead. It just smells funny...