Re: [PATCH v5 1/3] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

From: Frank Rowand
Date: Fri Mar 09 2018 - 20:21:00 EST


On 03/09/18 15:03, Rob Herring wrote:
> On Sun, Mar 04, 2018 at 04:14:47PM -0800, frowand.list@xxxxxxxxx wrote:
>> From: Frank Rowand <frank.rowand@xxxxxxxx>
>>
>> Create a cache of the nodes that contain a phandle property. Use this
>> cache to find the node for a given phandle value instead of scanning
>> the devicetree to find the node. If the phandle value is not found
>> in the cache, of_find_node_by_phandle() will fall back to the tree
>> scan algorithm.
>>
>> The cache is initialized in of_core_init().
>>
>> The cache is freed via a late_initcall_sync() if modules are not
>> enabled.
>>
>> If the devicetree is created by the dtc compiler, with all phandle
>> property values auto generated, then the size required by the cache
>> could be 4 * (1 + number of phandles) bytes. This results in an O(1)
>> node lookup cost for a given phandle value. Due to a concern that the
>> phandle property values might not be consistent with what is generated
>> by the dtc compiler, a mask has been added to the cache lookup algorithm.
>> To maintain the O(1) node lookup cost, the size of the cache has been
>> increased by rounding the number of entries up to the next power of
>> two.
>>
>> The overhead of finding the devicetree node containing a given phandle
>> value has been noted by several people in the recent past, in some cases
>> with a patch to add a hashed index of devicetree nodes, based on the
>> phandle value of the node. One concern with this approach is the extra
>> space added to each node. This patch takes advantage of the phandle
>> property values auto generated by the dtc compiler, which begin with
>> one and monotonically increase by one, resulting in a range of 1..n
>> for n phandle values. This implementation should also provide a good
>> reduction of overhead for any range of phandle values that are mostly
>> in a monotonic range.
>>
>> Performance measurements by Chintan Pandya <cpandya@xxxxxxxxxxxxxx>
>> of several implementations of patches that are similar to this one
>> suggest an expected reduction of boot time by ~400ms for his test
>> system. If the cache size was decreased to 64 entries, the boot
>> time was reduced by ~340 ms. The measurements were on a 4.9.73 kernel
>> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
>> 814 phandle values.
>>
>> Reported-by: Chintan Pandya <cpandya@xxxxxxxxxxxxxx>
>> Signed-off-by: Frank Rowand <frank.rowand@xxxxxxxx>
>> ---
>
> I've applied this one, but not the others.
>
> Rob
>

Thank you. I'll let the other two disappear into a distant memory.

-Frank