Re: [PATCH -next] arch_topology: Fix cache attributes detection in the CPU hotplug path

From: Conor.Dooley
Date: Thu Jul 14 2022 - 11:27:18 EST


On 14/07/2022 16:01, Sudeep Holla wrote:
> On Thu, Jul 14, 2022 at 02:17:33PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
>> On 13/07/2022 14:33, Sudeep Holla wrote:
>>
>> Hey Sudeep,
>> I could not get this patch to actually apply, tried a couple
>> different versions of -next :/
>>
>
> That's strange.
>
>> It is in -next already though, which I suspect might be part of why
>> it does not apply..
>
> Ah that could be the case.
>
>> Surely you can fast forward your arch_topology
>> for-next branch to Greg's merge commit rather than generating this
>> from the premerge branch & re-merging into your branch that Stephen
>> picks up?
>>
>
> Greg has merged my branch and all those commits are untouched, so it shouldn't
> cause any issue as the hash remains the same in both trees. I just added
> this one patch on top. Did you see any issues with the merge, or are you
> just speculating based on your understanding?

Speculating based on it being a "could not construct ancestor" error.


>>
>> Actually, we are now worse off than before:
>> [ 0.009813] smp: Bringing up secondary CPUs ...
>> [ 0.011530] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
>> [ 0.011550] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
>> [ 0.011566] preempt_count: 1, expected: 0
>> [ 0.011580] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19.0-rc6-next-20220714-dirty #1
>> [ 0.011599] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
>> [ 0.011608] Call Trace:
>> [ 0.011620] [<ffffffff80005070>] dump_backtrace+0x1c/0x24
>> [ 0.011661] [<ffffffff8066b0c4>] show_stack+0x2c/0x38
>> [ 0.011699] [<ffffffff806704a2>] dump_stack_lvl+0x40/0x58
>> [ 0.011725] [<ffffffff806704ce>] dump_stack+0x14/0x1c
>> [ 0.011745] [<ffffffff8002f42a>] __might_resched+0x100/0x10a
>> [ 0.011772] [<ffffffff8002f472>] __might_sleep+0x3e/0x66
>> [ 0.011793] [<ffffffff8014d774>] __kmalloc+0xd6/0x224
>> [ 0.011825] [<ffffffff803d631c>] detect_cache_attributes+0x37a/0x448
>> [ 0.011855] [<ffffffff803e8fbe>] update_siblings_masks+0x24/0x246
>> [ 0.011885] [<ffffffff80005f32>] smp_callin+0x38/0x5c
>> [ 0.015990] smp: Brought up 1 node, 4 CPUs
>>
>
> Interesting, need to check if it is not in atomic context on arm64.
> Wonder if some configs are disabled and making this bug hidden. Let me
> check.
>
> One possible solution is to add GFP_ATOMIC to the allocation, but I want
> to make sure it is legal to be in atomic context when calling
> update_siblings_masks.
>
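
For reference, a minimal userspace sketch of why the allocation flag
matters here. It is not kernel code: every model_* name below is a
hypothetical stand-in, and the check inside model_kmalloc() only
approximates the might-sleep check that the trace above points at
(include/linux/sched/mm.h). It assumes, as the splat reports, that the
secondary CPU reaches the allocation with preempt_count == 1.

```c
/* Userspace model only: none of these are the real kernel APIs. */
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's GFP_KERNEL / GFP_ATOMIC flags. */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

/* Hypothetical stand-in for preempt_count(); non-zero means atomic context. */
static int model_preempt_count;

/* Count of "sleeping function called from invalid context" reports. */
static int model_bug_reports;

static bool model_in_atomic(void)
{
	return model_preempt_count != 0;
}

/*
 * Models the allocator entry point: an allocation that is allowed to
 * block is checked against the current context before allocating.
 */
static void *model_kmalloc(size_t size, enum model_gfp flags)
{
	if (flags == MODEL_GFP_KERNEL && model_in_atomic())
		model_bug_reports++;	/* the kernel would print the splat here */
	return malloc(size);
}

/*
 * Models a secondary CPU allocating cache attributes during bring-up
 * with preemption disabled. Returns the number of new reports.
 */
static int model_secondary_cpu_bringup(enum model_gfp flags)
{
	int before = model_bug_reports;
	void *p;

	model_preempt_count = 1;	/* matches "preempt_count: 1" in the trace */
	p = model_kmalloc(64, flags);
	model_preempt_count = 0;
	free(p);
	return model_bug_reports - before;
}
```

In this model a MODEL_GFP_KERNEL allocation from the bring-up path
produces one report and a MODEL_GFP_ATOMIC one produces none. Whether
it is legal for update_siblings_masks to run in atomic context at all
is a separate question that the sketch does not answer.
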
>>>
>>> Anyways give this a try, also test the CPU hotplug and check if nothing
>>> is broken on RISC-V. We noticed this bug only on one platform while
>>
>> So, our system monitor that runs openSBI does not actually support
>> any hotplug features yet, so:
>
> OK, we can ignore hotplug on RISC-V for now then. We have tested on multiple
> arm64 platforms (DT as well as ACPI).
>

Well, other vendors' implementations of firmware-cum-bootloaders
running openSBI may support it, but (currently) ours does not.
But, if no-one else is speaking up about this, my arch-topo changes
or your original patchset...