Re: [PATCH v13 4/7] arm64: mte: Enable TCO in functions that can read beyond buffer limits

From: Vincenzo Frascino
Date: Mon Feb 22 2021 - 07:05:59 EST




On 2/12/21 5:21 PM, Catalin Marinas wrote:
>> +
>> + /*
>> + * This function is called on each active smp core at boot
>> + * time, hence we do not need to take cpu_hotplug_lock again.
>> + */
>> + static_branch_enable_cpuslocked(&mte_async_mode);
>> }
> Sorry, I missed the cpuslocked aspect before. Is there any reason you
> need to use this API here? I suggested to add it to the
> mte_enable_kernel_sync() because kasan may at some point do this
> dynamically at run-time, so the boot-time argument doesn't hold. But
> it's also incorrect as this function will be called for hot-plugged
> CPUs as well after boot.
>
> The only reason for static_branch_*_cpuslocked() is if it's called from
> a region that already invoked cpus_read_lock() which I don't think is
> the case here.

I agree with your analysis on why static_branch_*_cpuslocked() is needed, in
fact cpus_read_lock() takes cpu_hotplug_lock as per comment on top of the line
of code.

If I try to take that lock when enabling the secondary cores I end up in the
situation below:

[ 0.283402] smp: Bringing up secondary CPUs ...
....
[ 5.890963] Call trace:
[ 5.891050] dump_backtrace+0x0/0x19c
[ 5.891212] show_stack+0x18/0x70
[ 5.891373] dump_stack+0xd0/0x12c
[ 5.891531] dequeue_task_idle+0x28/0x40
[ 5.891686] __schedule+0x45c/0x6c0
[ 5.891851] schedule+0x70/0x104
[ 5.892010] percpu_rwsem_wait+0xe8/0x104
[ 5.892174] __percpu_down_read+0x5c/0x90
[ 5.892332] percpu_down_read.constprop.0+0xbc/0xd4
[ 5.892497] cpus_read_lock+0x10/0x1c
[ 5.892660] static_key_enable+0x18/0x3c
[ 5.892823] mte_enable_kernel_async+0x40/0x70
[ 5.892988] kasan_init_hw_tags_cpu+0x50/0x60
[ 5.893144] cpu_enable_mte+0x24/0x70
[ 5.893304] verify_local_cpu_caps+0x58/0x120
[ 5.893465] check_local_cpu_capabilities+0x18/0x1f0
[ 5.893626] secondary_start_kernel+0xe0/0x190
[ 5.893790] 0x0
[ 5.893975] bad: scheduling from the idle thread!
[ 5.894065] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
5.11.0-rc7-10587-g22cd50bcfcf-dirty #6

and the kernel panics.

Note: there is a look of msg drop in between enabling the secondary and the
first clean stack trace.

--
Regards,
Vincenzo