Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized

From: Rafael J. Wysocki
Date: Fri Jul 22 2022 - 13:19:01 EST


On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@xxxxxxxxxx> wrote:
>
> There's a space allocated for cooling_device_stats_attr_group
> within cooling_device_attr_groups. This space is shared by all
> cooling devices.

That's correct.

> If the stats structure of one cooling device successfully
> creates stats sysfs. After that, another cooling device fails
> to get max_states in cooling_device_stats_setup(). It can
> return directly without initializing the stats structure, but
> the cooling_device_stats_attr_group is still the attribute
> group of the last cooling device.

I cannot parse the above, sorry.

For example, how can a "stats structure of one cooling device" create
anything? As a data structure, it is a passive entity, so it doesn't
carry out any actions.

I think (but I am not sure) that you are referring to the error code
path in which the ->get_max_state() callback fails for a cooling
device after cooling_device_stats_setup() has completed successfully
for another one.
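
If that reading is right, the sequence can be modeled in user space roughly as below. This is only a sketch under that assumption, not kernel code: struct attr_group, struct cdev, groups[], stats_setup() and second_device_sees_stale_group() are hypothetical stand-ins for the real thermal sysfs structures, and the clear_first flag mimics what the proposed hunk would do.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for a sysfs attribute group. */
struct attr_group { const char *name; };

static struct attr_group base_group  = { "base"  };
static struct attr_group stats_group = { "stats" };

/* One array shared by every device: base group, stats slot, NULL terminator. */
static struct attr_group *groups[] = { &base_group, NULL, NULL };
#define STATS_SLOT (ARRAY_SIZE(groups) - 2)

/* Hypothetical stand-in for a cooling device. */
struct cdev {
	int max_state_fails;	/* models ->get_max_state() returning an error */
	void *stats;		/* stays NULL when the setup bails out early */
};

static void stats_setup(struct cdev *cdev, int clear_first)
{
	assert(STATS_SLOT < ARRAY_SIZE(groups));

	/* Models the proposed fix: reset the shared slot on every call. */
	if (clear_first)
		groups[STATS_SLOT] = NULL;

	/* Early return: cdev->stats is never set up for this device. */
	if (cdev->max_state_fails)
		return;

	cdev->stats = &stats_group;	/* placeholder for the real allocation */
	groups[STATS_SLOT] = &stats_group;
}

/* Returns 1 if the failing device would still be registered with a stats group. */
static int second_device_sees_stale_group(int clear_first)
{
	struct cdev good = { 0, NULL };
	struct cdev bad  = { 1, NULL };

	groups[STATS_SLOT] = NULL;	/* start from a clean shared array */
	stats_setup(&good, clear_first);
	stats_setup(&bad, clear_first);

	return groups[STATS_SLOT] != NULL && bad.stats == NULL;
}
```

In this model, second_device_sees_stale_group(0) reports the stale slot left behind by the first device, and second_device_sees_stale_group(1) does not, because the slot is cleared before the early return.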

> At this time, read or write stats sysfs nodes can cause kernel
> crash. Like the following, kernel crashed when
> 'cat time_in_state_ms'.
>
> [<5baac8d4>] panic+0x1b4/0x3c8
> [<9d287b0f>] arm_notify_die+0x0/0x78
> [<094fc22c>] __do_kernel_fault+0x94/0xa4
> [<3b4b69a4>] do_page_fault+0xd4/0x364
> [<23793e7a>] do_translation_fault+0x38/0xc0
> [<6e5cc52a>] do_DataAbort+0x4c/0xd0
> [<a28c16b8>] __dabt_svc+0x5c/0xa0
> [<747516ae>] _raw_spin_lock+0x20/0x60
> [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148
> [<cb78325e>] dev_attr_show+0x38/0x64
> [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0
> [<c0a843ab>] seq_read+0x244/0x620
> [<b316b374>] vfs_read+0xd8/0x218
> [<3aebf5fa>] sys_read+0x80/0xe4
> [<7cf100f5>] ret_fast_syscall+0x0/0x28
> [<08cbe22f>] 0xbe8c1198
>
> stats sysfs:
> phone:/sys/class/thermal/cooling_device2/stats # ls
> reset time_in_state_ms total_trans trans_table
>
> The same as cat total_trans, trans_table, and echo reset.
>
> To avoid kernel crash, this patch set clears the
> cooling_device_attr_groups before stats structure is initialized.
>
> Signed-off-by: Di Shen <di.shen@xxxxxxxxxx>
> ---
> drivers/thermal/thermal_sysfs.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> index 1c4aac8464a7..e3fae63fa0f7 100644
> --- a/drivers/thermal/thermal_sysfs.c
> +++ b/drivers/thermal/thermal_sysfs.c
> @@ -817,6 +817,9 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
>  	unsigned long states;
>  	int var;
>
> +	var = ARRAY_SIZE(cooling_device_attr_groups) - 2;
> +	cooling_device_attr_groups[var] = NULL;
> +
>  	if (cdev->ops->get_max_state(cdev, &states))
>  		return;
>
> --