Re: [PATCH v3 10/14] remoteproc: Deal with synchronisation when shutting down

From: Arnaud POULIQUEN
Date: Wed May 06 2020 - 03:51:35 EST




On 5/6/20 12:03 AM, Mathieu Poirier wrote:
> On Mon, May 04, 2020 at 01:34:43PM +0200, Arnaud POULIQUEN wrote:
>>
>>
>> On 4/30/20 10:23 PM, Mathieu Poirier wrote:
>>> On Wed, Apr 29, 2020 at 10:19:49AM +0200, Arnaud POULIQUEN wrote:
>>>>
>>>>
>>>> On 4/24/20 10:01 PM, Mathieu Poirier wrote:
>>>>> The remoteproc core must not allow function rproc_shutdown() to
>>>>> proceed if currently synchronising with a remote processor and
>>>>> the synchronisation operations of that remote processor does not
>>>>> support it. Also part of the process is to set the synchronisation
>>>>> flag so that the remoteproc core can make the right decisions when
>>>>> restarting the system.
>>>>>
>>>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
>>>>> ---
>>>>> drivers/remoteproc/remoteproc_core.c | 32 ++++++++++++++++++++++++
>>>>> drivers/remoteproc/remoteproc_internal.h | 7 ++++++
>>>>> 2 files changed, 39 insertions(+)
>>>>>
>>>>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
>>>>> index 3a84a38ba37b..48afa1f80a8f 100644
>>>>> --- a/drivers/remoteproc/remoteproc_core.c
>>>>> +++ b/drivers/remoteproc/remoteproc_core.c
>>>>> @@ -1849,6 +1849,27 @@ int rproc_boot(struct rproc *rproc)
>>>>> }
>>>>> EXPORT_SYMBOL(rproc_boot);
>>>>>
>>>>> +static bool rproc_can_shutdown(struct rproc *rproc)
>>>>> +{
>>>>> + /*
>>>>> + * The remoteproc core is the lifecycle manager, no problem
>>>>> + * calling for a shutdown.
>>>>> + */
>>>>> + if (!rproc_needs_syncing(rproc))
>>>>> + return true;
>>>>> +
>>>>> + /*
>>>>> + * The remoteproc has been loaded by another entity (as per above
>>>>> + * condition) and the platform code has given us the capability
>>>>> + * of stopping it.
>>>>> + */
>>>>> + if (rproc->sync_ops->stop)
>>>>> + return true;
>>>>
>>>> This means that if rproc->sync_ops->stop is null rproc_stop_subdevices will not
>>>> be called? seems not symmetric with the start sequence.
>>>
>>> If rproc->sync_ops->stop is not provided then the remoteproc core can't stop the
>>> remote processor at all after it has synchronised with it. If a usecase
>>> requires some kind of soft reset then a stop() function that uses a mailbox
>>> notification or some other mechanism can be provided to tell the remote
>>> processor to put itself back in startup mode again.
>>>
>>> Is this fine with you or there is still something I don't get?
>>
>> My point here is more around the subdevices. But perhaps i missed something...
>>
>> In rproc_start rproc_start_subdevices is called, even if sync_start is null.
>
> Here I'll take that you mean sync_ops::start()
>
>> But in rproc_shutdown rproc_stop is not called, if sync_ops->stop is null.
>> So rproc_stop_subdevices is not called in this case.
>
> Correct. I am pretty sure some people don't want the remoteproc core to be able
> to do anything other than synchronise with a remote processor, be it at boot
> time or when the remote processor has crashed.
>
> I can also see scenarios where people want to be able to start and stop
> subdevices from the remoteproc core, but _not_ power cycle the remote processor.
> In such cases the sync_ops::stop() should be some kind of notification telling
> the remote processor to put itself back in initialisation mode and
> sync_flags.after_stop should be set to true.
>
>> Then if sync_flags.after_stop is false, it looks like that something will go wrong
>> at next start.
>
> If sync_ops::stop is NULL then the value of sync_flags.after_stop becomes
> irrelevant because that state can't be reached. Let me know if you found a
> condition where this isn't the case and I will correct it.

The only condition i have in mind is that the sync_ops::stop() can not implemented
in platform driver, just because nothing to do. But i don't know if it is a realistic
use case and having a dummy stop function looks to me acceptable in this particular
use case.

This triggers me another comment :)
the rproc_ops struct description is relevant for the "normal" ops but not adapted
for the sync_ops. For instance the start & stop are mandatory for ops, optional for sync_ops
As this description is a reference (at least for me) to determine optional and mandatory ops
would be useful to update it.

Regards,
Arnaud
>
>>
>>>
>>>> Probably not useful to test it here as condition is already handled in rproc_stop_device...
>>>>
>>>> Regards
>>>> Arnaud
>>>>> +
>>>>> + /* Any other condition should not be allowed */
>>>>> + return false;
>>>>> +}
>>>>> +
>>>>> /**
>>>>> * rproc_shutdown() - power off the remote processor
>>>>> * @rproc: the remote processor
>>>>> @@ -1879,6 +1900,9 @@ void rproc_shutdown(struct rproc *rproc)
>>>>> return;
>>>>> }
>>>>>
>>>>> + if (!rproc_can_shutdown(rproc))
>>>>> + goto out;
>>>>> +
>>>>> /* if the remote proc is still needed, bail out */
>>>>> if (!atomic_dec_and_test(&rproc->power))
>>>>> goto out;
>>>>> @@ -1898,6 +1922,14 @@ void rproc_shutdown(struct rproc *rproc)
>>>>> kfree(rproc->cached_table);
>>>>> rproc->cached_table = NULL;
>>>>> rproc->table_ptr = NULL;
>>>>> +
>>>>> + /*
>>>>> + * The remote processor has been switched off - tell the core what
>>>>> + * operation to use from hereon, i.e whether an external entity will
>>>>> + * reboot the remote processor or it is now the remoteproc core's
>>>>> + * responsability.
>>>>> + */
>>>>> + rproc_set_sync_flag(rproc, RPROC_SYNC_STATE_SHUTDOWN);
>>>>> out:
>>>>> mutex_unlock(&rproc->lock);
>>>>> }
>>>>> diff --git a/drivers/remoteproc/remoteproc_internal.h b/drivers/remoteproc/remoteproc_internal.h
>>>>> index 61500981155c..7dcc0a26892b 100644
>>>>> --- a/drivers/remoteproc/remoteproc_internal.h
>>>>> +++ b/drivers/remoteproc/remoteproc_internal.h
>>>>> @@ -27,6 +27,9 @@ struct rproc_debug_trace {
>>>>> /*
>>>>> * enum rproc_sync_states - remote processsor sync states
>>>>> *
>>>>> + * @RPROC_SYNC_STATE_SHUTDOWN state to use after the remoteproc core
>>>>> + * has shutdown (rproc_shutdown()) the
>>>>> + * remote processor.
>>>>> * @RPROC_SYNC_STATE_CRASHED state to use after the remote processor
>>>>> * has crashed but has not been recovered by
>>>>> * the remoteproc core yet.
>>>>> @@ -36,6 +39,7 @@ struct rproc_debug_trace {
>>>>> * operation to use.
>>>>> */
>>>>> enum rproc_sync_states {
>>>>> + RPROC_SYNC_STATE_SHUTDOWN,
>>>>> RPROC_SYNC_STATE_CRASHED,
>>>>> };
>>>>>
>>>>> @@ -43,6 +47,9 @@ static inline void rproc_set_sync_flag(struct rproc *rproc,
>>>>> enum rproc_sync_states state)
>>>>> {
>>>>> switch (state) {
>>>>> + case RPROC_SYNC_STATE_SHUTDOWN:
>>>>> + rproc->sync_with_rproc = rproc->sync_flags.after_stop;
>>>>> + break;
>>>>> case RPROC_SYNC_STATE_CRASHED:
>>>>> rproc->sync_with_rproc = rproc->sync_flags.after_crash;
>>>>> break;
>>>>>