Re: [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get

From: Moshe Shemesh
Date: Thu Jul 30 2020 - 08:05:16 EST



On 7/30/2020 12:11 AM, Jakub Kicinski wrote:
On Wed, 29 Jul 2020 17:37:41 +0300 Moshe Shemesh wrote:
The fact that the driver supports fw_live_patch, does not necessarily
mean that the currently running FW can be live upgraded to the
currently flashed one, right?
That's correct, though the feature is supported, the firmware gap may
not be suitable for live_patch.

The user will be notified accordingly by an extack message.
That's kinda late, because the user may have paid the cost of migrating the
workload or otherwise taking precautions - and if live reset fails all
this work is wasted.

While the device most likely knows upfront whether it can be live reset
or not, otherwise I don't see how it could reject the reset reliably.


The device knows whether the new FW can be applied by live patch or needs a reset once the new version is stored, because at that point it can check the gap between the running and the stored versions.

So once the new FW is stored, I can query whether the change can be done by live_patch or needs a full fw_reset.
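
To illustrate the flow I mean, here is a rough sketch only, not the code from this series. The enum, the struct and pick_reload_level() are hypothetical names made up for this mail; the assumption is just that after the flash the device can report whether the gap is suitable for live patch.

```c
#include <stdbool.h>

/* Hypothetical reload levels, named after the levels discussed in this thread. */
enum reload_level {
	RELOAD_LEVEL_FW_LIVE_PATCH,
	RELOAD_LEVEL_FW_RESET,
};

/* Hypothetical view of the device state after a new image has been stored. */
struct fw_state {
	bool new_fw_stored;      /* a new image has been flashed */
	bool gap_live_patchable; /* device reports the gap suits live patch */
};

/*
 * Pick the least disruptive level for the pending image.
 * Returns 0 and sets *level if a new image is stored, -1 otherwise.
 */
static int pick_reload_level(const struct fw_state *st,
			     enum reload_level *level)
{
	if (!st->new_fw_stored)
		return -1;

	*level = st->gap_live_patchable ? RELOAD_LEVEL_FW_LIVE_PATCH
					: RELOAD_LEVEL_FW_RESET;
	return 0;
}
```

The point is that the answer is available before the reload is requested, so the user could learn it before paying the cost of migrating the workload.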

This interface does not appear to be optimal for the purpose.

Again, documentation of what can be lost (in terms of configuration and
features) upon upgrade is missing.
I will clarify in documentation. On live_patch nothing should be lost or
re-initialized, that's the "live" thing.
Okay, so FW upgrade cannot be allowed when it'd mean the device gets
de-featured? Also no link loss, correct? What's the expected length of
traffic interruption (order of magnitude)?


That differs between fw_live_patch and fw_reset, which is why I see them as different levels.

The live_patch is totally live, no link loss, no data interruption at all.

But when the firmware gap is not suitable for live patch, the user can choose to do a full fw reset. That can include link loss for a few seconds (depending on the device), and any configuration that is not saved by the driver, or that was set through some other tool rather than the driver, needs to be re-configured.