[PATCH v3 0/3] Asynchronous shutdown interface and example implementation

From: Tanjore Suresh
Date: Tue May 17 2022 - 18:08:31 EST


Problem:

Some of our machines are configured with many NVMe devices and
are validated against strict shutdown time requirements. Each NVMe
device plugged into the system typically takes about 4.5 secs
to shut down. A system with 16 such NVMe devices will take
approximately 80 secs to shut down and go through reboot.

The current shutdown API, as defined at the bus level, is
synchronous. Therefore, the more devices there are in the system, the
longer it takes to shut down. This shutdown time contributes
significantly to the machine reboot time.
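
The scaling argument above can be checked with a small userspace model.
This is only a sketch: the helper names are hypothetical, and the
parallel case is idealized (it assumes the devices do not contend with
each other while shutting down). With the figures quoted above, device
shutdown alone accounts for 16 x 4.5 = 72 secs of the roughly 80 secs;
the remainder is not itemized in this letter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total time when devices shut down one after another. */
static double sync_shutdown_secs(const double *per_dev_secs, size_t n)
{
	double total = 0.0;

	for (size_t i = 0; i < n; i++) {
		assert(per_dev_secs[i] >= 0.0);
		total += per_dev_secs[i];
	}
	return total;
}

/*
 * Hypothetical helper: idealized total when all shutdowns overlap fully,
 * so the slowest device determines the elapsed time.
 */
static double async_shutdown_secs(const double *per_dev_secs, size_t n)
{
	double longest = 0.0;

	for (size_t i = 0; i < n; i++) {
		assert(per_dev_secs[i] >= 0.0);
		if (per_dev_secs[i] > longest)
			longest = per_dev_secs[i];
	}
	return longest;
}
```

For 16 devices at 4.5 secs each, the first helper returns 72.0 and the
second returns 4.5, which is the gap an asynchronous interface aims to
close.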

Solution:

This patch set proposes an asynchronous shutdown interface at the bus
level and modifies the driver core's device shutdown routine to exploit
the new interface while maintaining backward compatibility with the
existing synchronous implementation (Patch 1 of 3). It then uses the new
interface to enable all PCI-E based devices to use asynchronous shutdown
semantics if necessary (Patch 2 of 3). The implementation at the PCI-E
level also works in a backward compatible way, allowing existing device
implementations to keep working with the current synchronous semantics.
Finally, it showcases an example implementation for NVMe devices that
exploits this asynchronous shutdown interface (Patch 3 of 3).
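
The two-phase idea can be illustrated with a minimal userspace model.
This is a sketch under stated assumptions, not the code from the
patches: the callback names async_shutdown_start and async_shutdown_end
come from this letter, while the struct layouts, signatures, state
values and the shutdown_all() helper are hypothetical. Devices whose bus
provides the async entry points are started first and completed in a
second pass; all other devices fall back to the synchronous callback.

```c
#include <assert.h>
#include <stddef.h>

enum { DEV_RUNNING = 0, DEV_STOPPING, DEV_OFF };

struct device;

/* Hypothetical stand-in for the bus-level callbacks; signatures are assumed. */
struct bus_ops {
	void (*shutdown)(struct device *dev);             /* existing synchronous path */
	void (*async_shutdown_start)(struct device *dev); /* initiate, do not wait */
	void (*async_shutdown_end)(struct device *dev);   /* wait for completion */
};

struct device {
	const char *name;
	const struct bus_ops *bus;
	int state;
};

/* Model callbacks: they only track state, real ones would talk to hardware. */
static void model_sync_shutdown(struct device *dev) { dev->state = DEV_OFF; }
static void model_async_start(struct device *dev) { dev->state = DEV_STOPPING; }
static void model_async_end(struct device *dev)
{
	assert(dev->state == DEV_STOPPING);
	dev->state = DEV_OFF;
}

static const struct bus_ops legacy_bus = {
	.shutdown = model_sync_shutdown,
};

static const struct bus_ops async_bus = {
	.shutdown = model_sync_shutdown,
	.async_shutdown_start = model_async_start,
	.async_shutdown_end = model_async_end,
};

/* Walk the device array from last to first, twice, in the same order. */
static void shutdown_all(struct device *devs, size_t n)
{
	/* Pass 1: start async shutdown where available, else shut down now. */
	for (size_t i = n; i-- > 0; ) {
		struct device *dev = &devs[i];

		assert(dev->bus != NULL);
		if (dev->bus->async_shutdown_start)
			dev->bus->async_shutdown_start(dev);
		else if (dev->bus->shutdown)
			dev->bus->shutdown(dev);
	}

	/* Pass 2: complete the shutdowns that were started in pass 1. */
	for (size_t i = n; i-- > 0; ) {
		struct device *dev = &devs[i];

		if (dev->bus->async_shutdown_start && dev->bus->async_shutdown_end)
			dev->bus->async_shutdown_end(dev);
	}
}
```

In this model a bus that only fills in shutdown behaves exactly as
before, which mirrors the backward compatibility goal described above.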

Changelog:

v2: - Replaced the shutdown_pre & shutdown_post entry point names with the
recommended names (async_shutdown_start and async_shutdown_end).

- The review comment on ordering requirements between bridge shutdown
and leaf/endpoint shutdown noted, and it was agreed, that the order
differed between the calls to async_shutdown_start and
async_shutdown_end. This version calls both the start and end entry
points in the same order.

v3: - This note clarifies why the power management framework was not
considered for implementing this shutdown optimization.
No code was changed. This change note clarifies
the reasoning only.

This patch is only for shutdown of the system. The shutdown
entry points have traditionally had a different requirement:
all devices are brought to a quiescent state, after which
system power may be removed (power down request scenarios).
The same entry point is also used when all devices are shut down
and then re-initialized and restarted (soft shutdown/reboot
scenarios).

In contrast, device power management (dpm) allows the system
to bring any configured device that may be idle down to the
various low power states that the device supports, selectively
and based on the transitions that the device implementation
allows. Power state transitions initiated by the system can be
achieved using the 'dpm' interfaces already specified.

Therefore, using the 'dpm' interface to achieve
this shutdown optimization is not the right approach: the
suggested interface is meant to solve an orthogonal requirement
and has historically been kept separate from the shutdown entry
points and their associated semantics.

Tanjore Suresh (3):
  driver core: Support asynchronous driver shutdown
  PCI: Support asynchronous shutdown
  nvme: Add async shutdown support

 drivers/base/core.c        | 38 +++++++++++++++++-
 drivers/nvme/host/core.c   | 28 +++++++++----
 drivers/nvme/host/nvme.h   |  8 ++++
 drivers/nvme/host/pci.c    | 80 ++++++++++++++++++++++++--------------
 drivers/pci/pci-driver.c   | 20 ++++++++--
 include/linux/device/bus.h | 12 ++++++
 include/linux/pci.h        |  4 ++
 7 files changed, 149 insertions(+), 41 deletions(-)

--
2.36.0.550.gb090851708-goog