Re: [PATCH v3 00/30] Live Update Orchestrator

From: Pasha Tatashin
Date: Fri Aug 08 2025 - 09:52:55 EST


On Fri, Aug 8, 2025 at 12:07 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 07.08.25 03:44, Pasha Tatashin wrote:
> > This series introduces the LUO, a kernel subsystem designed to
> > facilitate live kernel updates with minimal downtime,
> > particularly in cloud deployments aiming to update without fully
> > disrupting running virtual machines.
> >
> > This series builds upon the KHO framework by adding programmatic
> > control over KHO's lifecycle and leveraging KHO for persisting LUO's
> > own metadata across the kexec boundary. The git branch for this series
> > can be found at:
> >
> > https://github.com/googleprodkernel/linux-liveupdate/tree/luo/v3
> >
> > Changelog from v2:
> > - Addressed comments from Mike Rapoport and Jason Gunthorpe
> > - Only one user agent (LiveupdateD) can open /dev/liveupdate
> > - Release all preserved resources if /dev/liveupdate closes
> > before reboot.
> > - With the above changes, sessions are not needed, and should be
> > maintained by the user-agent itself, so removed support for
> > sessions.
> > - Added support for changing per-FD state (i.e. some FDs can be
> > prepared or finished before the global transition).
> > - All IOCTLs now follow iommufd/fwctl extendable design.
> > - Replaced locks with guards
> > - Added a callback for registered subsystems to be notified
> > during boot: ops->boot().
> > - Removed args from callbacks, instead use container_of() to
> > carry context specific data (see luo_selftests.c for example).
> > - removed patches for luolib, they are going to be introduced in
> > a separate repository.
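
For readers unfamiliar with the container_of() pattern mentioned in the
changelog, here is a minimal userspace sketch of how a callback can recover
its context without an extra argument. Every name in it (demo_subsystem,
demo_ctx, demo_prepare) is hypothetical and not taken from the series; the
macro is a simplified stand-in for the kernel's container_of(). See
luo_selftests.c for the real usage.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical descriptor handed to the framework at registration time. */
struct demo_subsystem {
	int (*prepare)(struct demo_subsystem *h);
};

/* Hypothetical private context that embeds the descriptor. */
struct demo_ctx {
	int prepared_count;
	struct demo_subsystem sub;
};

/*
 * The callback receives only the embedded descriptor and walks back to
 * the enclosing context, so no separate "args" pointer is needed.
 */
int demo_prepare(struct demo_subsystem *h)
{
	struct demo_ctx *ctx;

	assert(h != NULL);
	ctx = container_of(h, struct demo_ctx, sub);
	ctx->prepared_count++;
	return 0;
}
```

The design choice is that the framework stays unaware of the private type:
it stores and passes back only the embedded descriptor.
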
> >
> > What is Live Update?
> > Live Update is a kexec based reboot process where selected kernel
> > resources (memory, file descriptors, and eventually devices) are kept
> > operational or their state preserved across a kernel transition. For
> > certain resources, DMA and interrupt activity might continue with
> > minimal interruption during the kernel reboot.
> >
> > LUO provides a framework for coordinating live updates. It features:
> > State Machine: Manages the live update process through states:
> > NORMAL, PREPARED, FROZEN, UPDATED.
> >
> > KHO Integration:
> >
> > LUO programmatically drives KHO's finalization and abort sequences.
> > KHO's debugfs interface is now optional, configured via
> > CONFIG_KEXEC_HANDOVER_DEBUG.
> >
> > LUO preserves its own metadata via KHO's kho_add_subtree and
> > kho_preserve_phys() mechanisms.
> >
> > Subsystem Participation: A callback API liveupdate_register_subsystem()
> > allows kernel subsystems (e.g., KVM, IOMMU, VFIO, PCI) to register
> > handlers for LUO events (PREPARE, FREEZE, FINISH, CANCEL) and persist a
> > u64 payload via the LUO FDT.
> >
> > File Descriptor Preservation: Infrastructure
> > (liveupdate_register_filesystem, luo_register_file, luo_retrieve_file)
> > that allows specific types of file descriptors (e.g., memfd, vfio) to
> > be preserved and restored.
> >
> > Handlers for specific file types can be registered to manage their
> > preservation and restoration, storing a u64 payload in the LUO FDT.
> >
> > User-space Interface:
> >
> > ioctl (/dev/liveupdate): The primary control interface for
> > triggering LUO state transitions (prepare, freeze, finish, cancel)
> > and managing the preservation/restoration of file descriptors.
> > Access requires CAP_SYS_ADMIN.
> >
> > sysfs (/sys/kernel/liveupdate/state): A read-only interface for
> > monitoring the current LUO state. This allows userspace services to
> > track progress and coordinate actions.
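
As a rough illustration of how a userspace service might poll the sysfs
file described above, here is a small sketch. It assumes the file holds a
single line of text; the exact strings it reports are not stated in this
cover letter, so the helper (read_luo_state, a hypothetical name) just
returns whatever line it finds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of `path` into `buf` and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 * Assumption: the state file contains one short line of text.
 */
int read_luo_state(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Cut the string at the first newline, if any. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would pass "/sys/kernel/liveupdate/state" as the path and compare
the result against the state names it expects.
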
> >
> > Selftests: Includes kernel-side hooks and userspace selftests to
> > verify core LUO functionality, particularly subsystem registration and
> > basic state transitions.
> >
> > LUO State Machine and Events:
> >
> > NORMAL: Default operational state.
> > PREPARED: Initial preparation complete after LIVEUPDATE_PREPARE
> > event. Subsystems have saved initial state.
> > FROZEN: Final "blackout window" state after LIVEUPDATE_FREEZE
> > event, just before kexec. Workloads must be suspended.
> > UPDATED: Next kernel has booted via live update. Awaiting restoration
> > and LIVEUPDATE_FINISH.
> >
> > Events:
> > LIVEUPDATE_PREPARE: Prepare for reboot, serialize state.
> > LIVEUPDATE_FREEZE: Final opportunity to save state before kexec.
> > LIVEUPDATE_FINISH: Post-reboot cleanup in the next kernel.
> > LIVEUPDATE_CANCEL: Abort prepare or freeze, revert changes.
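
The states and events listed above can be sketched as a small transition
function. This is a userspace model only, not code from the series: the
enum and function names are hypothetical, and two points are assumptions
on my part for the sake of the example, namely that CANCEL and FINISH both
land back in NORMAL. The FROZEN to UPDATED step is not modeled because it
happens across the kexec boundary rather than through an event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names mirroring the states and events in the cover letter. */
enum luo_state { LUO_NORMAL, LUO_PREPARED, LUO_FROZEN, LUO_UPDATED };
enum luo_event { LUO_EV_PREPARE, LUO_EV_FREEZE, LUO_EV_FINISH, LUO_EV_CANCEL };

/*
 * Apply `ev` to `*state`. Returns true and updates `*state` when the
 * transition is allowed; returns false and leaves `*state` untouched
 * otherwise.
 */
bool luo_sketch_transition(enum luo_state *state, enum luo_event ev)
{
	assert(state != NULL);

	switch (ev) {
	case LUO_EV_PREPARE:
		if (*state != LUO_NORMAL)
			return false;
		*state = LUO_PREPARED;
		return true;
	case LUO_EV_FREEZE:
		if (*state != LUO_PREPARED)
			return false;
		*state = LUO_FROZEN;
		return true;
	case LUO_EV_FINISH:
		/* Assumption: finishing returns the system to NORMAL. */
		if (*state != LUO_UPDATED)
			return false;
		*state = LUO_NORMAL;
		return true;
	case LUO_EV_CANCEL:
		/* Assumption: cancel from PREPARED or FROZEN reverts to NORMAL. */
		if (*state != LUO_PREPARED && *state != LUO_FROZEN)
			return false;
		*state = LUO_NORMAL;
		return true;
	}
	return false;
}
```
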
> >
> > v2: https://lore.kernel.org/all/20250723144649.1696299-1-pasha.tatashin@xxxxxxxxxx
> > v1: https://lore.kernel.org/all/20250625231838.1897085-1-pasha.tatashin@xxxxxxxxxx
> > RFC v2: https://lore.kernel.org/all/20250515182322.117840-1-pasha.tatashin@xxxxxxxxxx
> > RFC v1: https://lore.kernel.org/all/20250320024011.2995837-1-pasha.tatashin@xxxxxxxxxx
> >
> > Changyuan Lyu (1):
> > kho: add interfaces to unpreserve folios and physical memory ranges
> >
> > Mike Rapoport (Microsoft) (1):
> > kho: drop notifiers
> >
> > Pasha Tatashin (23):
> > kho: init new_physxa->phys_bits to fix lockdep
> > kho: mm: Don't allow deferred struct page with KHO
> > kho: warn if KHO is disabled due to an error
> > kho: allow to drive kho from within kernel
> > kho: make debugfs interface optional
> > kho: don't unpreserve memory during abort
> > liveupdate: kho: move to kernel/liveupdate
> > liveupdate: luo_core: luo_ioctl: Live Update Orchestrator
> > liveupdate: luo_core: integrate with KHO
> > liveupdate: luo_subsystems: add subsystem registration
> > liveupdate: luo_subsystems: implement subsystem callbacks
> > liveupdate: luo_files: add infrastructure for FDs
> > liveupdate: luo_files: implement file systems callbacks
> > liveupdate: luo_ioctl: add userspace interface
> > liveupdate: luo_files: luo_ioctl: Unregister all FDs on device close
> > liveupdate: luo_files: luo_ioctl: Add ioctls for per-file state
> > management
> > liveupdate: luo_sysfs: add sysfs state monitoring
> > reboot: call liveupdate_reboot() before kexec
> > kho: move kho debugfs directory to liveupdate
> > liveupdate: add selftests for subsystems un/registration
> > selftests/liveupdate: add subsystem/state tests
> > docs: add luo documentation
> > MAINTAINERS: add liveupdate entry
> >
> > Pratyush Yadav (5):
> > mm: shmem: use SHMEM_F_* flags instead of VM_* flags
> > mm: shmem: allow freezing inode mapping
> > mm: shmem: export some functions to internal.h
> > luo: allow preserving memfd
> > docs: add documentation for memfd preservation via LUO
>
> It's not clear from the description why these mm shmem changes are
> buried in this patch set. It's not even described above in the patch
> description.

Hi David,

Yes, I should update the cover letter to include the memfd preservation work.

> I suggest sending that part out separately, so Hugh actually spots this.
> (is he even CC'ed?)

+cc hughd@xxxxxxxxxx

While the MM list is CCed, you are right: I have not specifically CCed
the shmem maintainers. This will be fixed in the next revision.

Thank you,
Pasha