Re: [PATCH v19 2/7] crash: add generic infrastructure for crash hotplug support

From: Eric DeVolder
Date: Fri Mar 17 2023 - 14:14:26 EST




On 3/17/23 04:04, Baoquan He wrote:
On 03/16/23 at 09:44am, Eric DeVolder wrote:


On 3/16/23 05:11, Baoquan He wrote:
On 03/06/23 at 11:22am, Eric DeVolder wrote:
......
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
+{
+	/* Obtain lock while changing crash information */
+	if (kexec_trylock()) {
+
+		/* Check kdump is loaded */
+		if (kexec_crash_image) {
+			struct kimage *image = kexec_crash_image;
+
+			if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||
+			    hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
+				pr_debug("hp_action %u, cpu %u\n", hp_action, cpu);
+			else
+				pr_debug("hp_action %u\n", hp_action);
+
+			/*
+			 * When the struct kimage is allocated, the elfcorehdr_index
+			 * is set to -1. Find the segment containing the elfcorehdr,
+			 * if not already found. This works for both the kexec_load
+			 * and kexec_file_load paths.
+			 */
+			if (image->elfcorehdr_index < 0) {
+				unsigned long mem;
+				unsigned char *ptr;
+				unsigned int n;
+
+				for (n = 0; n < image->nr_segments; n++) {
+					mem = image->segment[n].mem;
+					ptr = kmap_local_page(pfn_to_page(mem >> PAGE_SHIFT));
+					if (ptr) {
+						/* The segment containing elfcorehdr */
+						if (memcmp(ptr, ELFMAG, SELFMAG) == 0) {
+							image->elfcorehdr_index = (int)n;
+						}
+						kunmap_local(ptr);
+					}
+				}
+			}
+
+			if (image->elfcorehdr_index < 0) {
+				pr_err("unable to locate elfcorehdr segment");
+				goto out;
+			}
+
+			/* Needed in order for the segments to be updated */
+			arch_kexec_unprotect_crashkres();
+
+			/* Differentiate between normal load and hotplug update */
+			image->hp_action = hp_action;
+
+			/* Now invoke arch-specific update handler */
+			arch_crash_handle_hotplug_event(image);
+
+			/* No longer handling a hotplug event */
+			image->hp_action = KEXEC_CRASH_HP_NONE;
+			image->elfcorehdr_updated = true;

It's good to initialize the image->hp_action here, however where do
you check it? Do you plan to add some check somewhere?

Hi Baoquan,
The hp_action member is initialized to 0 in do_kimage_alloc_init(). I've
mapped KEXEC_CRASH_HP_NONE onto 0 on purpose.

But the purpose of image->hp_action = KEXEC_CRASH_HP_NONE here is to
mark that handling of a hotplug event has completed. You can see
image->hp_action set to hp_action just above, to capture what the
triggering event was, as passed into this function.

I will go ahead and set image->hp_action = KEXEC_CRASH_HP_NONE; explicitly
in do_kimage_alloc_init(), as that is done for the other crash hotplug members.

Yeah, setting image->hp_action = KEXEC_CRASH_HP_NONE in
do_kimage_alloc_init() will make the code clearer. But I am wondering: if
we don't initialize image->hp_action to KEXEC_CRASH_HP_NONE, and don't
reset image->hp_action to KEXEC_CRASH_HP_NONE to mark that handling of a
hotplug event has completed, what will happen?

Baoquan,
KEXEC_CRASH_HP_NONE has the value 0, intentionally, so that when the kimage
is allocated and the struct kimage is zeroed, hp_action is automatically
initialized properly. I am explicitly setting hp_action to KEXEC_CRASH_HP_NONE now.


I mean, you set image->hp_action to KEXEC_CRASH_HP_NONE explicitly, but where
do you check whether it is something other than KEXEC_CRASH_HP_NONE? In
crash_handle_hotplug_event(), we take __kexec_lock and assign the passed
hp_action anyway.

The cpuhp callbacks and memory notifiers invoke crash_handle_hotplug_event()
with an appropriate hp_action. That hp_action is then stored in image->hp_action
within crash_handle_hotplug_event() for use by the arch-specific handler.

On x86, for example, image->hp_action is used to short-circuit the
arch-specific handler if the event is a CPU plug/unplug (see patch "x86/crash:
optimize cpu changes"). On PPC, image->hp_action is used to determine the
appropriate actions for its FDT updates.
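
As a rough illustration of how an arch-specific handler might consume
hp_action, here is a minimal userspace sketch. It is not the x86 or PPC code:
all hp_demo_* and HP_DEMO_* names are hypothetical stand-ins, the memory
actions are assumed for illustration, and the real handlers may apply
additional conditions before skipping work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hotplug actions; only NONE == 0 is from the thread. */
enum hp_demo_action {
	HP_DEMO_NONE = 0,
	HP_DEMO_ADD_CPU,
	HP_DEMO_REMOVE_CPU,
	HP_DEMO_ADD_MEMORY,	/* assumed for illustration */
	HP_DEMO_REMOVE_MEMORY,	/* assumed for illustration */
};

/* Hypothetical, heavily reduced stand-in for struct kimage. */
struct hp_demo_image {
	enum hp_demo_action hp_action;
	unsigned int elfcorehdr_rebuilds;	/* counts simulated rebuilds */
};

static bool hp_demo_is_cpu_event(enum hp_demo_action action)
{
	return action == HP_DEMO_ADD_CPU || action == HP_DEMO_REMOVE_CPU;
}

/*
 * Mock arch handler: skip the (simulated) elfcorehdr rebuild for CPU
 * events, do it for everything else. Returns true if a rebuild happened.
 */
static bool hp_demo_arch_handler(struct hp_demo_image *image)
{
	assert(image != NULL);

	if (hp_demo_is_cpu_event(image->hp_action))
		return false;	/* short-circuit: nothing to regenerate */

	image->elfcorehdr_rebuilds++;
	return true;
}
```

The point is only that the handler has no other way to learn what triggered
it; the action has to travel in the image.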

To summarize, image->hp_action will be initialized to KEXEC_CRASH_HP_NONE
in do_kimage_alloc_init(). Then, upon a cpu or memory plug/unplug/online/offline
event, the appropriate hp_action is stored in image->hp_action and the
arch-specific handler is called. Upon returning from the arch-specific handler,
image->hp_action is reset back to KEXEC_CRASH_HP_NONE.
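
The lifecycle summarized above can be modeled in a few lines of userspace C.
This is only a sketch under stated assumptions: the mock_* and MOCK_HP_*
names are hypothetical, calloc() stands in for the zeroing allocation of
struct kimage, and locking, the kdump-loaded check and the elfcorehdr lookup
are all left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the hotplug actions; NONE is deliberately 0. */
enum mock_hp_action {
	MOCK_HP_NONE = 0,
	MOCK_HP_ADD_CPU,
	MOCK_HP_REMOVE_CPU,
};

/* Hypothetical, heavily reduced stand-in for struct kimage. */
struct mock_kimage {
	enum mock_hp_action hp_action;
	enum mock_hp_action seen_by_arch;	/* what the mock arch handler observed */
};

/* Models allocation: zeroed memory, plus the explicit initialization. */
static struct mock_kimage *mock_kimage_alloc(void)
{
	struct mock_kimage *image = calloc(1, sizeof(*image));

	if (!image)
		return NULL;
	image->hp_action = MOCK_HP_NONE;	/* already 0, made explicit */
	return image;
}

/* Mock arch handler: just records the action it was invoked for. */
static void mock_arch_handler(struct mock_kimage *image)
{
	image->seen_by_arch = image->hp_action;
}

/* Models the set / call / reset sequence around the arch handler. */
static void mock_handle_hotplug_event(struct mock_kimage *image,
				      enum mock_hp_action hp_action)
{
	assert(image != NULL);

	image->hp_action = hp_action;		/* record the triggering event */
	mock_arch_handler(image);		/* handler reads image->hp_action */
	image->hp_action = MOCK_HP_NONE;	/* no longer handling an event */
}
```

After mock_handle_hotplug_event() returns, hp_action reads MOCK_HP_NONE
again, while the mock arch handler has seen the real action in between.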

Hope this helps. I'll be posting v20 soon.
Thanks!
eric