Re: [PATCH v3 0/1] xen: fix HVM kexec kernel panic

From: Boris Ostrovsky
Date: Tue Mar 01 2022 - 12:23:42 EST



On 2/28/22 11:56 PM, Dongli Zhang wrote:
Hi Boris,

On 2/28/22 5:18 PM, Dongli Zhang wrote:
Hi Boris,

On 2/28/22 12:45 PM, Boris Ostrovsky wrote:

On 2/25/22 8:17 PM, Dongli Zhang wrote:
Hi Boris,

On 2/25/22 2:39 PM, Boris Ostrovsky wrote:
On 2/24/22 4:50 PM, Dongli Zhang wrote:
This is the v3 of the patch to fix xen kexec kernel panic issue when the
kexec is triggered on VCPU >= 32.

PANIC: early exception 0x0e IP 10:ffffffffa96679b6 error 0 cr2 0x20
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted
5.17.0-rc4xen-00054-gf71077a4d84b-dirty #1
[    0.000000] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 12/15/2020
[    0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
... ...
[    0.000000] RSP: 0000:ffffffffaae03e10 EFLAGS: 00010082 ORIG_RAX:
0000000000000000
[    0.000000] RAX: 0000000000000000 RBX: 0000000000010000 RCX:
0000000000000002
[    0.000000] RDX: 0000000000000003 RSI: ffffffffaac37515 RDI:
0000000000000020
[    0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09:
0000000000000001
[    0.000000] R10: ffffffffaae03df8 R11: ffffffffaae03c68 R12:
0000000040000004
[    0.000000] R13: ffffffffaae03e50 R14: 0000000000000000 R15:
0000000000000000
[    0.000000] FS:  0000000000000000(0000) GS:ffffffffab588000(0000)
knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000000] CR2: 0000000000000020 CR3: 00000000ea410000 CR4:
00000000000406a0
[    0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[    0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[    0.000000] Call Trace:
[    0.000000]  <TASK>
[    0.000000]  ? xen_clocksource_read+0x24/0x40

This is done to set xen_sched_clock_offset which I think will not be used for a
while, until sched_clock is called (and the other two uses are for
suspend/resume)


Can we simply defer 'xen_sched_clock_offset = xen_clocksource_read();' until
after all vcpu areas are properly set? Or are there other uses of
xen_clocksource_read() before ?

I have tested that below patch will panic kdump kernel.



Oh well, so much for that then. Yes, sched_clock() is at least called from
printk path.


I guess we will have to go with v2 then, we don't want to start seeing time
going back, even if only with older hypervisors. The only thing I might ask is
that you roll the logic inside xen_hvm_init_time_ops(). Something like


xen_hvm_init_time_ops()
{
    /*
     * Wait until per_cpu(xen_vcpu, 0) is initialized which may happen
     * later (e.g. when kdump kernel runs on >=MAX_VIRT_CPUS vcpu)
     */
    if (__this_cpu_read(xen_vcpu_nr(0)) == NULL)
        return;

I think you meant __this_cpu_read(xen_vcpu).

I will call xen_hvm_init_time_ops() at both places, and move the logic into
xen_hvm_init_time_ops().

Thank you very much!

Dongli Zhang


How about we do not move the logic into xen_hvm_init_time_ops()?

Suppose we move the logic into xen_hvm_init_time_ops() line 573, the line line
570 might be printed twice.


You would need to make sure the routine is executed only once so something like a local static variable would be needed.




559 void __init xen_hvm_init_time_ops(void)
560 {
561 /*
562 * vector callback is needed otherwise we cannot receive interrupts
563 * on cpu > 0 and at this point we don't know how many cpus are
564 * available.
565 */
566 if (!xen_have_vector_callback)
567 return;
568
569 if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
570 pr_info("Xen doesn't support pvclock on HVM, disable pv timer");
571 return;
572 }
573
574 xen_init_time_common();
575
576 x86_init.timers.setup_percpu_clockev = xen_time_init;
577 x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
578
579 x86_platform.set_wallclock = xen_set_wallclock;
580 }

I feel the code looks better if we keep the logic at caller side. Would you mind
letting me know your feedback?


My preference is to keep logic concentrated in one place whenever possible.


-boris