Re: [PATCH v2 1/4] sched/task: Add the put_task_struct_atomic_safe function

From: luca abeni
Date: Fri Feb 24 2023 - 11:02:00 EST


On Fri, 24 Feb 2023 10:02:40 -0300
Wander Lairson Costa <wander@xxxxxxxxxx> wrote:
[...]
> > [ 1246.556100] =============================
> > [ 1246.559104] [ BUG: Invalid wait context ]
> > [ 1246.562270] 6.2.0 #4 Not tainted
> > [ 1246.564854] -----------------------------
> > [ 1246.567260] swapper/3/0 is trying to lock:
> > [ 1246.568665] ffff8c2c7ebb2c10 (&c->lock){..-.}-{3:3}, at: put_cpu_partial+0x24/0x1c0
> > [ 1246.571325] other info that might help us debug this:
> > [ 1246.573045] context-{2:2}
> > [ 1246.574166] no locks held by swapper/3/0.
> > [ 1246.575434] stack backtrace:
> > [ 1246.576207] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.2.0 #4
> > [ 1246.578184] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > [ 1246.580815] Call Trace:
> > [ 1246.581723] <IRQ>
> > [ 1246.582570] dump_stack_lvl+0x49/0x61
> > [ 1246.583860] __lock_acquire.cold+0xc8/0x31c
> > [ 1246.584923] ? __lock_acquire+0x3be/0x1df0
> > [ 1246.585915] lock_acquire+0xce/0x2f0
> > [ 1246.586819] ? put_cpu_partial+0x24/0x1c0
> > [ 1246.588177] ? lock_is_held_type+0xdb/0x130
> > [ 1246.589519] put_cpu_partial+0x5b/0x1c0
> > [ 1246.590996] ? put_cpu_partial+0x24/0x1c0
> > [ 1246.592212] inactive_task_timer+0x263/0x4c0
> > [ 1246.593509] ? __pfx_inactive_task_timer+0x10/0x10
> > [ 1246.594953] __hrtimer_run_queues+0x1bf/0x470
> > [ 1246.596297] hrtimer_interrupt+0x117/0x250
> > [ 1246.597528] __sysvec_apic_timer_interrupt+0x99/0x270
> > [ 1246.599015] sysvec_apic_timer_interrupt+0x8d/0xc0
> > [ 1246.600416] </IRQ>
> > [ 1246.601170] <TASK>
> > [ 1246.601918] asm_sysvec_apic_timer_interrupt+0x1a/0x20
> > [ 1246.603377] RIP: 0010:default_idle+0xf/0x20
> > [ 1246.604640] Code: f6 5d 41 5c e9 72 4a 6e ff cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 03 52 2a 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
> > [ 1246.609718] RSP: 0018:ffffa1a2c009bed0 EFLAGS: 00000202
> > [ 1246.611259] RAX: ffffffffa4961a60 RBX: ffff8c2c4126b000 RCX: 0000000000000000
> > [ 1246.613230] RDX: 0000000000000000 RSI: ffffffffa510271b RDI: ffffffffa50d5b15
> > [ 1246.615266] RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000001
> > [ 1246.617275] R10: 0000000000000000 R11: ffff8c2c4126b000 R12: ffff8c2c4126b000
> > [ 1246.619318] R13: ffff8c2c4126b000 R14: 0000000000000000 R15: 0000000000000000
> > [ 1246.621293] ? __pfx_default_idle+0x10/0x10
> > [ 1246.622581] default_idle_call+0x71/0x220
> > [ 1246.623790] do_idle+0x210/0x290
> > [ 1246.624827] cpu_startup_entry+0x18/0x20
> > [ 1246.626016] start_secondary+0xf1/0x100
> > [ 1246.627200] secondary_startup_64_no_verify+0xe0/0xeb
> > [ 1246.628707] </TASK>
> >
> >
> > Let me know if you need more information, or
> > I should run other tests/experiments.
> >
>
> This seems to be a different (maybe related?) issue. Would you mind
> sharing your .config and steps to reproduce it?

Ah, sorry then... I probably misunderstood the kernel messages
(in my understanding, this is lockdep complaining because
put_task_struct() - which can take a sleeping lock - is invoked
from a timer callback).
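
To make that reading concrete, here is a small user-space model of the
wait-context comparison behind the "{3:3}" and "context-{2:2}" numbers
in the splat. It is only a sketch under assumptions: the enum values
mirror my reading of enum lockdep_wait_type with
CONFIG_PROVE_RAW_LOCK_NESTING=y, and the names wait_type and
wait_context_ok() are made up for illustration (they are not kernel
symbols).

```c
#include <stdbool.h>

/*
 * Assumed to mirror enum lockdep_wait_type when
 * CONFIG_PROVE_RAW_LOCK_NESTING=y (without that option, CONFIG and
 * SPIN share the same value and the report below cannot trigger).
 */
enum wait_type {
	WAIT_INV = 0,	/* not checked */
	WAIT_FREE,	/* wait-free, e.g. RCU */
	WAIT_SPIN,	/* raw spinlocks; what hardirq context allows */
	WAIT_CONFIG,	/* spinlock_t/local_lock: sleeps on PREEMPT_RT */
	WAIT_SLEEP,	/* always-sleeping locks such as mutexes */
};

/*
 * Hypothetical helper: a lock may be taken only if its wait type does
 * not exceed what the current context permits.
 */
static bool wait_context_ok(enum wait_type context, enum wait_type lock)
{
	return lock <= context;
}
```

With these values, wait_context_ok(WAIT_SPIN, WAIT_CONFIG) is false,
which corresponds to a {3:3} lock (&c->lock) being taken in a {2:2}
context (the hrtimer interrupt), while
wait_context_ok(WAIT_SPIN, WAIT_SPIN) is true.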


Anyway, I am attaching the config (it is basically a "make defconfig;
make kvm_guest.config" with some debug options manually enabled - I
think the relevant one is CONFIG_PROVE_RAW_LOCK_NESTING).


Luca

Attachment: config
Description: Binary data