Recent VFS/LSM patches cause Kernel panic - not syncing: Can't create rootfs

From: Dexuan Cui
Date: Wed Dec 19 2018 - 23:33:47 EST


Hi,
We started to see a "Can't create rootfs" panic with linux-next's
next-20181218 and next-20181219. Note: next-20181217 is good.

Our test team found the first bad commit by git-bisect:
013c7af575e5 ("vfs: Implement a filesystem superblock creation/configuration context")

I had a look, and I think another patch also contributes to the panic:
c36d02347290 ("apparmor: Implement security hooks for the new mount API")

My kernel config for next-20181218 and my dmesg are attached.
The panic reproduces on every boot.

My finding is that the panic happens because the last call in this chain,
start_kernel() -> vfs_caches_init() -> mnt_init() ->
sysfs_init() -> register_filesystem() -> init_mount_tree() ->
vfs_kern_mount(type, 0, "rootfs", NULL) -> vfs_get_tree() ->
security_sb_set_mnt_opts(sb, fc->security, 0, NULL), returns -EOPNOTSUPP:

int security_sb_set_mnt_opts(struct super_block *sb,
                             void *mnt_opts,
                             unsigned long kern_flags,
                             unsigned long *set_kern_flags)
{
        return call_int_hook(sb_set_mnt_opts,
                             mnt_opts ? -EOPNOTSUPP : 0, sb,
                             mnt_opts, kern_flags, set_kern_flags);
}

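This means that fc->security is not NULL when
security_sb_set_mnt_opts(sb, fc->security, 0, NULL) is called, and that the
security_hook_heads.FUNC list (FUNC being sb_set_mnt_opts here) is empty in
call_int_hook(), so the initial value -EOPNOTSUPP is returned unchanged.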
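
To make the reasoning above concrete, here is a small user-space model of that behaviour. It is only a sketch: struct hook_list, model_call_int_hook(), model_sb_set_mnt_opts() and accept_opts() are hypothetical names invented for illustration, not kernel APIs, and the real call_int_hook() is a macro that walks a kernel hlist rather than an array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one registered sb_set_mnt_opts hook. */
typedef int (*sb_set_mnt_opts_fn)(void *sb, void *mnt_opts);

struct hook_list {
	sb_set_mnt_opts_fn hooks[4];
	size_t count;	/* 0 models an empty security_hook_heads.FUNC */
};

/*
 * Model of call_int_hook(FUNC, IRC, ...): start from the initial return
 * code and let each registered hook overwrite it, stopping at the first
 * non-zero result. With no hooks, the initial code is returned as is.
 */
static int model_call_int_hook(const struct hook_list *list, int irc,
			       void *sb, void *mnt_opts)
{
	int rc = irc;
	size_t i;

	assert(list != NULL);
	for (i = 0; i < list->count; i++) {
		rc = list->hooks[i](sb, mnt_opts);
		if (rc != 0)
			break;
	}
	return rc;
}

/* Mirrors the shape of security_sb_set_mnt_opts() quoted above. */
static int model_sb_set_mnt_opts(const struct hook_list *list,
				 void *sb, void *mnt_opts)
{
	return model_call_int_hook(list, mnt_opts ? -EOPNOTSUPP : 0,
				   sb, mnt_opts);
}

/* Hypothetical hook that accepts any options. */
static int accept_opts(void *sb, void *mnt_opts)
{
	(void)sb;
	(void)mnt_opts;
	return 0;
}
```

In this model, an empty list with NULL options yields 0, an empty list with non-NULL options yields -EOPNOTSUPP, and registering accept_opts() makes the non-NULL case succeed. That matches the combination described above: a non-NULL fc->security together with no registered sb_set_mnt_opts hook.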

fc->security is assigned in this function (see the line "fc->security = afc;"):

static int apparmor_fs_context_parse_param(struct fs_context *fc,
                                           struct fs_parameter *param)
{
        struct apparmor_fs_context *afc = fc->security;
        const char *value;
        size_t space = 0, k_len = strlen(param->key), len = k_len, v_len;
        char *p, *q;

        if (!afc) {
                afc = kzalloc(sizeof(*afc), GFP_KERNEL);
                fc->security = afc;
        }

apparmor_fs_context_parse_param() was added recently in:
c36d02347290 ("apparmor: Implement security hooks for the new mount API")
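
As a rough illustration of why fc->security can end up non-NULL even for a kernel-internal mount, the following user-space sketch models only the prologue quoted above. The names model_fs_context, model_lsm_context and model_parse_param() are hypothetical, and calloc() stands in for kzalloc(). It assumes, as far as the quoted fragment shows, that the allocation happens before the parameter key is examined.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins; the real structures have many more fields. */
struct model_fs_context {
	void *security;	/* models fc->security */
};

struct model_lsm_context {
	int unused;
};

/*
 * Models the quoted prologue: the first call allocates the LSM context
 * regardless of which key is being parsed. Returns 0 on success and -1
 * on allocation failure.
 */
static int model_parse_param(struct model_fs_context *fc, const char *key)
{
	assert(fc != NULL);
	(void)key;	/* the key is not inspected before allocating */

	if (!fc->security) {
		fc->security = calloc(1, sizeof(struct model_lsm_context));
		if (!fc->security)
			return -1;
	}
	return 0;
}
```

Under these assumptions, a single call with any key (for example a hypothetical "source" parameter) leaves the security pointer set, which is the precondition for the -EOPNOTSUPP result discussed earlier.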

Unfortunately, I'm not familiar with LSM, so I'm not sure whether the bug is in
the VFS or the LSM code. I'm Cc'ing the relevant people in the hope that
somebody can provide a quick fix.

Thanks!
-- Dexuan

Attachment: config.txt.tar.gz

[ 0.000000] Linux version 4.20.0-rc7-next-20181218+ (root@decui-1604) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)) #4 SMP Wed Dec 19 19:44:22 PST 2018
[ 0.000000] Command line: BOOT_IMAGE=/boot/hv/bzImage root=UUID=9bb51693-01d7-4323-a5e4-54b065696092 ro ignore_loglevel console=tty1 console=ttyS0
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Hygon HygonGenuine
[ 0.000000] Centaur CentaurHauls
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
[ 0.000000] x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
[ 0.000000] x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000f7feffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000f7ff0000-0x00000000f7ffefff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x00000000f7fff000-0x00000000f7ffffff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000287ffffff] usable
[ 0.000000] printk: debug: ignoring loglevel setting.
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.3 present.
[ 0.000000] DMI: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 05/18/2018
[ 0.000000] Hypervisor detected: Microsoft Hyper-V
[ 0.000000] Hyper-V: features 0x2e7f, hints 0x20c2c
[ 0.000000] Hyper-V Host Build:17713-10.0-1-0.1044
[ 0.000000] Hyper-V: LAPIC Timer Frequency: 0xc3500
[ 0.000000] tsc: Marking TSC unstable due to running on Hyper-V
[ 0.000000] Hyper-V: Using hypercall for remote TLB flush
[ 0.000000] tsc: Detected 3408.000 MHz processor
[ 0.001231] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.001232] e820: remove [mem 0x000a0000-0x000fffff] usable
[ 0.001235] last_pfn = 0x288000 max_arch_pfn = 0x400000000
[ 0.001279] MTRR default type: uncachable
[ 0.001280] MTRR fixed ranges enabled:
[ 0.001281] 00000-9FFFF write-back
[ 0.001282] A0000-DFFFF uncachable
[ 0.001282] E0000-FFFFF write-back
[ 0.001283] MTRR variable ranges enabled:
[ 0.001296] 0 base 0000000000 mask 7F00000000 write-back
[ 0.001297] 1 base 0100000000 mask 7000000000 write-back
[ 0.001297] 2 disabled
[ 0.001298] 3 disabled
[ 0.001298] 4 disabled
[ 0.001299] 5 disabled
[ 0.001299] 6 disabled
[ 0.001299] 7 disabled
[ 0.001309] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.001335] last_pfn = 0xf7ff0 max_arch_pfn = 0x400000000
[ 0.007052] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at [(____ptrval____)]
[ 0.007185] check: Scanning 1 areas for low memory corruption
[ 0.007187] Base memory trampoline at [(____ptrval____)] 99000 size 24576
[ 0.007656] Using GB pages for direct mapping
[ 0.007662] BRK [0x265e02000, 0x265e02fff] PGTABLE
[ 0.007680] BRK [0x265e03000, 0x265e03fff] PGTABLE
[ 0.007680] BRK [0x265e04000, 0x265e04fff] PGTABLE
[ 0.007698] BRK [0x265e05000, 0x265e05fff] PGTABLE
[ 0.007713] BRK [0x265e06000, 0x265e06fff] PGTABLE
[ 0.007787] BRK [0x265e07000, 0x265e07fff] PGTABLE
[ 0.007801] BRK [0x265e08000, 0x265e08fff] PGTABLE
[ 0.007822] RAMDISK: [mem 0x15bc2000-0x26dd8fff]
[ 0.007912] ACPI: Early table checksum verification disabled
[ 0.007944] ACPI: RSDP 0x00000000000F5BF0 000014 (v00 ACPIAM)
[ 0.007947] ACPI: RSDT 0x00000000F7FF0000 000040 (v01 VRTUAL MICROSFT 05001818 MSFT 00000097)
[ 0.007950] ACPI: FACP 0x00000000F7FF0200 000081 (v02 VRTUAL MICROSFT 05001818 MSFT 00000097)
[ 0.007953] ACPI: DSDT 0x00000000F7FF1D24 003CD5 (v01 MSFTVM MSFTVM02 00000002 INTL 02002026)
[ 0.007954] ACPI: FACS 0x00000000F7FFF000 000040
[ 0.007956] ACPI: WAET 0x00000000F7FF1A80 000028 (v01 VRTUAL MICROSFT 05001818 MSFT 00000097)
[ 0.007958] ACPI: SLIC 0x00000000F7FF1AC0 000176 (v01 VRTUAL MICROSFT 05001818 MSFT 00000097)
[ 0.007959] ACPI: OEM0 0x00000000F7FF1CC0 000064 (v01 VRTUAL MICROSFT 05001818 MSFT 00000097)
[ 0.007961] ACPI: SRAT 0x00000000F7FF0800 000150 (v02 VRTUAL MICROSFT 00000001 MSFT 00000001)
[ 0.007962] ACPI: APIC 0x00000000F7FF0300 000452 (v01 VRTUAL MICROSFT 05001818 MSFT 00000097)
[ 0.007964] ACPI: OEMB 0x00000000F7FFF040 000064 (v01 VRTUAL MICROSFT 05001818 MSFT 00000097)
[ 0.007969] ACPI: Local APIC address 0xfee00000
[ 0.007999] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[ 0.008000] SRAT: PXM 0 -> APIC 0x01 -> Node 0
[ 0.008000] SRAT: PXM 0 -> APIC 0x02 -> Node 0
[ 0.008001] SRAT: PXM 0 -> APIC 0x03 -> Node 0
[ 0.008001] SRAT: PXM 0 -> APIC 0x04 -> Node 0
[ 0.008001] SRAT: PXM 0 -> APIC 0x05 -> Node 0
[ 0.008002] SRAT: PXM 0 -> APIC 0x06 -> Node 0
[ 0.008002] SRAT: PXM 0 -> APIC 0x07 -> Node 0
[ 0.008004] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0xf7ffffff] hotplug
[ 0.008005] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x287ffffff] hotplug
[ 0.008005] ACPI: SRAT: Node 0 PXM 0 [mem 0x288000000-0xfdfffffff] hotplug
[ 0.008006] ACPI: SRAT: Node 0 PXM 0 [mem 0x1000000000-0xffffffffff] hotplug
[ 0.008008] NUMA: Node 0 [mem 0x00000000-0xf7ffffff] + [mem 0x100000000-0x287ffffff] -> [mem 0x00000000-0x287ffffff]
[ 0.008013] NODE_DATA(0) allocated [mem 0x287fd5000-0x287ffffff]
[ 0.008156] Zone ranges:
[ 0.008157] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.008158] DMA32 [mem 0x0000000001000000-0x00000000ffffffff]
[ 0.008159] Normal [mem 0x0000000100000000-0x0000000287ffffff]
[ 0.008159] Device empty
[ 0.008160] Movable zone start for each node
[ 0.008162] Early memory node ranges
[ 0.008163] node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.008163] node 0: [mem 0x0000000000100000-0x00000000f7feffff]
[ 0.008164] node 0: [mem 0x0000000100000000-0x0000000287ffffff]
[ 0.008166] Zeroed struct page in unavailable ranges: 114 pages
[ 0.008166] Initmem setup node 0 [mem 0x0000000000001000-0x0000000287ffffff]
[ 0.008167] On node 0 totalpages: 2621326
[ 0.008168] DMA zone: 64 pages used for memmap
[ 0.008168] DMA zone: 21 pages reserved
[ 0.008169] DMA zone: 3998 pages, LIFO batch:0
[ 0.008196] DMA32 zone: 16320 pages used for memmap
[ 0.008196] DMA32 zone: 1011696 pages, LIFO batch:63
[ 0.015082] Normal zone: 25088 pages used for memmap
[ 0.015083] Normal zone: 1605632 pages, LIFO batch:63
[ 0.030785] ACPI: PM-Timer IO Port: 0x408
[ 0.030787] ACPI: Local APIC address 0xfee00000
[ 0.030793] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[ 0.032161] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[ 0.032163] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.032164] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.032165] ACPI: IRQ0 used by override.
[ 0.032166] ACPI: IRQ9 used by override.
[ 0.032168] Using ACPI (MADT) for SMP configuration information
[ 0.032173] smpboot: Allowing 8 CPUs, 0 hotplug CPUs
[ 0.032179] [mem 0xf8000000-0xffffffff] available for PCI devices
[ 0.032192] Booting paravirtualized kernel on bare hardware
[ 0.032194] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.032200] random: get_random_bytes called from start_kernel+0x93/0x519 with crng_init=0
[ 0.032203] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:8 nr_cpu_ids:8 nr_node_ids:1
[ 0.032301] percpu: Embedded 45 pages/cpu @(____ptrval____) s147456 r8192 d28672 u262144
[ 0.032304] pcpu-alloc: s147456 r8192 d28672 u262144 alloc=1*2097152
[ 0.032305] pcpu-alloc: [0] 0 1 2 3 4 5 6 7
[ 0.032313] Hyper-V: PV spinlocks enabled
[ 0.032315] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
[ 0.032317] Built 1 zonelists, mobility grouping on. Total pages: 2579833
[ 0.032318] Policy zone: Normal
[ 0.032319] Kernel command line: BOOT_IMAGE=/boot/hv/bzImage root=UUID=9bb51693-01d7-4323-a5e4-54b065696092 ro ignore_loglevel console=tty1 console=ttyS0
[ 0.034480] Calgary: detecting Calgary via BIOS EBDA area
[ 0.034482] Calgary: Unable to locate Rio Grande table in EBDA - bailing!
[ 0.051099] Memory: 9940104K/10485304K available (14341K kernel code, 2169K rwdata, 4120K rodata, 2240K init, 5704K bss, 545200K reserved, 0K cma-reserved)
[ 0.051176] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
[ 0.051179] Kernel/User page tables isolation: enabled
[ 0.051189] ftrace: allocating 40581 entries in 159 pages
[ 0.062388] rcu: Hierarchical RCU implementation.
[ 0.062389] rcu: RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=8.
[ 0.062390] Tasks RCU enabled.
[ 0.062390] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.062391] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[ 0.064322] NR_IRQS: 524544, nr_irqs: 488, preallocated irqs: 16
[ 0.065659] Console: colour dummy device 80x25
[ 0.065777] printk: console [tty1] enabled
[ 1.250563] printk: console [ttyS0] enabled
[ 1.255152] ACPI: Core revision 20181213
[ 1.260814] APIC: Switch to symmetric I/O mode setup
[ 1.267937] Hyper-V: Using IPI hypercalls
[ 1.272935] Hyper-V: Using MSR based APIC access
[ 1.278465] clocksource: hyperv_clocksource_tsc_page: mask: 0xffffffffffffffff max_cycles: 0x24e6a1710, max_idle_ns: 440795202120 ns
[ 1.323738] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 1.333824] Calibrating delay loop (skipped), value calculated using timer frequency.. 6816.00 BogoMIPS (lpj=13632000)
[ 1.337817] pid_max: default: 32768 minimum: 301
[ 1.341857] LSM: Security Framework initializing
[ 1.345815] Yama: becoming mindful.
[ 1.349857] AppArmor: AppArmor initialized
[ 1.355760] Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)
[ 1.358833] Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)
[ 1.361869] Mount-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 1.365861] Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 1.369878] Kernel panic - not syncing: Can't create rootfs
[ 1.373811] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc7-next-20181218+ #4
[ 1.373811] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 05/18/2018
[ 1.373811] Call Trace:
[ 1.373811] dump_stack+0x4d/0x65
[ 1.373811] panic+0xf8/0x294
[ 1.373811] ? put_fs_context+0xdb/0x140
[ 1.373811] mnt_init+0x17c/0x1fe
[ 1.373811] ? __percpu_counter_init+0x21/0x40
[ 1.373811] vfs_caches_init+0xce/0xda
[ 1.373811] start_kernel+0x4b2/0x519
[ 1.373811] x86_64_start_reservations+0x24/0x26
[ 1.373811] x86_64_start_kernel+0x74/0x77
[ 1.373811] secondary_startup_64+0xa4/0xb0
[ 1.373811] ---[ end Kernel panic - not syncing: Can't create rootfs ]---
[ 2.345813] random: fast init done