Re: [PATCH] crypto: algapi - guard against uninitialized spawn list in crypto_remove_spawns

From: Eric Biggers
Date: Tue Jun 25 2019 - 12:40:59 EST


Hi Michal,

On Tue, Jun 25, 2019 at 09:16:24AM +0200, Michal Suchanek wrote:
> Reportedly on Linux 4.12 the LTP testsuite crashes at pcrypt_aead01 infrequently.
>
> To get it reproduce more frequently I tried
>
> n=0 ; while true ; do /opt/ltp/testcases/bin/pcrypt_aead01 >& /dev/null ; n=$(expr $n + 1) ; echo -ne $n\\r ; done
>
> but this is quite stable. However, holding ^C in the terminal where the loop is running tends to trigger the crash.
>
> The backtrace is:
>
> [ 100.615804] Unable to handle kernel paging request for data at address 0x00000000
> [ 100.615876] Faulting instruction address: 0xc000000000520e7c
> [ 100.615943] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 100.616001] SMP NR_CPUS=2048
> [ 100.616002] NUMA
> [ 100.616030] pSeries
> [ 100.616054] Modules linked in: authenc pcrypt crypto_user kvm_pr kvm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink ip_tables x_tables af_packet rtc_generic vmx_crypto ibmveth(X) gf128mul btrfs xor raid6_pq sd_mod ibmvscsi(X) scsi_transport_srp crc32c_vpmsum sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
> [ 100.616478] Supported: Yes, External
> [ 100.616509] CPU: 5 PID: 6270 Comm: pcrypt_aead01 Tainted: G 4.12.14-150.22-default #1 SLE15
> [ 100.616632] task: c000000595084d80 task.stack: c0000005be6dc000
> [ 100.616708] NIP: c000000000520e7c LR: c000000000521e3c CTR: c000000000521de0
> [ 100.616801] REGS: c0000005be6df620 TRAP: 0300 Tainted: G (4.12.14-150.22-default)
> [ 100.616906] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>
> [ 100.616912] CR: 24002844 XER: 20040000
> [ 100.617003] CFAR: c000000000008860 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
> GPR00: c0000005a331f810 c0000005be6df8a0 c00000000119aa00 c0000005a331f800
> GPR04: c0000005be6df930 c0000005be6df8c0 c0000005be6df8d0 0000000000000000
> GPR08: 7269632929290000 c0000005a331f800 0000000000000000 0000000000000000
> GPR12: c000000000521de0 c000000007a33700 00000001271a0ee0 00007fffcb9e7bb8
> GPR16: 00000001271c2d80 00000001271c2d88 00007fffcb9e7a50 00007fffcb9e7a44
> GPR20: 00007fffcb9e7a98 00007fffcb9e7a60 0000000000000010 0000000000000010
> GPR24: 0000000000000000 0000000000000000 fffffffffffff000 c0000005be6dfaf0
> GPR28: c0000005b9929d00 0000000000000c93 c0000005be6df930 c0000005be6df8e0
> [ 100.617774] NIP [c000000000520e7c] crypto_remove_spawns+0x6c/0x2e0
> [ 100.617816] LR [c000000000521e3c] crypto_unregister_instance+0x5c/0xa0
> [ 100.617881] Call Trace:
> [ 100.617903] [c0000005be6df8a0] [c0000005b9929d00] 0xc0000005b9929d00 (unreliable)
> [ 100.617971] [c0000005be6df910] [0000000000000000] (null)
> [ 100.618021] [c0000005be6df960] [d0000000098d0894] crypto_del_alg+0xdc/0x110 [crypto_user]
> [ 100.618119] [c0000005be6df990] [d0000000098d0b58] crypto_user_rcv_msg+0xe0/0x260 [crypto_user]
> [ 100.618222] [c0000005be6dfa30] [c00000000086d678] netlink_rcv_skb+0x78/0x170
> [ 100.618309] [c0000005be6dfaa0] [d0000000098d0064] crypto_netlink_rcv+0x4c/0x80 [crypto_user]
> [ 100.618407] [c0000005be6dfad0] [c00000000086cb98] netlink_unicast+0x208/0x2f0
> [ 100.618488] [c0000005be6dfb40] [c00000000086d170] netlink_sendmsg+0x380/0x440
> [ 100.618582] [c0000005be6dfbd0] [c0000000007e9ba4] sock_sendmsg+0x64/0x90
> [ 100.618650] [c0000005be6dfc00] [c0000000007eb94c] ___sys_sendmsg+0x2cc/0x330
> [ 100.618710] [c0000005be6dfd90] [c0000000007ed02c] __sys_sendmsg+0x5c/0xc0
> [ 100.618766] [c0000005be6dfe30] [c00000000000b188] system_call+0x3c/0x130
> [ 100.618822] Instruction dump:
> [ 100.618839] e9430010 83a90020 38a10020 fbe10040 fbe10048 f8c10030 f8c10038 f8a10020
> [ 100.618902] f8a10028 38030010 7fa05040 7d475378 <e90a0000> 419e0064 60000000 60000000
> [ 100.618980] ---[ end trace 60475621348ca387 ]---
>
> The code looks like this:
>
> 0xc000000000520e10 <+0>: c8 00 4c 3c addis r2,r12,200
> 0xc000000000520e14 <+4>: f0 9b 42 38 addi r2,r2,-25616
> 0xc000000000520e18 <+8>: a6 02 08 7c mflr r0
> 0xc000000000520e1c <+12>: 00 00 00 60 nop
> 0xc000000000520e20 <+16>: 79 2b ab 7c mr. r11,r5
> 0xc000000000520e24 <+20>: f0 ff c1 fb std r30,-16(r1)
> 0xc000000000520e28 <+24>: e8 ff a1 fb std r29,-24(r1)
> 0xc000000000520e2c <+28>: f8 ff e1 fb std r31,-8(r1)
> 0xc000000000520e30 <+32>: 91 ff 21 f8 stdu r1,-112(r1)
> 0xc000000000520e34 <+36>: 78 1b 69 7c mr r9,r3
> 0xc000000000520e38 <+40>: 78 23 9e 7c mr r30,r4
> 0xc000000000520e3c <+44>: 08 00 82 41 beq 0xc000000000520e44 <crypto_remove_spawns+52>
> 0xc000000000520e40 <+48>: 78 5b 69 7d mr r9,r11
> 0xc000000000520e44 <+52>: 40 00 e1 3b addi r31,r1,64
> 0xc000000000520e48 <+56>: 30 00 c1 38 addi r6,r1,48
> # 0xc000000000520e4c <+60>: 10 00 43 e9 ld r10,16(r3)
> 0xc000000000520e50 <+64>: 20 00 a9 83 lwz r29,32(r9)
> 0xc000000000520e54 <+68>: 20 00 a1 38 addi r5,r1,32
> 0xc000000000520e58 <+72>: 40 00 e1 fb std r31,64(r1)
> 0xc000000000520e5c <+76>: 48 00 e1 fb std r31,72(r1)
> 0xc000000000520e60 <+80>: 30 00 c1 f8 std r6,48(r1)
> 0xc000000000520e64 <+84>: 38 00 c1 f8 std r6,56(r1)
> 0xc000000000520e68 <+88>: 20 00 a1 f8 std r5,32(r1)
> 0xc000000000520e6c <+92>: 28 00 a1 f8 std r5,40(r1)
> 0xc000000000520e70 <+96>: 10 00 03 38 addi r0,r3,16
> & 0xc000000000520e74 <+100>: 40 50 a0 7f cmpld cr7,r0,r10
> 0xc000000000520e78 <+104>: 78 53 47 7d mr r7,r10
> * 0xc000000000520e7c <+108>: 00 00 0a e9 ld r8,0(r10)
> 0xc000000000520e80 <+112>: 64 00 9e 41 beq cr7,0xc000000000520ee4 <crypto_remove_spawns+212>
>
> #) This looks like alg->cra_users.next is loaded to r10
> &) This looks like r10 is compared with &alg->cra_users calculated on the line
> above to terminate the loop
> *) This looks like *alg->cra_users.next loaded into r8 which causes the null
> pointer dereference
>
> So the fixup needs to be applied to the first dereference of
> alg->cra_users.next as well to prevent crash.
>
> Fixes: 9a00674213a3 ("crypto: algapi - fix NULL dereference in crypto_remove_spawns()")
>
> Reported-by: chetjain@xxxxxxxxxx
> Signed-off-by: Michal Suchanek <msuchanek@xxxxxxx>
> ---
> I cannot really test if this fix is effective because the crash is some
> heisenbug that heavily depends on timing. When the bug is not triggered it does
> not really mean anything. It is also qestionable if we should be getting these
> algs with uninitialized spawns.
>
> crypto/algapi.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/crypto/algapi.c b/crypto/algapi.c
> index 313a7682cef1..82125b82ffba 100644
> --- a/crypto/algapi.c
> +++ b/crypto/algapi.c
> @@ -151,6 +151,18 @@ void crypto_remove_spawns(struct crypto_alg *alg, struct list_head *list,
> LIST_HEAD(top);
>
> spawns = &alg->cra_users;
> +
> + /*
> + * We may encounter an unregistered instance here, since an instance's
> + * spawns are set up prior to the instance being registered.
> + * An unregistered instance will have NULL ->cra_users.next, since
> + * ->cra_users isn't properly initialized until registration. But an
> + * unregistered instance cannot have any users, so treat it the same as
> + * ->cra_users being empty.
> + */
> + if (spawns->next == NULL)
> + return;
> +
> list_for_each_entry_safe(spawn, n, spawns, list) {
> if ((spawn->alg->cra_flags ^ new_type) & spawn->mask)
> continue;
> @@ -177,15 +189,7 @@ void crypto_remove_spawns(struct crypto_alg *alg, struct list_head *list,
> spawn->alg = NULL;
> spawns = &inst->alg.cra_users;
>
> - /*
> - * We may encounter an unregistered instance here, since
> - * an instance's spawns are set up prior to the instance
> - * being registered. An unregistered instance will have
> - * NULL ->cra_users.next, since ->cra_users isn't
> - * properly initialized until registration. But an
> - * unregistered instance cannot have any users, so treat
> - * it the same as ->cra_users being empty.
> - */
> + /* Guard against unregistered instance */
> if (spawns->next == NULL)
> break;
> }
> --
> 2.21.0
>

The stack trace shows that crypto_remove_spawns() is being called from
crypto_unregister_instance(). Therefore, the instance should already be
registered and have initialized cra_users. Now, I don't claim to understand the
spawn lists stuff that well, so I could have missed something; but if there *is*
a bug, I'd like to see a proper explanation.

Did you check whether this is actually reproducible on mainline, and not just
the SUSE v4.12 based kernel?

- Eric