[PATCH] radix tree: Don't return internal entries on lookup

From: Greg Kurz
Date: Thu Dec 06 2018 - 03:10:27 EST


Commit 66ee620f06f9 ("idr: Permit any valid kernel pointer to
be stored") changed the radix tree lookup so that it stops when
reaching the bottom of the tree. But radix_tree_descend() may have
changed the node variable to point to an internal entry which then
gets returned to the caller and bad things may happen.

For example, the OCXL driver can easily crash when running tests
that cause radix_tree_lookup() to return RADIX_TREE_RETRY at some
point instead of NULL or a valid pointer, as shown below:

Unable to handle kernel paging request for data at address 0x00000406
Faulting instruction address: 0xc0000000006f6ca0
Oops: Kernel access of bad area, sig: 7 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in:
CPU: 10 PID: 0 Comm: swapper/10 Not tainted 4.20.0-rc5-fxb-dirty #1
NIP: c0000000006f6ca0 LR: c0000000006f6c90 CTR: c0000000006f6b80
REGS: c0000007bff23a80 TRAP: 0300 Not tainted (4.20.0-rc5-fxb-dirty)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48004422 XER: 20040000
CFAR: c000000000aebbe4 DAR: 0000000000000406 DSISR: 00080000 IRQMASK: 1
GPR00: c000000712ff0a38 c0000007bff23d10 c000000001025a00 0000000000000406
GPR04: 0000000000000000 0000000000000000 0000000000000000 c0000007182f8d70
GPR08: 0000000000000406 c0000007182f8d98 00000000004d0008 fffffffffffffffd
GPR12: 0000000000000040 c0000007bfff4200 c0000000047d7f90 000000000a205200
GPR16: 0000000000000000 0000000000000001 0000000000000008 c000000000044480
GPR20: c000000001060174 0000000000000001 0000000000000002 0000000000000001
GPR24: 0000000000000097 c00000002d613000 00007fff92230000 10000000020000b4
GPR28: c00000002d800000 0000000000000000 000000004d000800 c000000712ff0a00
NIP [c0000000006f6ca0] xsl_fault_handler+0x120/0x310
LR [c0000000006f6c90] xsl_fault_handler+0x110/0x310
Call Trace:
[c0000007bff23d10] [c0000000006f6d7c] xsl_fault_handler+0x1fc/0x310 (unreliable)
[c0000007bff23da0] [c000000000164e8c] __handle_irq_event_percpu+0x9c/0x2f0
[c0000007bff23e60] [c000000000165124] handle_irq_event_percpu+0x44/0xc0
[c0000007bff23ea0] [c000000000165204] handle_irq_event+0x64/0xb0
[c0000007bff23ed0] [c00000000016af68] handle_fasteoi_irq+0xd8/0x230
[c0000007bff23f00] [c00000000016356c] generic_handle_irq+0x4c/0x70
[c0000007bff23f20] [c00000000001a394] __do_irq+0xa4/0x210
[c0000007bff23f90] [c00000000002e3f4] call_do_irq+0x14/0x24
[c0000000047d7a30] [c00000000001a59c] do_IRQ+0x9c/0x130
[c0000000047d7a80] [c000000000009ce4] h_virt_irq_common+0x124/0x130
--- interrupt: ea1 at snooze_loop+0x138/0x280
LR = snooze_loop+0x260/0x280
[c0000000047d7d80] [c0000000010600b0] cpu_idle_force_poll+0x0/0x8 (unreliable)
[c0000000047d7dc0] [c000000000904b98] cpuidle_enter_state+0xa8/0x4c0
[c0000000047d7e30] [c00000000012c3dc] call_cpuidle+0x4c/0x80
[c0000000047d7e50] [c00000000012c98c] do_idle+0x2cc/0x370
[c0000000047d7ec0] [c00000000012cc88] cpu_startup_entry+0x38/0x40
[c0000000047d7ef0] [c000000000046b54] start_secondary+0x664/0x670
[c0000000047d7f90] [c00000000000ac70] start_secondary_prolog+0x10/0x14
Instruction dump:
418201c8 7ba93e24 7fa4eb78 387f0030 fbc10080 7f9c4a14 83dc003c 483f4fed
60000000 7c681b79 41820058 57cac03e <e9280000> 53ca421e 53ca463e 7d4707b4
---[ end trace d17a5522f20aacd2 ]---

Kernel panic - not syncing: Aiee, killing interrupt handler!

Code is:

static irqreturn_t xsl_fault_handler(int irq, void *data)
{
...
rcu_read_lock();
pe_data = radix_tree_lookup(&spa->pe_tree, pe_handle);
if (!pe_data) {
...
}
WARN_ON(pe_data->mm->context.id != pid);

pe_data is equal to R3 which is the return value of radix_tree_lookup(),
ie, 0x406 == RADIX_TREE_RETRY.

Fix this by not breaking out of the loop if the node points to an internal
entry.

Fixes: 66ee620f06f9 ("idr: Permit any valid kernel pointer to be stored")
Signed-off-by: Greg Kurz <groug@xxxxxxxx>
---
lib/radix-tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 1106bb6aa01e..462a71962a0e 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -789,7 +789,7 @@ void *__radix_tree_lookup(const struct radix_tree_root *root,
parent = entry_to_node(node);
offset = radix_tree_descend(parent, &node, index);
slot = parent->slots + offset;
- if (parent->shift == 0)
+ if (!radix_tree_is_internal_node(node) && parent->shift == 0)
break;
}