Re: [RFC] sunrpc: Fix race between work-queue and rpc_killall_tasks.

From: Ben Greear
Date: Tue Jul 12 2011 - 13:15:40 EST


On 07/08/2011 03:14 PM, Myklebust, Trond wrote:

[<ffffffff81105907>] print_trailer+0x131/0x13a
[<ffffffff81105945>] object_err+0x35/0x3e
[<ffffffff811077b3>] verify_mem_not_deleted+0x7a/0xb7
[<ffffffffa02891e5>] rpcb_getport_done+0x23/0x126 [sunrpc]
[<ffffffffa02810df>] rpc_exit_task+0x3f/0x6d [sunrpc]
[<ffffffffa02814d8>] __rpc_execute+0x80/0x253 [sunrpc]
[<ffffffffa02816ed>] ? rpc_execute+0x42/0x42 [sunrpc]
[<ffffffffa02816fd>] rpc_async_schedule+0x10/0x12 [sunrpc]
[<ffffffff81061343>] process_one_work+0x230/0x41d
[<ffffffff8106128e>] ? process_one_work+0x17b/0x41d
[<ffffffff8106379f>] worker_thread+0x133/0x217
[<ffffffff8106366c>] ? manage_workers+0x191/0x191
[<ffffffff81066f9c>] kthread+0x7d/0x85
[<ffffffff81485ee4>] kernel_thread_helper+0x4/0x10
[<ffffffff8147f0d8>] ? retint_restore_args+0x13/0x13
[<ffffffff81066f1f>] ? __init_kthread_worker+0x56/0x56
[<ffffffff81485ee0>] ? gs_change+0x13/0x13

The calldata gets freed in rpc_final_put_task(), which should never be run
while the task is still referenced in __rpc_execute().

IOW: it should be impossible to call rpc_exit_task() after rpc_final_put_task().

I added lots of locking around the calldata, work-queue logic, and such, and
the problem still persists without hitting any of the debug warnings or
poisoned values I put in. It almost seems like tk_calldata is simply being
assigned to two different tasks.

While poking through the code, I noticed that 'map' is static in
rpcb_getport_async.

That would seem to cause problems if two threads called this function at
the same time, possibly causing tk_calldata to be assigned to two different
tasks?
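
To make the concern concrete, here is a minimal userspace sketch of the
pattern. It is not the sunrpc code: demo_task, setup_static, setup_local and
the two shared_with_* helpers are hypothetical names made up for illustration,
and the "second thread" is modelled deterministically by a nested call placed
between the allocation and the assignment. It only shows that a static local
pointer is a single slot shared by all callers, so an interleaved caller can
overwrite it before the first caller uses it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for illustration; not the sunrpc task type. */
struct demo_task { void *calldata; };

static struct demo_task task_a, task_b;

/* Suspected pattern: 'map' is static, so every caller shares one slot. */
static void setup_static(struct demo_task *t, int interleave)
{
	static int *map;

	map = malloc(sizeof(*map));
	assert(map != NULL);
	if (interleave)
		setup_static(&task_b, 0);	/* models a second thread running here */
	t->calldata = map;	/* may now be the other caller's allocation */
}

/* Same logic with an automatic variable: each caller has its own 'map'. */
static void setup_local(struct demo_task *t, int interleave)
{
	int *map;

	map = malloc(sizeof(*map));
	assert(map != NULL);
	if (interleave)
		setup_local(&task_b, 0);
	t->calldata = map;
}

/* Each helper returns 1 if both tasks ended up with the same pointer.
 * Allocations are intentionally never freed in this short demo. */
int shared_with_static(void)
{
	setup_static(&task_a, 1);
	return task_a.calldata == task_b.calldata;
}

int shared_with_local(void)
{
	setup_local(&task_a, 1);
	return task_a.calldata == task_b.calldata;
}
```

With the static variable, both tasks end up pointing at the second
allocation, and the first allocation is lost. With the automatic variable
each task keeps its own pointer. Whether this interleaving can actually
happen in rpcb_getport_async is exactly the open question.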

Any idea why it is static?

I'm going to start another test run with 'map' made non-static
to see if that resolves things...

Thanks,
Ben

--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com
