[PATCH] fs: Fix ioctx lookup races with io_destroy() in AIO

From: Jan Kara
Date: Tue Jan 18 2011 - 18:37:48 EST


The AIO code doesn't implement the RCU lookup pattern properly.
The rcu_read_lock() in lookup_ioctx() does not prevent the refcount
from going to zero, so we might take a reference on a zero-count
(possibly already freed) ioctx:

CPU1					CPU2
lookup_ioctx()
  hlist_for_each_entry_rcu()
    if (ctx->user_id == ctx_id && !ctx->dead)
					io_destroy()
					  free ioctx
    get_ioctx(ctx);
      and return freed memory to the innocent caller

Close this race by calling synchronize_rcu() in io_destroy().

Another race occurs when io_submit() races with io_destroy():

CPU1					CPU2
io_submit()
  do_io_submit()
    ...
    ctx = lookup_ioctx(ctx_id);
					io_destroy()
    Now do_io_submit() holds the last reference to ctx.
    ...
    queue new AIO
    put_ioctx(ctx) - frees ctx with active AIOs

We solve this issue by checking whether ctx is being destroyed in the
AIO submission path after adding the new AIO to the ctx. Then we are
guaranteed that either io_destroy() waits for the new AIO or we see
that the ctx is being destroyed and bail out.

Signed-off-by: Jan Kara <jack@xxxxxxx>
---
fs/aio.c | 21 +++++++++++++++++++++
1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 8c8f6c5..59ac692 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1225,6 +1225,17 @@ static void io_destroy(struct kioctx *ioctx)
 	if (likely(!was_dead))
 		put_ioctx(ioctx);	/* twice for the list */
 
+	/*
+	 * After this ioctx cannot be looked up because ioctx->dead is set.
+	 * But there could be still io_submit() running...
+	 */
+	synchronize_rcu();
+
+	/*
+	 * ... but once these functions finish, the io has been either
+	 * cancelled (if aio_get_req() has already run) or the ctx->dead
+	 * check in io_submit_one() triggers and no io is submitted.
+	 */
 	aio_cancel_all(ioctx);
 	wait_for_all_aios(ioctx);

@@ -1646,6 +1657,16 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		goto out_put_req;
 
 	spin_lock_irq(&ctx->ctx_lock);
+	/*
+	 * Raced with io_destroy()? See comments in io_destroy() for details.
+	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
+	 * in this fast path...
+	 */
+	if (ctx->dead) {
+		spin_unlock_irq(&ctx->ctx_lock);
+		ret = -EINVAL;
+		goto out_put_req;
+	}
 	aio_run_iocb(req);
 	if (!list_empty(&ctx->run_list)) {
 		/* drain the run list */
--
1.7.1
