On Fri, 2004-09-03 at 08:52, Badari Pulavarty wrote:
On Tue, 2004-08-31 at 08:18, Andrew Morton wrote:
Begin forwarded message:
Date: Tue, 31 Aug 2004 06:15:18 -0700
From: bugme-daemon@xxxxxxxx
To: bugme-new@xxxxxxxxxxxxxx
Subject: [Bugme-new] [Bug 3317] New: Kernel oops in aio_complete while running AIO application
http://bugme.osdl.org/show_bug.cgi?id=3317
Hi Andrew,
I debugged this some more. Here is what's happening:
The test program used a program text address as the buffer for the READ.
DIO get_user_pages() returned EFAULT. We called finished_one_bio()
as part of dropping the ref. to dio. It called aio_complete().
do_direct_IO() returned EFAULT to the caller. aio_run_iocb() expects
to see EIOCBQUEUED/RETRY, otherwise it calls aio_complete() with the
"ret" value. This is where the second aio_complete() is coming from.
So we clean up "req", and on the next dereference we get an oops.
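
The sequence above can be modeled in user space. This is only a sketch,
not kernel code: the model_* names, the struct, and the value given to
MODEL_EIOCBQUEUED are assumptions made for illustration. It counts
completions instead of freeing anything, so the double call shows up as
a counter of 2 rather than as an oops.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder for the kernel's EIOCBQUEUED; the value is an assumption. */
#define MODEL_EIOCBQUEUED 529

/* Hypothetical stand-in for the request ("req"); it only records calls. */
struct model_req {
    int completions;   /* how many times the completion routine ran */
    long last_result;  /* value passed to the most recent completion */
};

/* Stand-in for aio_complete(). In the kernel the second call operates on
 * a request that has already been cleaned up. */
static void model_aio_complete(struct model_req *req, long res)
{
    assert(req != NULL);
    req->completions++;
    req->last_result = res;
}

/* Stand-in for the direct-IO path when get_user_pages() fails: dropping
 * the dio reference runs finished_one_bio(), which completes the request
 * even though nothing was transferred, and the error is also returned. */
static long model_direct_io(struct model_req *req)
{
    long transferred = 0;                 /* no work was done */
    model_aio_complete(req, transferred); /* first completion */
    return -EFAULT;                       /* error goes back to the caller */
}

/* Stand-in for aio_run_iocb(): any return other than "queued" is
 * completed here with the returned value. */
static void model_aio_run_iocb(struct model_req *req)
{
    long ret = model_direct_io(req);
    if (ret != -MODEL_EIOCBQUEUED)
        model_aio_complete(req, ret);     /* second completion */
}
```

Calling model_aio_run_iocb() on a zeroed struct model_req leaves
completions at 2, which is the double aio_complete() described above.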
The problem here is, finished_one_bio() shouldn't call aio_complete()
since no work has been done. I have a fix for this - can you verify it?
I am not really comfortable with this "tweaking". (I am not really
sure about IO errors like EIO etc. - whether they can also lead to
calling aio_complete() twice.)
Fix is to call aio_complete() ONLY if there is something to report.
Note that we don't update dio->result with any error codes from
get_user_pages(); they are just passed back as the "ret" value from
do_direct_IO().
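
The idea of the fix can be sketched the same way in user space. The
patch itself is not reproduced here, so the guard condition below
(treating a nonzero dio->result as "something to report") is an
assumption, as are the fixed_* names, the struct, and the value given
to FIXED_EIOCBQUEUED.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder for the kernel's EIOCBQUEUED; the value is an assumption. */
#define FIXED_EIOCBQUEUED 529

/* Hypothetical stand-in for the dio plus a completion counter. */
struct fixed_dio {
    long result;      /* bytes transferred; models dio->result */
    int completions;  /* how many times the completion routine ran */
};

/* Stand-in for aio_complete(). */
static void fixed_aio_complete(struct fixed_dio *dio)
{
    assert(dio != NULL);
    dio->completions++;
}

/* Modeled finished_one_bio() with the guard: complete only when there is
 * something to report (assumed here to mean a nonzero result). */
static void fixed_finished_one_bio(struct fixed_dio *dio)
{
    if (dio->result != 0)
        fixed_aio_complete(dio);
}

/* Modeled direct-IO path. On a get_user_pages() error, result is left
 * untouched and the error travels only as the return value. */
static long fixed_direct_io(struct fixed_dio *dio, int gup_err)
{
    if (gup_err) {
        fixed_finished_one_bio(dio);   /* guard skips the completion */
        return gup_err;
    }
    dio->result = 4096;                /* pretend one page was read */
    fixed_finished_one_bio(dio);       /* the IO path completes it */
    return -FIXED_EIOCBQUEUED;
}

/* Modeled aio_run_iocb(): completes on any return other than "queued". */
static void fixed_aio_run_iocb(struct fixed_dio *dio, int gup_err)
{
    long ret = fixed_direct_io(dio, gup_err);
    if (ret != -FIXED_EIOCBQUEUED)
        fixed_aio_complete(dio);
}
```

In this model both the EFAULT case and the successful case end with
exactly one completion. It does not cover errors such as EIO that
arrive after some IO has been submitted.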
Thanks,
Badari
Badari,
This does fix the problem when running on my system (ext3).
One question: finished_one_bio() is called in 3 places.
Are you sure the other places won't be harmed by this
change?
I'm also looking over the code and will let you know if
I see any problems.
Daniel