Re: mapping user space buffer to kernel address space

From: Linus Torvalds (torvalds@transmeta.com)
Date: Mon Oct 16 2000 - 22:11:55 EST


On Tue, 17 Oct 2000, Andrea Arcangeli wrote:
>
> If you won't delete map_user_kiobuf from your tree I think I've just provided a
> real world MM corruption case where the user send the bug report back to us if
> we only increase the reference count of the page to pin it.

Oh. So to fix a bug, you say "either delete the code, or do something else
that is completely idiotic instead"?

Sure, that's sensible. NOT.

Andrea, explain to me how pinning _could_ work? Explain to me how you'd
lock down pages in virtual address space with multiple threads, and how
you'd handle the cases of:

 - two threads doing direct IO from different parts of the same page
 - one thread starting IO from a page, another thread unmapping the range

Basically, you can't handle it sanly, because the notion of virtual
pinning really isn't a sane notion. The first case would need a special
"pinning count". Which is too expensive to be an option, although I've
seen patches that seemed to do something like that - I consider the whole
notion idiotic.

The second case would require that unmap() synchronize completely with the
IO. Which is wasteful, and doesn't make any sense: what's the point, when
you can avoid it by just not pinning?

Neither option is, quite frankly, acceptable.

So we're left with your suggestion to remove direct IO completely.
Something that I wouldn't mind horribly much, but too many people seem to
consider it worth-while - and while I've stubborny fought the direct-IO
patches a long time, every single technical argument I've had has been
successfully addressed over time.

I'm sure this bug will get fixed too. And the fix probably won't end up
even being all that painful - it's probably a question of marking the page
dirty after completing IO into it and making sure the swap-out logic does
the right thing (ie try to write it out again - which is exactly the same
thing that happens right now if a user dirties a page while it's busy
doing write-out).

In fact, the code may do this already, I'll let sct look into the exact
details and fix it.

                Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Oct 23 2000 - 21:00:10 EST