Re: Use of copy_from_user in msm_gem_submit.c while holding a spin_lock

From: Rob Clark
Date: Wed Aug 17 2016 - 14:49:40 EST


On Wed, Aug 17, 2016 at 1:08 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Aug 17, 2016 at 11:08:46AM -0400, Rob Clark wrote:
>> On Wed, Aug 17, 2016 at 7:40 AM, Vaishali Thakkar
>> <vaishali.thakkar@xxxxxxxxxx> wrote:
>> > Hello,
>> >
>> > I was wondering about the call to copy_from_user in function submit_lookup_objects for drive
>> > /gpu/drm/msm/msm_gem_submit.c It calls copy_from_user[1] in a spin_lock, which is not normally
>> > allowed, due to the possibility of a deadlock.
>> >
>> > Is there some reason that I am overlooking why it is OK in this case? Is there some code in the
>> > same file which ensures that page fault will not occur when we are calling the function holding
>> > spin_lock?
>>
>> hmm, probably just that it isn't typical to use a swap file on these
>> devices (and that lockdep/etc doesn't warn about it).. I guess we
>> probably need some sort of slow-path where we drop the lock and try
>> again in case there would be a fault..
>
> Sigh... Folks, you don't need swap *at* *all* for copy_from_user() to block.
> /* get a zero-filled 64K buffer */
> addr = mmap(NULL, 65536, PROT_READ | PROT_WRITE,
> MAP_ANONYMOUS | MAP_SHARED, -1, 0);
> if (addr < 0)
> piss off
> buffer = (void *)addr;
> ....
> pass buf to a syscall


Sure, I know that.. but if you pass random garbage cmstream to the
gpu, it will crash (the gpu) too and/or result in corrupt rendering on
screen, etc. GPU submit APIs don't exist for random end users, they
exist for one user that knows what it is doing (ie. mesa).

I'm not saying that I shouldn't fix it (although not quite sure how
yet.. taking/dropping the spinlock inside the loop is not a good
option from a performance standpoint). What I am saying is that this
is not something that can happen accidentally (as it could in the case
of swap). But I agree that I should fix it somehow to avoid issues
with an intentionally evil userspace.

If there is a copy_from_user() variant that will return an error
instead of blocking, I think that is really what I want so I can
implement a slow-path that drops the spin-lock temporarily.

BR,
-R


> and copy_from_user() in that syscall will have to allocate pages (and possibly
> page tables as well). Which can block just fine, no swap involved. Moreover,
> if you modify some parts of the buffer first, you will get the pages containing
> those modifications already present, but anything still untouched will
> a) act as if it had been zeroed first and
> b) possibly block on the first dereference, be it from kernel or from
> userland. Worse yet, there's nothing to stop libc from using the above for
> calloc() and its ilk, with your application having no way to tell. As far
> as application is concerned, it has asked a library function to allocate and
> zero a piece of memory, got one and yes, it does appear to be properly zeroed.
>
> The bottom line is, copy_from_user() can realistically block, without
> anything fishy going on in the userland setup.