Re: [PATCHv2] exec: Fix a deadlock in ptrace

From: Bernd Edlinger
Date: Mon Mar 02 2020 - 17:00:50 EST


On 3/2/20 10:49 PM, Eric W. Biederman wrote:
> Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> writes:
>
>> On 3/2/20 5:17 PM, Eric W. Biederman wrote:
>>> Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> writes:
>>>
>>>> On 3/2/20 4:57 PM, Eric W. Biederman wrote:
>>>>> Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> writes:
>>>>>
>>>>>>
>>>>>> I tried this with s/EACCESS/EACCES/.
>>>>>>
>>>>>> The test case in this patch is not fixed, but strace does not freeze,
>>>>>> at least with my setup where it did freeze repeatably.
>>>>>
>>>>> Thanks, that is what I was aiming at.
>>>>>
>>>>> So we have one method we can pursue to fix this in practice.
>>>>>
>>>>>> That is
>>>>>> obviously because it bypasses the cred_guard_mutex. But all other
>>>>>> processes that access this file still freeze, and cannot be
>>>>>> interrupted except with kill -9.
>>>>>>
>>>>>> However, that smells like a denial of service: this
>>>>>> simple test case, which can be executed by a guest, creates a /proc/$pid/mem
>>>>>> that freezes any process, even root, when it looks at it.
>>>>>> I mean: "ln -s README /proc/$pid/mem" would be a nice bomb.
>>>>>
>>>>> Yes. The test case in your patch is a variant of the original
>>>>> problem.
>>>>>
>>>>>
>>>>> I have been staring at this trying to understand the fundamentals of the
>>>>> original deeper problem.
>>>>>
>>>>> The current scope of cred_guard_mutex in exec is because being ptraced
>>>>> causes suid exec to act differently. So we need to know early if we are
>>>>> ptraced.
>>>>>
>>>>
>>>> It has a second use, that it prevents two threads entering execve,
>>>> which would probably result in disaster.
>>>
>>> Exec can fail with an error code up until de_thread. de_thread causes
>>> exec to fail with the error code -EAGAIN for the second thread to get
>>> into de_thread.
>>>
>>> So no. The cred_guard_mutex is not needed for that case at all.
>>>
>>
>> Okay, but that will reset current->in_execve, right?
>
> Absolutely.
>
> The error handling kicks in and exec_binprm fails with a negative
> return code. Then __do_execve_file cleans up and clears
> current->in_execve.
>

Yes, of course. I was under the mistaken impression that this value is
a kind of global, but it is per-thread.
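
To spell out how I understand it now, here is a minimal user-space
model of the behaviour described above. It is only a sketch, not kernel
code: struct model_task, struct model_signal, model_de_thread() and
model_execve() are names made up for illustration, and the real
de_thread() does far more than this. The only points it tries to show
are that the second thread to reach de_thread gets -EAGAIN, and that
the error path clears in_execve of the failing thread only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_task;

/* Hypothetical stand-in for the state shared by a thread group. */
struct model_signal {
	struct model_task *group_exit_task; /* thread that got into de_thread first */
};

/* Hypothetical stand-in for per-thread state. */
struct model_task {
	struct model_signal *signal; /* shared by all threads of the group */
	bool in_execve;              /* per-thread flag, not a global */
};

/* The first caller wins; any later caller in the same group gets -EAGAIN. */
static int model_de_thread(struct model_task *tsk)
{
	if (tsk->signal->group_exit_task != NULL)
		return -EAGAIN;
	tsk->signal->group_exit_task = tsk;
	return 0;
}

/* Models only the part of exec up to and including de_thread. */
static int model_execve(struct model_task *tsk)
{
	int ret;

	tsk->in_execve = true;
	ret = model_de_thread(tsk);
	if (ret < 0) {
		/* Error path: clean up this thread's own flag only. */
		tsk->in_execve = false;
		return ret;
	}
	/* The model stops here, mid-exec, so in_execve stays set. */
	return 0;
}
```

With two model_task instances sharing one model_signal, the first
model_execve() call succeeds and the second returns -EAGAIN; the
second task's in_execve is cleared while the first task's stays set.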

So I think I need a new boolean; see v3 of the patch, and soon v4 (with
just one comment fixed).

I'm currently running the strace v5.5 test suite, and every test
has passed so far. I'll also look at the gdb test suite before I send the
next version.


Thanks
Bernd.