Re: [PATCH 3/3] proc: simplify proc_task_readdir/first_tid paths

From: Eric W. Biederman
Date: Fri May 31 2013 - 14:13:01 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> Eric, sorry for delay.
>
> On 05/29, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@xxxxxxxxxx> writes:
>>
>> > Why the empty "." + ".." dir is bad if the task(s) has gone away after
>> > opendir?
>>
>> Because the definition of a deleted directory that you are in is that
>> getdents will return -ENOENT.
>>
>> You can reproduce this with any linux filesystem.
>> mkdir foo
>> cd foo
>> rmdir ../foo
>> strace -f ls .
>>
>> open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
>> getdents(3, 0x1851c88, 32768) = -1 ENOENT (No such file or directory)
>> close(3) = 0
>
> Heh. Indeed, vfs_readdir() checks IS_DEADDIR().
>
> Thanks.
>
> OK. But this means that even 1/3 is not 100% right, exactly because
> leader can be unhashed right before first_tid() takes rcu lock. Easy
> to fix, we should simply factor out the "nr != 0" check.
>
> And this also means that 3/3 is not right by the same reason. I'll
> make a simpler patch which only avoids the unnecessary get/put in
> proc_task_readdir().
>
> Unless we can tolerate this very unlikely race when the leader goes
> away after initial ENOENT check at the start, of course... Or unless
> we add canceldir() which resets getdents_callback->previous so that
> we could return ENOENT after filldir() was already called ;)

A small race is fine and is fundamental to the process of readdir.

The guarantee of open+readdir+close is that all directory entries that
existed before open and still exist after close are returned. Directory
entries that are added or removed during the open+readdir+close are
returned at most once.
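
As a rough userspace illustration of that guarantee (a sketch, not
anything from the patch series): the helper below scans a scratch
directory with a deliberately tiny getdents64 buffer and changes the
directory after the first batch. The file names "stable", "removed" and
"added", the /tmp path, and the function names are all made up for the
example. I would expect "stable" to be counted exactly once and the
other two at most once.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout written by getdents64(2); declared locally for the sketch. */
struct raw_dirent {
    unsigned long long d_ino;
    long long d_off;
    unsigned short d_reclen;
    unsigned char d_type;
    char d_name[];
};

/* How many times each (hypothetical) name was returned by the scan. */
struct scan_counts {
    int stable;
    int removed;
    int added;
};

/* Create an empty file relative to the directory fd. */
static int touch_at(int dfd, const char *name)
{
    int fd = openat(dfd, name, O_CREAT | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Returns 0 if the scan reached end of directory, -1 on any failure. */
int scan_with_concurrent_changes(struct scan_counts *out)
{
    char path[] = "/tmp/scan-XXXXXX";
    _Alignas(8) char buf[64];   /* small on purpose: forces several calls */
    int dfd, scan, first = 1;
    long n = -1;

    memset(out, 0, sizeof(*out));
    if (mkdtemp(path) == NULL)
        return -1;
    dfd = open(path, O_RDONLY | O_DIRECTORY);
    if (dfd < 0) {
        rmdir(path);
        return -1;
    }
    /* Both files exist before the scanning descriptor is opened. */
    touch_at(dfd, "stable");
    touch_at(dfd, "removed");
    scan = openat(dfd, ".", O_RDONLY | O_DIRECTORY);

    while (scan >= 0 &&
           (n = syscall(SYS_getdents64, scan, buf, sizeof(buf))) > 0) {
        for (long off = 0; off < n; ) {
            struct raw_dirent *d = (struct raw_dirent *)(buf + off);

            out->stable += strcmp(d->d_name, "stable") == 0;
            out->removed += strcmp(d->d_name, "removed") == 0;
            out->added += strcmp(d->d_name, "added") == 0;
            off += d->d_reclen;
        }
        if (first) {
            /* Change the directory in the middle of the scan. */
            unlinkat(dfd, "removed", 0);
            touch_at(dfd, "added");
            first = 0;
        }
    }

    /* Best-effort cleanup of the scratch directory. */
    unlinkat(dfd, "stable", 0);
    unlinkat(dfd, "removed", 0);
    unlinkat(dfd, "added", 0);
    if (scan >= 0)
        close(scan);
    close(dfd);
    rmdir(path);
    return n == 0 ? 0 : -1;
}
```

Whether "removed" or "added" shows up at all depends on the filesystem
and on where the scan happened to be when the change landed; the only
thing the sketch relies on is that neither is reported twice.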

The important case to handle is when someone opened the directory a
very long time ago or has chdir'd to the directory, with the result that
the directory was removed entirely before we start the readdir process.
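
A minimal sketch of that case from userspace, assuming Linux and a
writable /tmp (the helper name and the path template are invented for
the example): open a directory, remove it, and only then ask for its
entries. It calls getdents64 through syscall() rather than readdir(),
because glibc's readdir() treats ENOENT from getdents as an ordinary end
of directory and would hide the error.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the errno reported by getdents64 on a
 * directory that was removed after it was opened, 0 if the call did
 * not fail, or -1 if the setup itself failed.
 */
int deleted_dir_getdents_errno(void)
{
    char path[] = "/tmp/deaddir-XXXXXX";
    char buf[4096];
    long n;
    int fd, err;

    if (mkdtemp(path) == NULL)
        return -1;
    fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0) {
        rmdir(path);
        return -1;
    }
    /* The directory goes away while we still hold it open. */
    if (rmdir(path) != 0) {
        close(fd);
        return -1;
    }
    /* First read of a dead directory: expected to fail, not return "." */
    n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
    err = (n < 0) ? errno : 0;
    close(fd);
    return err;
}
```

On the filesystems I would expect this to be run on, the return value
should be ENOENT, matching the strace output quoted above.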

If the tasks die in the narrow window while we are inside readdir,
races are impossible to avoid.

Eric
