Re: INFO: task hung in fuse_reverse_inval_entry

From: Dmitry Vyukov
Date: Fri Nov 02 2018 - 15:31:32 EST


On Thu, Jul 26, 2018 at 11:12 AM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> On Thu, Jul 26, 2018 at 10:44 AM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>> On Wed, Jul 25, 2018 at 11:12 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>> On Tue, Jul 24, 2018 at 5:17 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
>>> Maybe more waits in fuse need to be interruptible? E.g. request_wait_answer?
>>
>> That's an interesting aspect. Making request_wait_answer always be
>> killable would help with the issue you raise (killing the set of
>> processes taking part in the deadlock should resolve the deadlock),
>> but it breaks another aspect of the interface.
>>
>> Namely that userspace filesystems expect some serialization from
>> kernel when performing operations. If we allow killing of a process
>> in the middle of an fs operation, then that serialization is no longer
>> there, which can break the server.
>>
>> One solution to that is to duplicate all locking in the server
>> (libfuse normally), but it would not solve the issue for legacy
>> libfuse or legacy non-libfuse servers. It would also be difficult to
>> test. Also it doesn't solve the problem of killing the server, as
>> that alone doesn't resolve the deadlock.
>
> Umm, we can actually do better. Duplicate all vfs locking in the
> fuse kernel implementation: when killing a task that has an
> outstanding request, return immediately (which results in releasing
> the VFS level lock and hence the deadlock) but hold onto our own lock
> until the reply from the userspace server comes back.
>
> Need to think about the details; this might not be easy to do
> properly. Notably memory management locks (page->lock, mmap_sem,
> etc) are notoriously tricky.

Hi Miklos,

Any updates on this?

syzbot recently found this hang in fuse, which looks real (totally unkillable):
https://syzkaller.appspot.com/bug?id=0d08132d6dac82ae63b7b8d4a9d027d30b46167d

but this one still happens, and it's hard to tell if it's real or not:
https://syzkaller.appspot.com/bug?id=76f8203fef423375d230f14b8f5b45617ab945e2
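
For reference, the idea quoted above (duplicate the VFS locking inside
fuse, let a killed task drop the VFS-level lock right away, but keep a
fuse-private lock held until the server replies) could be modelled
roughly like this. This is only a user-space sketch for discussion, not
kernel code and not a proposed patch: every name in it (model_lock,
model_inode, model_req, model_op_begin, model_op_killed, model_op_reply)
is hypothetical, the locks are plain flags, and it ignores the memory
management locks that were called out as the hard part.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_lock {
	bool held;
};

struct model_inode {
	struct model_lock vfs_lock;  /* stands in for the VFS-level lock */
	struct model_lock fuse_lock; /* assumed fuse-private duplicate of it */
};

struct model_req {
	struct model_inode *inode;
	bool answered;  /* the userspace server has replied */
	bool abandoned; /* the calling task was killed while waiting */
};

static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Begin an operation: take the VFS lock, then the fuse-private lock. */
static bool model_op_begin(struct model_req *req, struct model_inode *inode)
{
	if (!model_trylock(&inode->vfs_lock))
		return false;
	if (!model_trylock(&inode->fuse_lock)) {
		/* An abandoned request is still outstanding at the server. */
		model_unlock(&inode->vfs_lock);
		return false;
	}
	req->inode = inode;
	req->answered = false;
	req->abandoned = false;
	return true;
}

/* The caller is killed: release the VFS lock now, keep the fuse lock. */
static void model_op_killed(struct model_req *req)
{
	req->abandoned = true;
	model_unlock(&req->inode->vfs_lock);
}

/*
 * The server's reply arrives: release the fuse-private lock. If the caller
 * is still alive, this model also drops the VFS lock on its behalf.
 */
static void model_op_reply(struct model_req *req)
{
	req->answered = true;
	model_unlock(&req->inode->fuse_lock);
	if (!req->abandoned)
		model_unlock(&req->inode->vfs_lock);
}
```

In this model a killed request no longer pins the VFS lock, so other
tasks blocked on it can make progress, while a second operation on the
same inode still cannot reach the server until the first reply comes
back, which is the serialization the server expects.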