Re: [Letux-kernel] BUG: drivers/pinctrl/core: races in pinctrl_groups and deferred probing

From: H. Nikolaus Schaller
Date: Mon Jun 18 2018 - 06:00:07 EST



> On 18.06.2018 at 11:54, Tony Lindgren <tony@xxxxxxxxxxx> wrote:
>
> * H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> [180618 09:32]:
>>
>>> On 18.06.2018 at 11:14, Tony Lindgren <tony@xxxxxxxxxxx> wrote:
>>>
>>> * Andy Shevchenko <andy.shevchenko@xxxxxxxxx> [180618 08:25]:
>>>> On Sat, Jun 16, 2018 at 2:08 PM H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
>>>>> But it looks as if we still have duplicate assignments by deferred probing, i.e. some cleanup is
>>>>> missing (or is this intended behaviour?).
>>>>
>>>>> But I think the fundamental problem is that the same driver assigns multiple slots if
>>>>> probing is deferred.
>>>>
>>>> Indeed.
>>>>
>>>> I think there is a simple way to clean up pinctrl stuff on failed probe. See
>>>> https://elixir.bootlin.com/linux/v4.18-rc1/source/drivers/base/dd.c#L416
>>>>
>>>> We only bind pins, and do not perform any actions when failure happens later on.
>>>
>>> Yup seems like a good approach. I'll take a look if we can just
>>> check if the function or group name already exists and return
>>> the existing selector in that case.
>>
>> Ok, that would also solve the duplication issue.
>
> Below is an incremental patch to check for existing entries.

Thanks!
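
To make the idea concrete for anyone following along, here is a minimal
userspace sketch of "look up the name first and return the existing
selector". It is not the incremental patch and not the kernel's data
structures; struct group_table, group_find() and group_add() are
hypothetical names, and a flat array stands in for the radix tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 16

/* Hypothetical stand-in for the pinctrl group table (not kernel code). */
struct group_table {
	const char *names[MAX_GROUPS];
	int count;
};

/* Return the selector registered for name, or -1 if there is none. */
int group_find(const struct group_table *t, const char *name)
{
	for (int i = 0; i < t->count; i++)
		if (strcmp(t->names[i], name) == 0)
			return i;
	return -1;
}

/*
 * Register name and return its selector. If the name is already
 * present (for example because an earlier, deferred probe added it),
 * return the existing selector instead of allocating a new slot.
 */
int group_add(struct group_table *t, const char *name)
{
	int sel;

	assert(t != NULL && name != NULL);

	sel = group_find(t, name);
	if (sel >= 0)
		return sel;		/* reuse the existing entry */

	if (t->count >= MAX_GROUPS)
		return -1;		/* table full */

	t->names[t->count] = name;
	return t->count++;
}
```

With this shape, a probe that is retried after a deferral gets the same
selector back on every attempt, so the table does not grow.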

> Care to test again?

Yes, ASAP. It is the most critical bug I currently know of on all our OMAP
devices.

>
> If that works, I'll fold it into the patch series and repost the
> whole series.
>
>> On the other hand we still have a stale entry if the probing process
>> finally fails after several attempts.
>>
>> This may happen if a driver with a valid DT entry is blacklisted in
>> /etc/modprobe.d/blacklist.conf. Then, the kernel will try to modprobe
>> it several times through udev until it gives up. The reason seems to
>> be that the deferred probing thread does not know why the driver did
>> not probe successfully.
>
> Hmm I think this might be fixed then too. Then on pinctrl module
> unbind/unload we should free the radix tree entries if that is
> not yet done. Seems we may only free them on -ENOMEM right now.

Yes, that could do proper cleanup.
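
As a rough illustration of what "free the entries on unbind/unload"
could mean, here is a userspace sketch. It assumes the table owns heap
copies of the names; struct func_table, func_add() and func_free_all()
are hypothetical and do not correspond to actual pinctrl functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FUNCS 16

/* Hypothetical function table that owns heap copies of its names. */
struct func_table {
	char *names[MAX_FUNCS];
	int count;
};

/* Add a copy of name; return its selector, or -1 on failure. */
int func_add(struct func_table *t, const char *name)
{
	size_t len;
	char *copy;

	assert(t != NULL && name != NULL);

	if (t->count >= MAX_FUNCS)
		return -1;

	len = strlen(name) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, name, len);

	t->names[t->count] = copy;
	return t->count++;
}

/*
 * Release every entry, as an unbind/unload path would.
 * Calling it again on an already empty table is harmless.
 */
void func_free_all(struct func_table *t)
{
	for (int i = 0; i < t->count; i++) {
		free(t->names[i]);
		t->names[i] = NULL;
	}
	t->count = 0;
}
```

The point is only that cleanup runs unconditionally when the device
goes away, not just on an allocation failure, so no stale entry
survives a probe that finally fails.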

I think all these issues were introduced by deferred probing, and the
pinctrl code is safe when no probe is deferred. So we should probably
think about backporting the fix to stable. But let's first test it
outside the trees and let it mature in linux-next for a while.

BR and thanks,
Nikolaus