Re: [PATCH v3] net: atlantic: always deep reset on pm op, fixing null deref regression

From: Manuel Ullmann
Date: Sat May 07 2022 - 10:00:04 EST


Thorsten Leemhuis <regressions@xxxxxxxxxxxxx> writes:

> On 07.05.22 15:10, Manuel Ullmann wrote:
>> Thorsten Leemhuis <regressions@xxxxxxxxxxxxx> writes:
>>
>>> On 06.05.22 00:09, Manuel Ullmann wrote:
>>>> >From d24052938345d456946be0e9ccc337e24d771c79 Mon Sep 17 00:00:00 2001
>>>> Date: Wed, 4 May 2022 21:30:44 +0200
>>>>
>>>> The impact of this regression is the same for resume that I saw on
>>>> thaw: the kernel hangs and nothing except SysRq rebooting can be done.
>>>>
>>>> The null deref occurs at the same position as on thaw.
>>>> BUG: kernel NULL pointer dereference
>>>> RIP: aq_ring_rx_fill+0xcf/0x210 [atlantic]
>>>>
>>>> Fixes regression in commit cbe6c3a8f8f4 ("net: atlantic: invert deep
>>>> par in pm functions, preventing null derefs"), where I disabled deep
>>>> pm resets in suspend and resume, trying to make sense of the
>>>> atl_resume_common deep parameter in the first place.
>>>>
>>>> It turns out, that atlantic always has to deep reset on pm operations
>>>> and the parameter is useless. Even though I expected that and tested
>>>> resume, I screwed up by kexec-rebooting into an unpatched kernel, thus
>>>> missing the breakage.
>>>>
>>>> This fixup obsoletes the deep parameter of atl_resume_common, but I
>>>> leave the cleanup for the maintainers to post to mainline.
>>>
>>> FWIW, this section starting here and...
>>>
>>>> PS: I'm very sorry for this regression.
>>>>
>>>> Changes in v2:
>>>> Patch formatting fixes
>>>> - Fix Fixes tag
>>>> – Simplify stable Cc tag
>>>> – Fix Signed-off-by tag
>>>>
>>>> Changes in v3:
>>>> – Prefixed commit reference with "commit" aka I managed to use
>>>> checkpatch.pl.
>>>> - Added Tested-by tags for the testing reporters.
>>>> – People start to get annoyed by my patch revision spamming. Should be
>>>> the last one.
>>>
>>> ...ending here needs should be below the "---" line you already have
>>> below. For details see:
>>> https://www.kernel.org/doc/html/latest/process/submitting-patches.html
>
> Sorry, I caused a misunderstanding. I didn't handle the above, I'm not
> one of the net subsystem developers. Either you submit a v4 fixing this
> or hope the net maintainer take care of that when they look at it -- but
> I guess they would highly prefer it if you'd address this.

Never mind. Then I’ll post a v4. Thanks for handling the regzbot
tracking. I indeed just assumed this already to be correctly regzbot
tracked. Never post in a hurry.

>>> BTW, same goes for any "#regzbot" commands (like you had in
>>> cbe6c3a8f8f4), as things otherwise get confused when a patch for example
>>> is posted as part of a stable/longterm -rc review.
>> Good to know. Maybe I could patch the handling-regressions documentation
>> to include this.
>
> Yeah, I have already thought about it, but didn't get down to it yet.

Well, I could try it eventually.

> Only so much hours in a day.

I know that issue. ;)
>> submitting-patches could also link the subsystem
>> specific documentation, e.g. the netdev FAQ, since they handle patches
>> with their more bot tests. Would have helped me a bit. Might be a nice
>> exercise for properly formatted patching ;)
>
> I agree that the docs for submitting patches could need a few
> improvements and that is likely one of them.

Then I’ll try fixing this, too. After all most devs have scarce time for
documentation.

>>> But don't worry, no big deal, I handled that :-D Many thx for actually
>>> directly getting regzbot involved and taking care of this regression!
>> Thank you for the final cleanup and you’re welcome. :) Where did you
>> handle this? I can’t seem to find the fixup anywhere, i.e. net-next,
>> net, linux-next or lkml.
>
> See above, I only handled the regzbot issue, not the issue with this
> patch. Sorry for not being clear enough in my wording.

Thanks for clearing this up.

Regards, Manuel