Re: linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected)

From: Tetsuo Handa
Date: Thu Jul 02 2020 - 20:54:10 EST


On 2020/07/03 4:46, Luis Chamberlain wrote:
> On Thu, Jul 02, 2020 at 01:26:53PM +0900, Tetsuo Handa wrote:
>> On 2020/07/02 0:38, Luis Chamberlain wrote:
>>> @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>>> */
>>> if (KWIFEXITED(ret))
>>> sub_info->retval = KWEXITSTATUS(ret);
>>> + /*
>>> + * Do we really want to be passing the signal, or do we pass
>>> + * a single error code for all cases?
>>> + */
>>> + else if (KWIFSIGNALED(ret))
>>> + sub_info->retval = KWTERMSIG(ret);
>>
>> No, this is bad. Caller of usermode helper is unable to distinguish exit(9)
>> and e.g. SIGKILL'ed by the OOM-killer.
>
> Right, the question is: do we care?

Yes, we have to care.

> And the umh patch "umh: fix processed error when UMH_WAIT_PROC is used"
> changed this to:
>
> - if (ret >= 0) {
> + if (ret != 0) {
>
> Prior to the patch negative return values from userspace were still
> being captured, and likewise signals, but the error value was not
> raw, not the actual value. After the patch, since we check for ret != 0
> we still upkeep the sanity check for any error, correct the error value,
> but as you noted signals were ignored as I made the wrong assumption
> we would ignore them. The umh sub_info->retval is set after my original
> patch only if KWIFSIGNALED(ret)), and ignored signals, and so that
> would be now capitured with the additional KWIFSIGNALED(ret)) check.

"call_usermodehelper_keys() == 0" (i.e. usermode helper was successfully
started and successfully terminated via exit(0)) is different from "there is
nothing to do". call_sbin_request_key() == 0 case still has to check for
possibility of -ENOKEY case.

>
> The question still stands:
>
> Do we want to open code all these checks or simply wrap them up in
> the umh. If we do the later, as you note exit(9) and a SIGKILL will
> be the same to the inspector in the kernel. But do we care?

Yes, we do care.

>
> Do we really want umh callers differntiatin between signals and exit values?

Yes, we do.

>
> The alternative to making a compromise is using generic wrappers for
> things which make sense and letting the callers use those.

I suggest just introducing KWIFEXITED()/KWEXITSTATUS()/KWIFSIGNALED()/KWTERMSIG()
macros and fixing the callers, for some callers are not aware of possibility of
KWIFSIGNALED() case.

For example, conn_try_outdate_peer() in drivers/block/drbd/drbd_nl.c misbehaves if
drbd_usermode_helper process was terminated by a signal, for the switch() statement
after returning from conn_helper() is assuming that the return value of conn_helper()
is a KWEXITSTATUS() value if drbd_usermode_helper process was successfully started.
If drbd_usermode_helper process was terminated by SIGQUIT (which is 3),
conn_try_outdate_peer() will by error hit "case P_INCONSISTENT:" (which is 3);
conn_try_outdate_peer() should hit "default: /* The script is broken ... */"
unless KWIFEXITED() == true.

Your patch is trying to obnubilate the return code.