Re: [dm-devel] clone() with CLONE_NEWNET breaks kobject_uevent_env()

From: Eric W. Biederman
Date: Fri Aug 19 2011 - 14:39:39 EST


Milan Broz <mbroz@xxxxxxxxxx> writes:

> On 08/19/2011 01:43 PM, Eric W. Biederman wrote:
>> Milan Broz <mbroz@xxxxxxxxxx> writes:
>>
>>> On 08/19/2011 11:13 AM, Eric W. Biederman wrote:
>>>> Milan Broz <mbroz@xxxxxxxxxx> writes:
>>>>
>>>> I think the proper fix is to remove the error return from
>>>> kobject_uevent_env and kobject_uevent, and make it harder to get calling
>>>> of this function wrong. Possibly in conjunction with that tag all of
>>>> the memory allocations of kobject_uevent_env with GFP_NOFAIL or
>>>> something so the memory allocator knows that this path is totally
>>>> not able to deal with failure.
>>>>
>>>> Is kobject_uevent_env anything except an asynchronous best effort
>>>> notification to user-space that a device has come or gone?
>>>
>>> Unfortunately it is for device-mapper. libdevmapper
>>> depends on information that uevent was sent because udev rules uses
>>> semaphore to inform that some action was taken.
>>> So if dm-ioctl returns flag that uevent was not sent, it fallback
>>> to different error path (otherwise it waits for completion forever).
>>> (TBH I am more and more convinced this was not quite clever concept.)
>>
>> If I understand your description and the code right the guarantee that
>> you need is that kobject_uevent will return success only if it has
>> queued a packet in every listening netlink socket.
>
> I think so. IOW success == event was sent to all active listeners.
>
>> We already ignore ENOBUFS so the guarantee you appear to need in
>> libdevmapper does not appear to be present in kobject_uevent.
>>
>> Does the libdevmapper code work despite getting a spurious failure?
>
> BTW I do not see ENOBUFS but ESRCH (from netlink_broadcast_filtered).
>
> If spurious failure is that event is sent (even partially) but it reports
> failure, it is the exact situation I see now - libdevmapper will try
> to decrement system semaphore which is already removed from udev rules.
>
> Final state is correct, just it prints ugly warnings. IOW it recovers
> from this situation correctly.

Then I guess this is fixable in kobject_uevent_env. I'm not certain
it is smart to support this case but it appears supportable.

> But Kay's suggestion to use netlink_has_listeners() seems like good
> idea. IOW if there is no listener, it should skip quietly and not
> fail the whole call...

In the case of ESRCH I completely agree.

We are currently ignoring errors in the semantically more interesting
case when netlink_broadcast does not deliver the packet to one of the
listening netlink sockets.

How does this patch look?
---

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index 70af0a7..7da5ef3 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -139,6 +139,7 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
u64 seq;
int i = 0;
int retval = 0;
+ bool delivery_failed;
#ifdef CONFIG_NET
struct uevent_sock *ue_sk;
#endif
@@ -251,6 +252,7 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
if (retval)
goto exit;

+ delivery_failure = false;
#if defined(CONFIG_NET)
/* send netlink message */
mutex_lock(&uevent_sock_mutex);
@@ -281,14 +283,15 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
0, 1, GFP_KERNEL,
kobj_bcast_filter,
kobj);
- /* ENOBUFS should be handled in userspace */
- if (retval == -ENOBUFS)
- retval = 0;
+ if (retval && (retval != -ESRCH))
+ delivery_failure = true;
} else
- retval = -ENOMEM;
+ delivery_failure = true;
}
mutex_unlock(&uevent_sock_mutex);
#endif
+ if (delivery_failure)
+ retval = -ENOBUFS;

/* call uevent_helper, usually only enabled during early boot */
if (uevent_helper[0] && !kobj_usermode_filter(kobj)) {

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/