Re: [PATCH v4] livepatch: introduce shadow variable API

From: Joe Lawrence
Date: Thu Aug 17 2017 - 12:01:42 EST


On 08/17/2017 10:05 AM, Petr Mladek wrote:
> On Mon 2017-08-14 16:02:43, Joe Lawrence wrote:
>> Add exported API for livepatch modules:
>>
>> klp_shadow_get()
>> klp_shadow_attach()
>> klp_shadow_get_or_attach()
>> klp_shadow_update_or_attach()
>> klp_shadow_detach()
>> klp_shadow_detach_all()
>>
>> that implement "shadow" variables, which allow callers to associate new
>> shadow fields to existing data structures. This is intended to be used
>> by livepatch modules seeking to emulate additions to data structure
>> definitions.
>>
>> See Documentation/livepatch/shadow-vars.txt for a summary of the new
>> shadow variable API, including a few common use cases.
>>
>> See samples/livepatch/livepatch-shadow-* for example modules that
>> demonstrate shadow variables.
>>
>> diff --git a/kernel/livepatch/shadow.c b/kernel/livepatch/shadow.c
>> new file mode 100644
>> index 000000000000..0ebd4b635e4f
>> --- /dev/null
>> +++ b/kernel/livepatch/shadow.c
>> +/**
>> + * klp_shadow_match() - verify a shadow variable matches given <obj, id>
>> + * @shadow: shadow variable to match
>> + * @obj: pointer to parent object
>> + * @id: data identifier
>> + *
>> + * Return: true if the shadow variable matches.
>> + *
>> + * Callers should hold the klp_shadow_lock.
>> + */
>> +static inline bool klp_shadow_match(struct klp_shadow *shadow, void *obj,
>> + unsigned long id)
>> +{
>> + return shadow->obj == obj && shadow->id == id;
>> +}
>
> Do we really need this function? It is called only in situations
> where shadow->obj == obj is always true. Especially the use in
> klp_shadow_detach_all() is funny because we pass shadow->obj as
> the shadow parameter.

Personal preference. Abstracting out all of the routines that operated
on the shadow variables (setting up, comparison) did save some code
lines and centralized these common bits.

>> +
>> +/**
>> + * klp_shadow_get() - retrieve a shadow variable data pointer
>> + * @obj: pointer to parent object
>> + * @id: data identifier
>> + *
>> + * Return: the shadow variable data element, NULL on failure.
>> + */
>> +void *klp_shadow_get(void *obj, unsigned long id)
>> +{
>> + struct klp_shadow *shadow;
>> +
>> + rcu_read_lock();
>> +
>> + hash_for_each_possible_rcu(klp_shadow_hash, shadow, node,
>> + (unsigned long)obj) {
>> +
>> + if (klp_shadow_match(shadow, obj, id)) {
>> + rcu_read_unlock();
>> + return shadow->data;
>> + }
>> + }
>> +
>> + rcu_read_unlock();
>> +
>> + return NULL;
>> +}
>> +EXPORT_SYMBOL_GPL(klp_shadow_get);
>> +
>> +/*
>> + * klp_shadow_set() - initialize a shadow variable
>> + * @shadow: shadow variable to initialize
>> + * @obj: pointer to parent object
>> + * @id: data identifier
>> + * @data: pointer to data to attach to parent
>> + * @size: size of attached data
>> + *
>> + * Callers should hold the klp_shadow_lock.
>> + */
>> +static inline void klp_shadow_set(struct klp_shadow *shadow, void *obj,
>> + unsigned long id, void *data, size_t size)
>> +{
>> + shadow->obj = obj;
>> + shadow->id = id;
>> +
>> + if (data)
>> + memcpy(shadow->data, data, size);
>> +}
>
> The function name suggests that it is a counterpart of
> klp_shadow_get() but it is not. Which is a bit confusing.
I should have called it "setup" or "init", but perhaps that's moot ...

> Hmm, the purpose of this function is to reduce the size of cut&pasted
> code between all that klp_shadow_*attach() variants. But there
> is still too much cut&pasted code. In fact, the base logic of all
> variants is basically the same. The only small difference should be
> how they handle the situation when the variable is already there.

... this is true. An earlier draft revision that I had discarded
attempted combining all cases. I had used two extra function arguments,
"update" and "duplicates", to key off for each behavior... it turned
into a complicated, full-screen page of conditional logic, so I threw it
out.

However, I like the way you pulled it off using a jump-to-switch
statement at the bottom of the function...

> OK, there is a locking difference in the update variant but
> it is questionable, see below.
>
> I would suggest to do something like this:
>
> static enum klp_shadow_attach_existing_handling {
> KLP_SHADOW_EXISTING_RETURN,
> KLP_SHADOW_EXISTING_WARN,
> KLP_SHADOW_EXISING_UPDATE,
> };
>
> void *__klp_shadow_get_or_attach(void *obj, unsigned long id, void *data,
> size_t size, gfp_t gfp_flags,
> enum klp_shadow_attach_existing_handling existing_handling)
> {
> struct klp_shadow *new_shadow;
> void *shadow_data;
> unsigned long flags;
>
> /* Check if the shadow variable if <obj, id> already exists */
> shadow_data = klp_shadow_get(obj, id);
> if (shadow_data)
> goto exists;
>
> /* Allocate a new shadow variable for use inside the lock below */
> new_shadow = kzalloc(size + sizeof(*new_shadow), gfp_flags);
> if (!new_shadow) {
> pr_error("failed to allocate shadow variable <0x%p, %ul>\n",
> obj, id);
> return NULL;
> }
>
> new_shadow->obj = obj;
> new_shadow->id = id;
>
> /* initialize the shadow variable if if data provided */
> if (data)
> memcpy(new_shadow->data, data, size);
>
> /* Look for <obj, id> again under the lock */
> spin_lock_irqsave(&klp_shadow_lock, flags);
> shadow_data = klp_shadow_get(obj, id);
> if (unlikely(shadow_data)) {
> /*
> * Shadow variable was found, throw away speculative
> * allocation and update/return the existing one.
> */
> spin_unlock_irqrestore(&klp_shadow_lock, flags);
> kfree(new_shadow);
> goto exists;
> }
>
> /* No <obj, id> found, so attach the newly allocated one */
> hash_add_rcu(klp_shadow_hash, &new_shadow->node,
> (unsigned long)new_shadow->obj);
> spin_unlock_irqrestore(&klp_shadow_lock, flags);
>
> return new_shadow->data;
>
> exists:
> switch(existing_handling) {
> case KLP_SHADOW_EXISTING_RETURN:
> break;
>
> case KLP_SHADOW_EXISTING_WARN:
> WARN(1, "Duplicate shadow variable <%p, %lx>\n", obj, id);
> shadow_data = NULL;
> break;
>
> case KLP_SHADOW_EXISING_UPDATE:
> if (data)
> memcpy(shadow_data, data, size);
> break;
> }
> return shadow_data;
> }
>
> void *klp_shadow_attach(void *obj, unsigned long id, void *data,
> size_t size, gfp_t gfp_flags)
> {
> return __klp_shadow_get_or_attach(obj, id, data, size,
> gfp_flags, KLP_SHADOW_EXISTING_RETURN);
> }
>
> void *klp_shadow_get_or_attach(void *obj, unsigned long id, void *data,
> size_t size, gfp_t gfp_flags)
> {
> return __klp_shadow_get_or_attach(obj, id, data, size,
> gfp_flags, KLP_SHADOW_EXISTING_WARN);
> }
>
> void *klp_shadow_update_or_attach(void *obj, unsigned long id, void *data,
> size_t size, gfp_t gfp_flags)
> {
> return __klp_shadow_get_or_attach(obj, id, data, size,
> gfp_flags, KLP_SHADOW_EXISTING_UPDATE);
> }
>
> Note that the above code is not even compile tested.
>
> It removes a lot of cut&pasted code and the difference
> is more clear.
>
> It updates the data outside klp_shadow_lock. But it should be fine.
> The lock is there to keep the hast table consistent (avoid
> duplicates).
>
> Users are responsible to synchronize the data
> stored in the variables their own way.

^^^ this is probably the best argument to ditch
klp_shadow_update_or_attach().

>
> Hmm, the more I think about it the more I would suggest to remove
> klp_shadow_update_or_attach() variant. It has very weird logic.
> IMHO, it is not obvious that it must be called under the locks
> that synchronize the shadow data. IMHO, the other two variants
> are much more clear and enough. Or do you have a good real life
> example for it?

>From the API caller's point of view, the intent was: "I want a shadow
variable and it needs to have this current value."

klp_shadow_get_or_attach() doesn't pass back information about the
underlying shadow variable, ie, was an existing one found, or did it
allocate a new one. In the former case, the data will be stale.

I don't have a real-world example for such use-case, so I'm willing to
drop the routine if it simplifies the code. I'd be curious to see how
you might solve this situation using klp_shadow_get() and/or
klp_shadow_get_or_attach().

>> +/**
>> + * klp_shadow_add() - add a shadow variable to the hashtable
>> + * @shadow: shadow variable to add
>> + *
>> + * Callers should hold the klp_shadow_lock.
>> + */
>> +static inline void klp_shadow_add(struct klp_shadow *shadow)
>> +{
>> + hash_add_rcu(klp_shadow_hash, &shadow->node,
>> + (unsigned long)shadow->obj);
>> +}
>> +
>> +/**
>> + * klp_shadow_attach() - allocate and add a new shadow variable
>> + * @obj: pointer to parent object
>> + * @id: data identifier
>> + * @data: pointer to data to attach to parent
>> + * @size: size of attached data
>> + * @gfp_flags: GFP mask for allocation
>> + *
>> + * If an existing <obj, id> shadow variable can be found, this routine
>> + * will issue a WARN, exit early and return NULL.
>> + *
>> + * Allocates @size bytes for new shadow variable data using @gfp_flags
>> + * and copies @size bytes from @data into the new shadow variable's own
>> + * data space. If @data is NULL, @size bytes are still allocated, but
>> + * no copy is performed. The new shadow variable is then added to the
>> + * global hashtable.
>
> I would swap the two paragraphs. We should describe the main function
> first and corner cases later.

Okay, that makes sense.

>> + * Return: the shadow variable data element, NULL on duplicate or
>> + * failure.
>> + */
>> +void *klp_shadow_attach(void *obj, unsigned long id, void *data,
>> + size_t size, gfp_t gfp_flags)
>> +{
>> + struct klp_shadow *new_shadow;
>> + void *shadow_data;
>> + unsigned long flags;
>> +
>> + /* Take error exit path if <obj, id> already exists */
>> + if (unlikely(klp_shadow_get(obj, id)))
>> + goto err_exists;
>> +
>> + /* Allocate a new shadow variable for use inside the lock below */
>> + new_shadow = kzalloc(size + sizeof(*new_shadow), gfp_flags);
>
> We should print an error message when the memory cannot be allocated.
> Otherwise we will return NULL without explanation. It will be
> especially helpful when a caller forgets to check for NULL.

That's a good idea. Silent failure could be confusing.

>> + if (!new_shadow)
>> + goto err;
>> + klp_shadow_set(new_shadow, obj, id, data, size);
>> +
>> + /* Look for <obj, id> again under the lock */
>> + spin_lock_irqsave(&klp_shadow_lock, flags);
>> + shadow_data = klp_shadow_get(obj, id);
>> + if (unlikely(shadow_data)) {
>> + /*
>> + * Shadow variable was found, throw away speculative
>> + * allocation and update/return the existing one.
>> + */
>> + spin_unlock_irqrestore(&klp_shadow_lock, flags);
>> + kfree(new_shadow);
>> + goto err_exists;
>> + }
>> +
>> + /* No <obj, id> found, add the newly allocated one */
>> + klp_shadow_add(new_shadow);
>> + spin_unlock_irqrestore(&klp_shadow_lock, flags);
>> +
>> + return new_shadow->data;
>> +
>> +err_exists:
>> + WARN(1, "Duplicate shadow variable <%p, %lx>\n", obj, id);
>> +err:
>> + return NULL;
>> +}
>> +EXPORT_SYMBOL_GPL(klp_shadow_attach);
>> +
>> +/**
>> + * klp_shadow_get_or_attach() - get existing or attach a new shadow variable
>> + * @obj: pointer to parent object
>> + * @id: data identifier
>> + * @data: pointer to data to attach to parent
>> + * @size: size of attached data
>> + * @gfp_flags: GFP mask for allocation
>> + *
>> + * If an existing <obj, id> shadow variable can be found, it will be
>> + * used (but *not* updated) in the return value of this function.
>> + *
>> + * Allocates @size bytes for new shadow variable data using @gfp_flags
>> + * and copies @size bytes from @data into the new shadow variable's own
>> + * data space. If @data is NULL, @size bytes are still allocated, but
>> + * no copy is performed. The new shadow variable is then added to the
>> + * global hashtable.
>
> There is a lot of duplicated text here.

Copy/pasting -- it was a lot easier to keep the comments in sync with
the code as it was evolving through the revisions, especially with three
variants of the nearly same routine :)

> Also you need to read the first
> paragraph very carefully. Otherwise, you miss that the 2nd paragraph
> is true only in special situation.
>
> I would make it more clear, something like:
>
> It returns pointer to the shadow data when they already exist.
> Otherwise, it attaches and new shadow variable like
> klp_shadow_attach().
>
> It guarantees that only one shadow variable will exists with
> the given @id for the given @obj. Also it guarantees that
> the variable will be initialized by the given @data only when
> it did not exist before.

Good feedback, I'll incorporate something like this for the next version.

>> diff --git a/samples/livepatch/livepatch-shadow-fix1.c b/samples/livepatch/livepatch-shadow-fix1.c
>> new file mode 100644
>> index 000000000000..5acc838463d1
>> --- /dev/null
>> +++ b/samples/livepatch/livepatch-shadow-fix1.c
>> +void livepatch_fix1_dummy_free(struct dummy *d)
>> +{
>> + void **shadow_leak;
>> +
>> + /*
>> + * Patch: fetch the saved SV_LEAK shadow variable, detach and
>> + * free it. Note: handle cases where this shadow variable does
>> + * not exist (ie, dummy structures allocated before this livepatch
>> + * was loaded.)
>> + */
>> + shadow_leak = klp_shadow_get(d, SV_LEAK);
>> + if (shadow_leak) {
>> + klp_shadow_detach(d, SV_LEAK);
>> + kfree(*shadow_leak);
>
> This should get removed. The buffer used for the shadow variable
> is freed by kfree_rcu() called from klp_shadow_detach().
>
> Same problem is also in the other livepatch.
>
>> + pr_info("%s: dummy @ %p, prevented leak @ %p\n",
>> + __func__, d, *shadow_leak);
>
> This might access shadow_leak after it was (double) freed.
>
>> + } else {
>> + pr_info("%s: dummy @ %p leaked!\n", __func__, d);
>> + }
>> +
>> + kfree(d);
>> +}

Let me double check these double frees (though I have been running the
tests w/slub_debug poisoning turned on). At first glance, they look
like another holdover from the shadow-data-is-just-a pointer versions.

>
> I am sorry for the late review. I had vacation.

No worries, this is a good review!

> I know that this patch already got some acks. Most of my comments
> are about code clean up and tiny bugs that might be fixed later.
>
> But I would still suggests to re-evaluate the usefulness and
> logic of klp_shadow_update_or_attach(). I think that it was
> the primary reason for the many cut&pasted code. The locking
> logic is weird there. It does too many things at once because
> it also manipulates the existing data. All other functions just
> get pointer to the data and eventually initialize them when
> they did not exist before.

Without a good real-world example, you've convinced me that
klp_shadow_update_or_attach() could be dropped. I think this will also
simplify the requirements of a shared __klp_shadow_get_or_attach() like
you sketched out earlier.

If Josh and Miroslav don't mind, I'd like to continue churning this
patch with the suggestions that Petr has made.

-- Joe