Re: [PATCH v2 1/2] livepatch: introduce shadow variable API

From: Joe Lawrence
Date: Tue Jul 18 2017 - 15:36:48 EST


On Tue, Jul 18, 2017 at 03:00:47PM +0200, Petr Mladek wrote:
> Date: Tue, 18 Jul 2017 15:00:47 +0200
> From: Petr Mladek <pmladek@xxxxxxxx>
> To: Miroslav Benes <mbenes@xxxxxxx>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>, Joe Lawrence
> <joe.lawrence@xxxxxxxxxx>, live-patching@xxxxxxxxxxxxxxx,
> linux-kernel@xxxxxxxxxxxxxxx, Jessica Yu <jeyu@xxxxxxxxxx>, Jiri Kosina
> <jikos@xxxxxxxxxx>
> Subject: Re: [PATCH v2 1/2] livepatch: introduce shadow variable API
> User-Agent: Mutt/1.5.21 (2010-09-15)
>
> On Mon 2017-07-17 17:35:38, Miroslav Benes wrote:
> > On Thu, 13 Jul 2017, Josh Poimboeuf wrote:
> >
> > > On Wed, Jun 28, 2017 at 11:37:26AM -0400, Joe Lawrence wrote:
> > >
> > > > +Brief API summary
> > > > +-----------------
> > > > +
> > > > +See the full API usage docbook notes in the livepatch/shadow.c
> > > > +implementation.
> > > > +
> > > > +An in-kernel hashtable references all of the shadow variables. These
> > > > +references are stored/retrieved through a <obj, num> key pair.
> > >
> > > "num" is rather vague, how about "key"?
>
> As Mirek said in the previous version. "obj" is the key for the hash
> table.
>
> Anyway, I agree that "num" is vague and even confusing. I would
> suggest to use "id".
>
> > > (And note, this and some of the other comments also apply to the code as
> > > well)
> > >
> > > > +* The klp_shadow variable data structure encapsulates both tracking
> > > > +meta-data and shadow-data:
> > > > + - meta-data
> > > > + - obj - pointer to original data
> > >
> > > Instead of "original data", how about calling it the "parent object"?
> > > That describes it better to me at least. "Original data" sounds like
> > > some of the data might be replaced.
> >
> > I agree that "original data" does not sound right. However, we use "parent
> > object" for vmlinux or a module in our code. But I don't have a better
> > name and "parent object" sounds good.
>
> What about "primary object"? I took inspiration from
> https://en.wikipedia.org/wiki/Shadow_table
>
> > > > + - num - numerical description of new data
> > >
> > > "numerical description of new data" sounds a little confusing, how about
> > > "unique identifier for new data"?
>
> Here we come to the "id" ;-)
>
> I wonder if each patch should register its own IDs and the size of the
> data. The API could shout when anyone wants to use a not yet
> registered ID or when the same ID with another size is being
> registered. It might increase security. But I am not sure
> if it is worth it.
>
>
> > > I'm also not sure about the phrase "new data". Maybe something like
> > > "new data field" would be more descriptive? Or just "new field"? I
> > > view it kind of like adding a field to a struct. Not a big deal either
> > > way.
> > >
> > > > +void *klp_shadow_attach(void *obj, unsigned long num, void *new_data,
> > > > + size_t new_size, gfp_t gfp_flags);
> > >
> > > It could be just me, but the "new_" prefixes threw me off a little bit.
> > > The new is implied anyway. How about just "data" and "size"?
> > >
> > > And the same comment for the klp_shadow struct.
> >
> > I agree with Josh on all of this.
>
> You persuaded me that "data" and "size" make sense after all ;-)
> new_obj would mean that we replace/copy the entire object.

Who knew naming things was so difficult :)

There's been a bunch of feedback on terminology, so I'll just issue a
collective reply to Petr's last msg on the topic. These were my
thoughts on naming clarification:

v1,v2 v3
--------------------------------------------------------------
obj, original data obj, parent object
num, numerical description of new data id, data identifier
new_data data
new_size data_size

Miroslav also suggested additional text explaining the id / data
identifier field. How about something like this:

---

================
Shadow Variables
================

...

A global, in-kernel hashtable associates parent pointers and a numeric
identifier with shadow variable data. Specifically, the parent pointer
serves as the hashtable key, while the numeric id further filters
hashtable queries. The numeric identifier is a simple enumeration that
may be used to describe shadow variable versions (for stacking
livepatches), class or type (for multiple shadow variables per parent),
etc. Multiple shadow variables may attach to the same parent object,
but their numeric identifier distinguises between them.

---

Thanks for the review suggestions, let me know if I missed any other
terms flagged in one of the other legs of the discussion thread.

-- Joe