Re: [PATCH v3 2/5] configfs-tsm: Introduce a shared ABI for attestation reports

From: Dan Williams
Date: Fri Sep 01 2023 - 14:47:40 EST


Thomas Gleixner wrote:
> On Thu, Aug 31 2023 at 15:13, Dan Williams wrote:
> > Dionna Amalie Glaze wrote:
> >> This is clean and approachable. Thanks for your implementation.
> >>
> >> > +static int try_advance_write_generation(struct tsm_report *report)
> >> > +{
> >> > + lockdep_assert_held_write(&tsm_rwsem);
> >> > +
> >> > + /*
> >> > + * malicious or broken userspace is attempting to wrap read_generation,
> >> > + * stop accepting updates until current report configuration is read.
> >> > + */
> >> > + if (report->write_generation == report->read_generation - 1)
> >> > + return -EBUSY;
> >> > + report->write_generation++;
> >> > + return 0;
> >> > +}
> >> > +
> >>
> >> This took me a while to wrap my head around.
> >> The property we want is that when we read outblob, it is the result of
> >> the attribute changes since the last read. If we write to an attribute
> >> 2^64 times, we could get write_generation to wrap around to equal
> >> read_generation without actually reading outblob. So we could end up
> >> given a badly cached result when we read outblob? Is that what this is
> >> preventing?
> >
> > Correct. The criticism of kernfs based interfaces like sysfs and
> > configfs is that there is no facility to atomically modify a set of
> > attributes at once. The expectated mitigation is simply that userspace
> > is well behaved. For example, 2 invocations of fdisk can confuse each
> > other, so userspace is expected to run them serially and the kernel
> > otherwise lets userspace do silly things.
> >
> > If a tool has any concern that it has exclusive ownership of the
> > interface this 'generation' attribute allows for a flow like:
> >
> > gen=$(cat $report/generation)
> > dd if=userdata > $report/inblob
> > cat $report/outblob > report
> > gen2=$(cat $report/generation)
> >
> > ...and if $gen2 is not $((gen + 1)) then tooling can detect that the
> > "userspace is well behaved" expectation was violated.
> >
> > Now configs is slightly better in this regard because objects can be
> > atomically created. So if 2 threads happen to pick the same name for the
> > report object then 'mkdir' will only succeed for one while the other
> > gets an EEXIST error. So for containers each can get their own
> > submission interface without worrying about other containers.
> >
> >> I think I would word this to say,
> >>
> >> "Malicious or broken userspace has written enough times for
> >> read_generation == write_generation by modular arithmetic without an
> >> interim read. Stop accepting updates until the current report
> >> configuration is read."
> >
> > That works for me.
>
> That's a pretty theoretical problem. Under the assumption that one
> syscall takes 50ns the wraparound on a 64bit variable will take ~29247
> years to complete.
>
> I think the important point is that the generation check there ensures
> that the expected sequence takes place and can't be overwritten by
> accident or malice, no?

Exactly. The "attack" / "bug" is hard to carry out, so this is more for
theoretical completeness than practical protection.