Re: [RFC PATCH v2 1/2] printk-rb: add a new printk ringbuffer implementation

From: Andrea Parri
Date: Sun Jun 30 2019 - 10:09:07 EST


On Sun, Jun 30, 2019 at 04:03:34AM +0200, John Ogness wrote:
> On 2019-06-29, Andrea Parri <andrea.parri@xxxxxxxxxxxxxxxxxxxx> wrote:
> >> /**
> >> * add_descr_list() - Add a descriptor to the descriptor list.
> >> *
> >> * @e: An entry that has already reserved data.
> >> *
> >> * The provided entry contains a pointer to a descriptor that has already
> >> * been reserved for this entry. However, the reserved descriptor is not
> >> * yet on the list. Add this descriptor as the newest item.
> >> *
> >> * A descriptor is added in two steps. The first step is to make this
> >> * descriptor the newest. The second step is to update @next of the former
> >> * newest descriptor to point to this one (or set @oldest to this one if
> >> * this will be the first descriptor on the list).
> >> */
> >> static void add_descr_list(struct prb_reserved_entry *e)
> >> {
> >> struct printk_ringbuffer *rb = e->rb;
> >> struct prb_list *l = &rb->descr_list;
> >> struct prb_descr *d = e->descr;
> >> struct prb_descr *newest_d;
> >> unsigned long newest_id;
> >>
> >> WRITE_ONCE(d->next, EOL);
> >
> > /* C */
> >
> >
> >>
> >> do {
> >> newest_id = READ_ONCE(l->newest);
> >
> > /* A */
> >
> >
> >> newest_d = TO_DESC(rb, newest_id);
> >>
> >> if (newest_id == EOL) {
> >> WRITE_ONCE(d->seq, 1);
> >> } else {
> >> /*
> >> * MB5-read: synchronize setting newest descr
> >> *
> >> * context-pair: 2 writers adding a descriptor via
> >> * add_descr_list().
> >> *
> >> * @newest will load before @seq due to a data
> >> * dependency, therefore, the stores of @seq
> >> * and @next from the pairing MB5-write context
> >> * will be visible.
> >> *
> >> * Although @next is not loaded by this context,
> >> * this context must overwrite the stored @next
> >> * value of the pairing MB5-write context.
> >> */
> >> WRITE_ONCE(d->seq, READ_ONCE(newest_d->seq) + 1);
> >
> > /* B: this READ_ONCE() */
> >
> > Hence you're claiming a data dependency from A to B. (FWIW, the LKMM
> > would call "A ->dep B" an "address dependency.)
> >
> > This comment also claims that the "pairing MB5-write" orders "stores
> > of @seq and @next" (which are to different memory locations w.r.t. A
> > and B): I do not get why this access to @next (C above?, that's also
> > "unordered" w.r.t. A) can be relevant; can you elaborate?
>
> I will add some more labels to complete the picture. All these events
> are within this function:
>
> D: the WRITE_ONCE() to @seq
>
> E: the STORE of a successful cmpxchg() for @newest (the MB5-write
> cmpxchg())
>
> F: the STORE of a new @next (the last smp_store_release() of this
> function, note that the _release() is not relevant for this pair)
>
> The significant events for 2 contexts that are accessing the same
> addresses of a descriptor are:
>
> P0(struct desc *d0)
> {
> // adding a new descriptor d0
>
> WRITE_ONCE(d0->next, EOL); // C
> WRITE_ONCE(d0->seq, X); // D
> cmpxchg_release(newest, Y, indexof(d0)); // E
> }
>
> P1(struct desc *d1)
> {
> // adding a new descriptor d1 that comes after d0
>
> struct desc *d0;
> int r0, r1;
>
> r0 = READ_ONCE(newest); // A
> d0 = &array[r0];
> r1 = READ_ONCE(d0->seq); // B
> WRITE_ONCE(d0->next, Z); // F
> }
>
> d0 is the same address for P0 and P1. (The values of EOL, X, Y, Z are
> unrelated and irrelevant.)
>
> I am claiming that:
>
> - B comes after D
> - F comes after C

I think these are both assuming that A is reading the value stored by E
(shortly, "A reads from E")? If so, then the two claims become/are:

- If A reads from E, then B comes after D

- If A reads from E, then F comes after C

I think you could avoid the (ambiguous) "comes after" and say something
like:

(1) If A reads from E, then B reads from D (or from another store
to ->seq, not reported in the snippet, which overwrites D)

(2) If A reads from E, then F overwrites C

This, IIUC, for the informal descriptions of the (intended) guarantees.
Back to the pairings in question: AFAICT,

(a) For (1), we rely on the pairing:

RELEASE from D to E (matching) ADDRESS DEP. from A to B

(b) For (2), we rely on the pairing:

RELEASE from C to E (matching) ADDRESS DEP. from A to F

Does this make sense?


>
> >> }
> >>
> >> /*
> >> * MB5-write: synchronize setting newest descr
> >> *
> >> * context-pair: 2 writers adding a descriptor via
> >> * add_descr_list().
> >> *
> >> * Ensure that @next and @seq are stored before @d is
> >> * visible via @newest. The pairing MB5-read context
> >> * must load this @seq value and must overwrite this
> >> * @next value.
> >> */
> >> } while (cmpxchg_release(&l->newest, newest_id, e->id) != newest_id);
> >>
> >> if (unlikely(newest_id == EOL)) {
> >> /*
> >> * MB0-write: synchronize adding first descr
> >> *
> >> * context-pair: 1 writer adding the first descriptor via
> >> * add_descr_list(), 1 reader getting the beginning of
> >> * the list via iter_peek_next_id().
> >> *
> >> * This context recently assigned new values for @id,
> >> * @next, @seq. Ensure these are stored before the first
> >> * store to @oldest so that the new values are visible
> >> * to the reader in the pairing MB0-read context.
> >> *
> >> * Note: Before this store, the value of @oldest is EOL.
> >> */
> >
> > My gmail-search foo is unable to locate MB0-read: what am I missing?
> > Also, can you maybe annotate the memory accesses to @id, @next, @seq
> > and @oldest (as I did above)? I find myself guessing their location.
>
> Sorry. The MB0-read is a _new_ comment that would be added to the
> smp_rmb() of the reader functions. I didn't repost everything because I
> just wanted to get a feel if the comments for _this_ function are
> improving. Really all I care about right now is properly documenting
> MB5. It is a good example because MB5 is completely within this
> function. If I can satisfactorily document MB5, then I can post a new
> version with updated comments for everything.

Oh, I see, thanks for this clarification.


>
> >> smp_store_release(&l->oldest, e->id);
> >> } else {
> >> /*
> >> * MB6-write: synchronize linking new descr
> >> *
> >> * context-pair-1: 1 writer adding a descriptor via
> >> * add_descr_list(), 1 writer removing a descriptor via
> >> * remove_oldest_descr().
> >> *
> >> * If this is a recycled descriptor, this context
> >> * recently stored a new @oldest value. Ensure that
> >> * @oldest is stored before storing @next so that
> >> * if the pairing MB6-read context sees a non-EOL
> >> * @next value, it is ensured that it will also see
> >> * an updated @oldest value.
> >> *
> >> * context-pair-2: 1 writer adding a descriptor via
> >> * add_descr_list(), 1 reader iterating the list via
> >> * prb_iter_next_valid_entry().
> >> *
> >> * This context recently assigned new values for @id,
> >> * @next, @seq, @data, @data_next. Ensure these are
> >> * stored before storing @next of the previously
> >> * newest descriptor so that the new values are
> >> * visible to the iterating reader in the pairing
> >> * MB6-read context.
> >> *
> >> * Note: Before this store, the value of @next of the
> >> * previously newest descriptor is EOL.
> >> */
> >
> > Same as above but for MB6-read and the accesses to @id, @next, @seq,
> > @data, @data_next.
> >
> > In conclusion, I have been unable to produce litmus tests by reading
> > your comments (meaning I'm lost).
>
> I feel like I'm stating all the information, but nobody understands it.
> If you can help me to correctly document MB5, I can submit a new version
> with all the memory barriers correctly documented.

IMO (and assuming that what I wrote above makes some sense), (a-b) and
(1-2) above, together with the associated annotations of the code/ops,
provide all the desired and necessary information to document MB5.

For readability purposes, it could be nice to also keep the snippet you
provided above (but let me stress, again, that such a snippet should be
integrated with additional information as suggested above).

As to "where to insert the memory barrier documentation", I really have
no suggestion ATM. I guess someone would split it (say, before A and E)
while others could prefer to keep it within a same inline comment.

Thanks for this information (and for your patience!),

Andrea