Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

From: Peter Zijlstra
Date: Wed Jul 17 2019 - 11:09:49 EST


On Wed, Jul 17, 2019 at 10:27:30AM -0400, Alex Kogan wrote:
> > On Jul 17, 2019, at 4:39 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> > static void cna_splice_tail(struct cna_node *cn, struct cna_node *head, struct cna_node *tail)
> > {
> > struct cna_node *list;
> >
> > /* remove [head,tail] */
> > WRITE_ONCE(cn->mcs.next, tail->mcs.next);
> > tail->mcs.next = NULL;
> >
> > /* stick [head,tail] on the secondary list tail */
> > if (cn->mcs.locked <= 1) {
> > /* create secondary list */
> > head->tail = tail;
> > cn->mcs.locked = head->encoded_tail;
> > } else {
> > /* add to tail */
> > list = (struct cna_node *)decode_tail(cn->mcs.locked);
> > list->tail->next = head;
> > list->tail = tail;
> > }
> > }
> >
> > static struct cna_node *cna_find_next(struct mcs_spinlock *node)
> > {
> > struct cna_node *cni, *cn = (struct cna_node *)node;
> > struct cna_node *head, *tail = NULL;
> >
> > /* find any next lock from 'our' node */
> > for (head = cni = (struct cna_node *)READ_ONCE(cn->mcs.next);
> > cni && cni->node != cn->node;
> > tail = cni, cni = (struct cna_node *)READ_ONCE(cni->mcs.next))
> > ;
> >
> > /* when found, splice any skipped locks onto the secondary list */
> > if (cni && tail)
> > cna_splice_tail(cn, head, tail);
> >
> > return cni;
> > }
> >
> > How's that?
>
> This is almost perfect!! :)
>
> The above should work, but I think we should have a specialized fast-path for
> checking the immediate next node in the main queue. This would be the common
> case, once we splice âotherâ nodes onto the secondary queue. In the above we
> would go through four branches before returning from cna_find_next(). In the
> following we would have just one:

Right, but can you measure a difference? ;-) Anyway, no real objection,
just playing devils advocate here.

> > static struct cna_node *cna_find_next(struct mcs_spinlock *node)
> > {
> > struct cna_node *cn = (struct cna_node *)node;
> struct cna_node *cni = (struct cna_node *)READ_ONCE(cn->mcs.next);
> >
> > struct cna_node *head, *tail = NULL;
> >
> /* fast path */
> if (cni->node == cn->node)
> return cni;
>
> > /* find any next lock from 'our' node */
> > for (head = cn->mcs.next;
head = cni,

you just did that load :-)

> > cni && cni->node != cn->node;
> > tail = cni, cni = (struct cna_node *)READ_ONCE(cni->mcs.next))
> > ;
> >
> > /* when found, splice any skipped locks onto the secondary list */
> > if (cni && tail)
> > cna_splice_tail(cn, head, tail);
> >
> > return cni;
> > }
>
>
> Also, any reason you say âlockâ instead of ânodeâ in the comments?
> I.e., I think "when found, splice any skipped locks onto the secondary listâ should be
> "when found, splice any skipped nodes onto the secondary listâ.

Due to the confusion between lock waiter node and numa node :-)