RE: [PATCH 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

From: Michael Kelley (LINUX)
Date: Wed Mar 22 2023 - 10:40:02 EST


From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Wednesday, March 22, 2023 7:10 AM
>
> > From: Michael Kelley (LINUX) <mikelley@xxxxxxxxxxxxx>
> > Sent: Monday, February 27, 2023 8:42 AM
> > ...
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > @@ -76,6 +76,9 @@ struct io_tlb_slot {
> > static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
> > static unsigned long default_nareas;
> >
> > +static atomic_long_t total_used = ATOMIC_LONG_INIT(0);
> > +static atomic_long_t used_hiwater = ATOMIC_LONG_INIT(0);
> > ...
> > @@ -594,6 +597,7 @@ static int swiotlb_do_find_slots(struct device *dev, int
> > area_index,
> > unsigned long flags;
> > unsigned int slot_base;
> > unsigned int slot_index;
> > + unsigned long old_hiwater, new_used;
> > ...
> > @@ -663,6 +667,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> > area_index,
> > area->index = 0;
> > area->used += nslots;
> > spin_unlock_irqrestore(&area->lock, flags);
> > +
> > + new_used = atomic_long_add_return(nslots, &total_used);
> > + old_hiwater = atomic_long_read(&used_hiwater);
> > + do {
> > + if (new_used <= old_hiwater)
> > + break;
> > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
>
> Here 'old_hiwater' is not updated in the loop, which looks suspicious to me.

Actually, it *is* updated in the loop. The atomic_long_try_cmpxchg()
function updates old_hiwater to the current value if the exchange
fails. The address of old_hiwater is passed as the argument, and
not just the value, so that it can be updated. See the documentation
for "CMPXCHG vs TRY_CMPXCHG" in Documentation/atomic_t.txt,
which includes a usage example.

> Imagine the below scenario:
>
> Initially total_used is 0, used_hiwater is 0.
>
> Thread A: total_used = 10; new_used = 10; old_hiwater = 0;
> Thread B: total_used = 20; new_used = 20; old_hiwater = 0;
>
> Thread A enters the 'do' loop:
> used_hiwater is 0; old_hiwater is 0, so used_hiwater = 10, and we
> exit from the loop.
>
> Thread B enters the 'do' loop:
> new_used is 20, old_hiwater *always* is 0, used_hiwater is 10;
> because used_hiwater always doesn't equal old_hiwater,
> atomic_long_try_cmpxchg() always returns 0, and we get stuck in
> the loop forever.
>
> I think the line
> + old_hiwater = atomic_long_read(&used_hiwater);
> should be moved into the loop to resolve the issue.
>
> > +static int io_tlb_hiwater_set(void *data, u64 val)
> > +{
> > + /* Write of any value resets high water mark to zero */
>
> Maybe it's better if we return -EINVAL when 'val' isn't 0 ?

Yes, that would probably be clearer, to prevent someone from
thinking they could reset the value to something non-zero. We
*could* allow resetting the value to something non-zero, but I
don't see a real use case for that.

Thanks for the review ....

Michael

>
> > + atomic_long_set(&used_hiwater, 0);
> > + return 0;
> > +}