RE: [PATCH 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

From: Dexuan Cui
Date: Wed Mar 22 2023 - 10:10:41 EST


> From: Michael Kelley (LINUX) <mikelley@xxxxxxxxxxxxx>
> Sent: Monday, February 27, 2023 8:42 AM
> ...
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> @@ -76,6 +76,9 @@ struct io_tlb_slot {
> static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
> static unsigned long default_nareas;
>
> +static atomic_long_t total_used = ATOMIC_LONG_INIT(0);
> +static atomic_long_t used_hiwater = ATOMIC_LONG_INIT(0);
> ...
> @@ -594,6 +597,7 @@ static int swiotlb_do_find_slots(struct device *dev, int
> area_index,
> unsigned long flags;
> unsigned int slot_base;
> unsigned int slot_index;
> + unsigned long old_hiwater, new_used;
> ...
> @@ -663,6 +667,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> area_index,
> area->index = 0;
> area->used += nslots;
> spin_unlock_irqrestore(&area->lock, flags);
> +
> + new_used = atomic_long_add_return(nslots, &total_used);
> + old_hiwater = atomic_long_read(&used_hiwater);
> + do {
> + if (new_used <= old_hiwater)
> + break;
> + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater,
> new_used));

Here 'old_hiwater' is not updated in the loop, which looks suspicious to me.
Imagine the below scenario:

Initially total_used is 0, used_hiwater is 0.

Thread A: total_used = 10; new_used = 10; old_hiwater = 0;
Thread B: total_used = 20; new_used = 20; old_hiwater = 0;

Thread A enters the 'do' loop:
used_hiwater is 0; old_hiwater is 0, so used_hiwater = 10, and we
exit from the loop.

Thread B enters the 'do' loop:
new_used is 20, old_hiwater *always* is 0, used_hiwater is 10;
because used_hiwater always doesn't equal old_hiwater,
atomic_long_try_cmpxchg() always returns 0, and we get stuck in
the loop forever.

I think the line
+ old_hiwater = atomic_long_read(&used_hiwater);
should be moved into the loop to resolve the issue.

> +static int io_tlb_hiwater_set(void *data, u64 val)
> +{
> + /* Write of any value resets high water mark to zero */

Maybe it's better if we return -EINVAL when 'val' isn't 0 ?

> + atomic_long_set(&used_hiwater, 0);
> + return 0;
> +}