Re: [PATCH v2 2/2] scsi: ufs: core: move some irq handling back to hardirq (with time limit)

From: André Draszik
Date: Mon Jul 28 2025 - 10:49:25 EST


On Mon, 2025-07-28 at 16:43 +0200, Neil Armstrong wrote:
> On 25/07/2025 16:16, André Draszik wrote:
> > Commit 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service
> > routine to a threaded IRQ handler") introduced a massive performance
> > drop for various work loads on UFSHC versions < 4 due to the extra
> > latency introduced by moving all of the IRQ handling into a threaded
> > handler. See below for a summary.
> >
> > To resolve this performance drop, move IRQ handling back into hardirq
> > context, but apply a time limit which, once expired, will cause the
> > remainder of the work to be deferred to the threaded handler.
> >
> > The above commit is trying to avoid unduly delaying other subsystems'
> > interrupts while the UFS events are being handled. By limiting the
> > amount of time spent in hardirq context, we can still ensure that.
> >
> > The time limit itself was chosen because I have generally seen
> > interrupt handling complete within 20 usecs, with occasional
> > spikes of a couple of hundred usecs.
> >
> > This commit brings UFS performance roughly back to its original
> > level, and should still avoid starving other subsystems thanks to
> > dealing with these spikes.
> >
> > fio results for 4k block size on Pixel 6, all values being the average
> > of 5 runs each:
> >    read / 1 job      original      after  this commit
> >      min IOPS        4,653.60   2,704.40     3,902.80
> >      max IOPS        6,151.80   4,847.60     6,103.40
> >      avg IOPS        5,488.82   4,226.61     5,314.89
> >      cpu % usr           1.85       1.72         1.97
> >      cpu % sys          32.46      28.88        33.29
> >      bw MB/s            21.46      16.50        20.76
> >
> >    read / 8 jobs     original      after  this commit
> >      min IOPS       18,207.80  11,323.00    17,911.80
> >      max IOPS       25,535.80  14,477.40    24,373.60
> >      avg IOPS       22,529.93  13,325.59    21,868.85
> >      cpu % usr           1.70       1.41         1.67
> >      cpu % sys          27.89      21.85        27.23
> >      bw MB/s            88.10      52.10        84.48
> >
> >    write / 1 job     original      after  this commit
> >      min IOPS        6,524.20   3,136.00     5,988.40
> >      max IOPS        7,303.60   5,144.40     7,232.40
> >      avg IOPS        7,169.80   4,608.29     7,014.66
> >      cpu % usr           2.29       2.34         2.23
> >      cpu % sys          41.91      39.34        42.48
> >      bw MB/s            28.02      18.00        27.42
> >
> >    write / 8 jobs    original      after  this commit
> >      min IOPS       12,685.40  13,783.00    12,622.40
> >      max IOPS       30,814.20  22,122.00    29,636.00
> >      avg IOPS       21,539.04  18,552.63    21,134.65
> >      cpu % usr           2.08       1.61         2.07
> >      cpu % sys          30.86      23.88        30.64
> >      bw MB/s            84.18      72.54        82.62
>
> Thanks for this updated change, I'm running the exact same run on SM8650 to check the impact,
> and I'll report something comparable.

Btw, my complete command was (should probably have added that
to the commit message in the first place):

for rw in read write ; do
    echo "rw: ${rw}"
    for jobs in 1 8 ; do
        echo "jobs: ${jobs}"
        for it in $(seq 1 5) ; do
            fio --name=rand${rw} --rw=rand${rw} \
                --ioengine=libaio --direct=1 \
                --bs=4k --numjobs=${jobs} --size=32m \
                --runtime=30 --time_based --end_fsync=1 \
                --group_reporting --filename=/foo \
                | grep -E '(iops|sys=|READ:|WRITE:)'
            sleep 5
        done
    done
done
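
The table values are averages over the 5 runs of each combination. A
minimal sketch of how that averaging could be scripted follows. It is not
necessarily how the numbers above were produced: avg_iops is a
hypothetical helper, and it assumes each run prints one fio line of the
shape "iops : min=..., max=..., avg=NNNN.NN, ...".

```shell
# Hypothetical helper (not part of fio): average the "avg=" field of the
# per-run "iops" lines read from stdin.
# Assumption: each run contributes one line shaped like
#   iops        : min= 4653, max= 6151, avg=5488.82, stdev=..., samples=..
avg_iops() {
    awk '/iops/ && match($0, /avg=[0-9.]+/) {
             # Skip the 4 characters of "avg=" and keep only the number.
             sum += substr($0, RSTART + 4, RLENGTH - 4)
             n++
         }
         END { if (n) printf "%.2f\n", sum / n }'
}
```

Piping the output of the 5 iterations of one rw/jobs combination into
avg_iops would then print a single averaged value; min and max could be
handled the same way by matching those fields instead.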

Cheers,
Andre'