Re: [RFC 3/3] block: use mm_huge_zero_folio in __blkdev_issue_zero_pages()
From: Pankaj Raghav
Date: Mon Jun 02 2025 - 11:48:24 EST
On 6/2/25 07:05, Christoph Hellwig wrote:
> On Tue, May 27, 2025 at 07:04:52AM +0200, Pankaj Raghav wrote:
>> Noticed a 4% increase in performance on a commercial NVMe SSD which does
>> not support OP_WRITE_ZEROES. The device's MDTS was 128K. The performance
>> gains might be bigger if the device supports bigger MDTS.
>
> Impressive gain on the one hand - on the other hand what is the macro
> workload that does a lot of zeroing on an SSD, because avoiding that
> should yield even better result while reducing wear..
>
Absolutely. I think using WRITE_ZEROES or DISCARD is the better option where
available, but I wanted a measurable workload to show the benefit of zeroing out
with a huge page. Interestingly, I have seen many client SSDs that do not
implement WRITE_ZEROES.
>> +	unsigned int len, added = 0;
>>
>> +	len = min_t(sector_t, folio_size(zero_folio),
>> +		    nr_sects << SECTOR_SHIFT);
>> +	if (bio_add_folio(bio, zero_folio, len, 0))
>> +		added = len;
>> 	if (added < len)
>> 		break;
>> 	nr_sects -= added >> SECTOR_SHIFT;
>
> Unless I'm missing something the added variable can go away now, and
> the code using it can simply use len.
>
Yes. This should do it:

	if (!bio_add_folio(bio, zero_folio, len, 0))
		break;
	nr_sects -= len >> SECTOR_SHIFT;
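
For illustration only, here is a minimal userspace sketch of the simplified
loop shape without the "added" variable. It is not the kernel code: struct
fake_bio, fake_bio_add() and fill_zero_chunks() are hypothetical stand-ins
for the bio and bio_add_folio(), and the byte capacity on the fake bio is an
assumption made purely so the early break can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, matching the kernel constant */

/* Hypothetical stand-in for a bio: just tracks how many bytes still fit. */
struct fake_bio {
	size_t room;	/* bytes the bio can still accept (assumed limit) */
	size_t filled;	/* bytes queued so far */
};

/* Hypothetical stand-in for bio_add_folio(): nonzero on success, 0 if full. */
static int fake_bio_add(struct fake_bio *bio, size_t len)
{
	if (len > bio->room)
		return 0;
	bio->room -= len;
	bio->filled += len;
	return 1;
}

/*
 * Queue nr_sects sectors of zeroes in chunks of at most folio_bytes.
 * Returns the number of sectors that could not be queued.
 */
static uint64_t fill_zero_chunks(struct fake_bio *bio, uint64_t nr_sects,
				 size_t folio_bytes)
{
	/* Chunks must be whole sectors, or the shift below would lose bytes. */
	assert(folio_bytes && folio_bytes % (1u << SECTOR_SHIFT) == 0);

	while (nr_sects) {
		uint64_t remaining = nr_sects << SECTOR_SHIFT;
		size_t len = remaining < folio_bytes ?
			     (size_t)remaining : folio_bytes;

		/* On failure nothing was added, so len is never "partial". */
		if (!fake_bio_add(bio, len))
			break;
		nr_sects -= len >> SECTOR_SHIFT;
	}
	return nr_sects;
}
```

With a 2M chunk and 3M left to zero, the sketch queues 2M and then 1M; if
the fake bio only has room for 2M, it stops after the first chunk and
reports the remaining sectors, which is the case the break covers.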
--
Pankaj