Re: [PATCH 2/4] crypto: qat - Remove zlib-deflate

From: Gao Xiang
Date: Tue Sep 05 2023 - 22:51:56 EST




On 2023/9/6 00:23, David Sterba wrote:
On Tue, Sep 05, 2023 at 02:15:44PM +0100, Giovanni Cabiddu wrote:
Hi Herbert,

On Wed, Aug 30, 2023 at 06:08:47PM +0800, Herbert Xu wrote:
Remove the implementation of zlib-deflate because it is completely
unused in the kernel.
We are working at a new revision of [1] which enables BTRFS to use acomp
for offloading zlib-deflate. We see that there is value in using QAT for
such use case in terms of throughput / CPU utilization / compression ratio
compared to software.
Zlib-deflate is preferred to deflate since BTRFS already uses that
format.

We expect to send this patch for 6.7.
Can we keep zlib-deflate in the kernel?

Thanks,

[1] https://patchwork.kernel.org/project/linux-btrfs/patch/1467083180-111750-1-git-send-email-weigang.li@xxxxxxxxx/

The patch is from 2016 and zlib though still supported has been
superseded by zstd that is from 2017. It would be good to see numbers
comparing zlib (cpu), zlib (qat) against relevant zstd levels. The
offloading might be an improvement and worth adding the support
otherwise I don't see much reason to add it unless there are users.

I can see there's QAT support for zstd too,
https://github.com/intel/QAT-ZSTD-Plugin, can't find one for lzo but in
case ther's QAT for all 3 algorithms used by btrfs I wouldn't mind
keeping the QAT support for zlib for parity.

Just my personal side note: from my own point of view, QAT actually only
has DEFLATE and LZ4 format hardware end-to-end (de)compression support [1]
(IAA actually has DEFLATE-family format only [2]).

They partially support compressing Zstd format with their internal
hardware lz4s + postprocessing pipeline (mostly a LZ77 matchfinder) by
using hardware (with hw_buffer_sz less than 128KiB. [3][4]) and Zstd
hardware decompression is currently not supported since there is no
such hardware format support.

I'm not saying new Zstd algorithm is not amazing. Yet from the hardware
perspective, I guess that due to gzip, the original zip, png, pdf, docx,
https, even pppoe all based on DEFLATE (you could see a lot other OSes
support DEFLATE-family format), so it's reasonble to resolve such
de-facto standard with limited hardware first to boost up data center
use cases. If they have abundant hardware chip room, I guess they will
consider Zstd as well.

As for zlib container format, I don't see zlib Adler-32 checksum is
useful since almost all hardware accelerator supports raw DEFLATE but
Adler-32 checksum is uncommon. If we consider better integrating
support, we should consider a more common hash (instead Adler-32
checksum) of or by using merkle tree to build the trust chain.

Actually we're working on EROFS raw deflate to enable IAA accelerator
support so we also hope IAA patchset could be landed upstream [5] so
I could upstream my work then. Therefore, I hope "hisilicon ZIP
driver" could support raw deflate too (after a quick search, their
decompression spend is 1530 MB/s compared with the original zlib
219 MB/s on their platform [6])

[1] https://github.com/intel/QATzip
[2] https://cdrdv2.intel.com/v1/dl/getContent/721858
[3] https://github.com/facebook/zstd/issues/455#issuecomment-1397204587
[4] https://github.com/intel/QATzip/blob/master/utils/qzstd.c#L211
[5] https://lore.kernel.org/r/20230807203726.1682123-1-tom.zanussi@xxxxxxxxxxxxxxx
[6] https://compare-intel-kunpeng.readthedocs.io/zh_CN/latest/accelerator.html

Thanks,
Gao Xiang