Re: [PATCH v4 0/7] lib/lzo: performance improvements

From: Dave Rodgman
Date: Fri Dec 07 2018 - 10:54:15 EST


Hi Markus,

On 06/12/2018 3:47 pm, Markus F.X.J. Oberhumer wrote:
> Request 3 - add lzo-rle; *NOT* acked by me
>
> [PATCH 6/8] lib/lzo: implement run-length encoding
> [PATCH 7/8] lib/lzo: separate lzo-rle from lzo
> [PATCH 8/8] zram: default to lzo-rle instead of lzo
>
> It (1) silently changes the compressed data format

I'm not sure this is relevant: as a separate algorithm, there's no reason
to retain the same format (although backwards compatibility can help with
migration). If you know of a way to improve the compatibility aspect
though, that would be great!

> (2) crashes on MIPS,

Please could you provide more detail? I tested on x86-32, x86-64, arm,
arm64 and big-endian MIPS64, but if there is an issue I missed I'd like to
address it.

> and (3) makes compression and decompression on typical data 10% slower on
> X86_64 with our internal benchmarks,

It is of course data-dependent. In my testing, as I mentioned previously, RLE
without the other patches does regress slightly on high-entropy data, but
offers a win on low-entropy data. For the right applications (e.g., zram),
this makes it overall beneficial.
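
To illustrate the data dependence, here is a minimal sketch of a toy
run-length encoder. It is NOT the lzo-rle format or the kernel code: the
(count, byte) pair encoding and the function names are assumptions made
purely for illustration, and it shows output size rather than throughput.

```python
import os


def toy_rle_encode(data: bytes) -> bytes:
    """Toy RLE: emit (run_length, byte) pairs, with runs capped at 255.

    Hypothetical encoding for illustration only; not the lzo-rle format.
    """
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        run = 1
        # Extend the run while the next byte matches and the count fits in one byte.
        while i + run < n and data[i + run] == byte and run < 255:
            run += 1
        out.append(run)
        out.append(byte)
        i += run
    return bytes(out)


def toy_rle_decode(encoded: bytes) -> bytes:
    """Inverse of toy_rle_encode: expand each (run_length, byte) pair."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        count = encoded[j]
        out.extend(encoded[j + 1:j + 2] * count)
    return bytes(out)


low_entropy = bytes(4096)        # a zero-filled page
high_entropy = os.urandom(4096)  # random bytes, almost no runs

for name, page in (("low-entropy", low_entropy), ("high-entropy", high_entropy)):
    encoded = toy_rle_encode(page)
    assert toy_rle_decode(encoded) == page
    print(f"{name}: {len(page)} -> {len(encoded)} bytes")
```

The zero-filled page collapses to a handful of pairs, while the random
page grows, because almost every byte needs its own pair. A real scheme
avoids most of that growth, but the general shape (a large win on long
runs, a small cost otherwise) is the trade-off described above.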

> and (4) has to be carefully checked for buffer overflows.

This was reviewed before being shared on LKML, and tested, but further
review is welcome.

> As a final comment, I question the quality of your benchmarks - combining
> arch-related ARM64 improvements and algorithmic changes into one
> benchmark comparison is just unprofessional marketing.

I felt it was helpful to show overall performance with the complete patchset:
this is what end-users experience. However, as you can see below, I also
previously shared a summary of the two main components of the patchset to
try and address this sort of concern:

>> As a quick summary of the impact of these patches on bigger chunks of
>> data, I've compared the performance of four different variants of lzo
>> on two large (~40 MB) files. The numbers show round-trip throughput
>> in MB/s:
>>
>> Variant        | Low-entropy | High-entropy
>> Current lzo    |         242 |          157
>> Arm opts       |         290 |          159
>> RLE            |         876 |          151
>> Arm opts + RLE |        1150 |          181
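
For reference, the ratios implied by the table quoted above can be
computed with a short script. This is only a sketch: the dictionary
layout and the function name are illustrative, and the only data used
are the throughput figures from the table.

```python
# Round-trip throughput in MB/s as (low-entropy, high-entropy),
# copied from the table quoted above.
throughput = {
    "Current lzo": (242, 157),
    "Arm opts": (290, 159),
    "RLE": (876, 151),
    "Arm opts + RLE": (1150, 181),
}


def relative_to_baseline(table, baseline="Current lzo"):
    """Return each variant's throughput as a multiple of the baseline's."""
    base_low, base_high = table[baseline]
    return {
        name: (low / base_low, high / base_high)
        for name, (low, high) in table.items()
    }


ratios = relative_to_baseline(throughput)
for name, (low, high) in ratios.items():
    print(f"{name:<16} low-entropy x{low:.2f}  high-entropy x{high:.2f}")
```

A ratio below 1.0 in the high-entropy column indicates a regression
relative to current lzo, and a ratio above 1.0 indicates a gain.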

cheers

Dave