Re: [RFC PATCH v4] IV Generation algorithms for dm-crypt

From: Gilad Ben-Yossef
Date: Mon Mar 06 2017 - 09:39:21 EST


On Wed, Mar 1, 2017 at 5:38 PM, Milan Broz <gmazyland@xxxxxxxxx> wrote:
>
> On 03/01/2017 02:04 PM, Milan Broz wrote:
>> On 03/01/2017 01:42 PM, Gilad Ben-Yossef wrote:
>> ...
>>
>>> I can certainly understand if you don't want to take the patch until
>>> we have results with
>>> dm-crypt itself but the difference between 8 separate invocations of
>>> the engine for 512
>>> bytes of XTS and a single invocation for 4KB is pretty big.
>>
>> Yes, I know it. But the same can be achieved if we just implement
>> 4k sector encryption in dmcrypt. It is incompatible with LUKS1
>> (but next LUKS version will support it) but I think this is not
>> a problem for now.
>>
>> If the underlying device supports atomic write of 4k sectors, then
>> there should not be a problem.
>>
>> This is one of the speed-ups I would like to compare with the IV approach,
>> because everyone should benefit from 4k sectors in the end.
>> And no crypto API changes are needed here.
>>
>> (I have an old patch for this, so I will try to revive it.)
>
> If anyone is interested, a simple experimental patch for larger sector size
> (up to the page size) for dmcrypt is in this branch:
>
> http://git.kernel.org/cgit/linux/kernel/git/mbroz/linux.git/log/?h=dm-crypt-4k-sector
>
> It would be nice to check what performance gain could be provided
> by this simple approach.


I gave it a spin on an x86_64 machine with 8 CPUs with AES-NI using cryptd,
and on Arm using the CryptoCell hardware accelerator.

There was no difference in performance between the 512-byte and 4096-byte
cluster sizes on the x86_64 (800 MB loop file system).

There was a 3.2% improvement in latency going from the 512-byte to the
4096-byte cluster size on the Arm. I expect Binoy's patch to show the same
benefit in this test.
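
To put a number on the "8 separate invocations" point quoted above: the
request count per I/O block is just the block size divided by the sector
size, rounded up. A small illustrative sketch in Python (the helper name is
made up for this mail and is not anything in dm-crypt):

```python
def requests_per_block(block_size: int, sector_size: int) -> int:
    """Number of per-sector cipher requests needed to cover one I/O block."""
    if block_size <= 0 or sector_size <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partially filled sector still costs one request.
    return -(-block_size // sector_size)


if __name__ == "__main__":
    for sector_size in (512, 4096):
        count = requests_per_block(4096, sector_size)
        print(f"sector {sector_size}: {count} request(s) per 4096-byte block")
```

This only counts requests; it says nothing about how much each request
costs, which is what the measurements are for.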

In both cases the very naive test was a simple dd with a block size of
4096 bytes on the raw block device.
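
For anyone who wants to repeat something similar, a dd-style sequential
write can be sketched in Python as below. This is an illustration only and
not the command used for the numbers above; the function name, the default
count and the target path are all placeholders, and pointing it at a real
block device will overwrite whatever is there.

```python
import os
import time


def timed_write(path: str, block_size: int = 4096, count: int = 1024):
    """Write `count` zero-filled blocks of `block_size` bytes to `path`.

    Returns (total_bytes, elapsed_seconds). The fsync is inside the timed
    region so the page cache does not hide the cost of the write path.
    """
    buf = bytes(block_size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            written = 0
            # os.write may write fewer bytes than asked; loop until done.
            while written < block_size:
                written += os.write(fd, buf[written:])
        os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return block_size * count, elapsed
```

Running it once per cluster size against the same mapped device and
comparing the elapsed times gives the same kind of crude comparison as dd.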

I do not know what effect a bigger cluster size would have on
other, more complex file system operations.
Is there any specific benchmark worth testing with?


Gilad


--
Gilad Ben-Yossef
Chief Coffee Drinker

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
-- Jean-Baptiste Queru