Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver

From: liulongfang
Date: Mon Aug 08 2022 - 07:49:31 EST


On 2022/8/8 17:30, Neal Liu wrote:
>> From: liulongfang <liulongfang@xxxxxxxxxx>
>> Sent: Monday, August 8, 2022 10:53 AM
>> Subject: Re: [PATCH v8 1/5] crypto: aspeed: Add HACE hash driver
>>
>> On 2022/7/26 19:34, Neal Liu wrote:
>>> Hash and Crypto Engine (HACE) is designed to accelerate the
>>> throughput of hash data digest, encryption, and decryption.
>>>
>>> Basically, HACE can be divided into two independent engines
>>> - Hash Engine and Crypto Engine. This patch adds the HACE
>>> hash engine driver for the hash accelerator.
>>>
>>> Signed-off-by: Neal Liu <neal_liu@xxxxxxxxxxxxxx>
>>> Signed-off-by: Johnny Huang <johnny_huang@xxxxxxxxxxxxxx>
>>> ---
>>> MAINTAINERS | 7 +
>>> drivers/crypto/Kconfig | 1 +
>>> drivers/crypto/Makefile | 1 +
>>> drivers/crypto/aspeed/Kconfig | 32 +
>>> drivers/crypto/aspeed/Makefile | 6 +
>>> drivers/crypto/aspeed/aspeed-hace-hash.c | 1389 ++++++++++++++++++++++
>>> drivers/crypto/aspeed/aspeed-hace.c | 213 ++++
>>> drivers/crypto/aspeed/aspeed-hace.h | 186 +++
>>> 8 files changed, 1835 insertions(+)
>>> create mode 100644 drivers/crypto/aspeed/Kconfig
>>> create mode 100644 drivers/crypto/aspeed/Makefile
>>> create mode 100644 drivers/crypto/aspeed/aspeed-hace-hash.c
>>> create mode 100644 drivers/crypto/aspeed/aspeed-hace.c
>>> create mode 100644 drivers/crypto/aspeed/aspeed-hace.h
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index f55aea311af5..23a0215b7e42 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -3140,6 +3140,13 @@ S: Maintained
>>> F: Documentation/devicetree/bindings/media/aspeed-video.txt
>>> F: drivers/media/platform/aspeed/
>>>
>>> +ASPEED CRYPTO DRIVER
>>> +M: Neal Liu <neal_liu@xxxxxxxxxxxxxx>
>>> +L: linux-aspeed@xxxxxxxxxxxxxxxx (moderated for non-subscribers)
>>> +S: Maintained
>>> +F: Documentation/devicetree/bindings/crypto/aspeed,ast2500-hace.yaml
>>> +F: drivers/crypto/aspeed/
>>> +
>>> ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS
>>> M: Corentin Chary <corentin.chary@xxxxxxxxx>
>>> L: acpi4asus-user@xxxxxxxxxxxxxxxxxxxxx
>>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
>>> index ee99c02c84e8..b9f5ee126881 100644
>>> --- a/drivers/crypto/Kconfig
>>> +++ b/drivers/crypto/Kconfig
>>> @@ -933,5 +933,6 @@ config CRYPTO_DEV_SA2UL
>>> acceleration for cryptographic algorithms on these devices.
>>>
>>> source "drivers/crypto/keembay/Kconfig"
>>> +source "drivers/crypto/aspeed/Kconfig"
>>>
>>> endif # CRYPTO_HW
>>> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
>>> index f81703a86b98..116de173a66c 100644
>>> --- a/drivers/crypto/Makefile
>>> +++ b/drivers/crypto/Makefile
>>> @@ -1,5 +1,6 @@
>>> # SPDX-License-Identifier: GPL-2.0
>>> obj-$(CONFIG_CRYPTO_DEV_ALLWINNER) += allwinner/
>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed/
>>> obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
>>> obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
>>> obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
>>> diff --git a/drivers/crypto/aspeed/Kconfig b/drivers/crypto/aspeed/Kconfig
>>> new file mode 100644
>>> index 000000000000..059e627efef8
>>> --- /dev/null
>>> +++ b/drivers/crypto/aspeed/Kconfig
>>> @@ -0,0 +1,32 @@
>>> +config CRYPTO_DEV_ASPEED
>>> + tristate "Support for Aspeed cryptographic engine driver"
>>> + depends on ARCH_ASPEED
>>> + help
>>> + Hash and Crypto Engine (HACE) is designed to accelerate the
>>> + throughput of hash data digest, encryption and decryption.
>>> +
>>> + Select y here to have support for the cryptographic driver
>>> + available on Aspeed SoCs.
>>> +
>>> +config CRYPTO_DEV_ASPEED_HACE_HASH
>>> + bool "Enable Aspeed Hash & Crypto Engine (HACE) hash"
>>> + depends on CRYPTO_DEV_ASPEED
>>> + select CRYPTO_ENGINE
>>> + select CRYPTO_SHA1
>>> + select CRYPTO_SHA256
>>> + select CRYPTO_SHA512
>>> + select CRYPTO_HMAC
>>> + help
>>> + Select here to enable Aspeed Hash & Crypto Engine (HACE)
>>> + hash driver.
>>> + Supports multiple message digest standards, including
>>> + SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, and so on.
>>> +
>>> +config CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
>>> + bool "Enable HACE hash debug messages"
>>> + depends on CRYPTO_DEV_ASPEED_HACE_HASH
>>> + help
>>> + Print HACE hash debugging messages if you use this option
>>> + to ask for those messages.
>>> + Avoid enabling this option for production builds to
>>> + minimize driver overhead.
>>> diff --git a/drivers/crypto/aspeed/Makefile b/drivers/crypto/aspeed/Makefile
>>> new file mode 100644
>>> index 000000000000..8bc8d4fed5a9
>>> --- /dev/null
>>> +++ b/drivers/crypto/aspeed/Makefile
>>> @@ -0,0 +1,6 @@
>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED) += aspeed_crypto.o
>>> +aspeed_crypto-objs := aspeed-hace.o \
>>> + $(hace-hash-y)
>>> +
>>> +obj-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) += aspeed-hace-hash.o
>>> +hace-hash-$(CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH) := aspeed-hace-hash.o
>>> diff --git a/drivers/crypto/aspeed/aspeed-hace-hash.c b/drivers/crypto/aspeed/aspeed-hace-hash.c
>>> new file mode 100644
>>> index 000000000000..63a8ad694996
>>> --- /dev/null
>>> +++ b/drivers/crypto/aspeed/aspeed-hace-hash.c
>>> @@ -0,0 +1,1389 @@
>>> +// SPDX-License-Identifier: GPL-2.0+
>>> +/*
>>> + * Copyright (c) 2021 Aspeed Technology Inc.
>>> + */
>>> +
>>> +#include "aspeed-hace.h"
>>> +
>>> +#ifdef CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH_DEBUG
>>> +#define AHASH_DBG(h, fmt, ...) \
>>> + dev_info((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
>>> +#else
>>> +#define AHASH_DBG(h, fmt, ...) \
>>> + dev_dbg((h)->dev, "%s() " fmt, __func__, ##__VA_ARGS__)
>>> +#endif
>>> +
>>> +/* Initialization Vectors for SHA-family */
>>> +static const __be32 sha1_iv[8] = {
>>> + cpu_to_be32(SHA1_H0), cpu_to_be32(SHA1_H1),
>>> + cpu_to_be32(SHA1_H2), cpu_to_be32(SHA1_H3),
>>> + cpu_to_be32(SHA1_H4), 0, 0, 0
>>> +};
>>> +
>>> +static const __be32 sha224_iv[8] = {
>>> + cpu_to_be32(SHA224_H0), cpu_to_be32(SHA224_H1),
>>> + cpu_to_be32(SHA224_H2), cpu_to_be32(SHA224_H3),
>>> + cpu_to_be32(SHA224_H4), cpu_to_be32(SHA224_H5),
>>> + cpu_to_be32(SHA224_H6), cpu_to_be32(SHA224_H7),
>>> +};
>>> +
>>> +static const __be32 sha256_iv[8] = {
>>> + cpu_to_be32(SHA256_H0), cpu_to_be32(SHA256_H1),
>>> + cpu_to_be32(SHA256_H2), cpu_to_be32(SHA256_H3),
>>> + cpu_to_be32(SHA256_H4), cpu_to_be32(SHA256_H5),
>>> + cpu_to_be32(SHA256_H6), cpu_to_be32(SHA256_H7),
>>> +};
>>> +
>>> +static const __be64 sha384_iv[8] = {
>>> + cpu_to_be64(SHA384_H0), cpu_to_be64(SHA384_H1),
>>> + cpu_to_be64(SHA384_H2), cpu_to_be64(SHA384_H3),
>>> + cpu_to_be64(SHA384_H4), cpu_to_be64(SHA384_H5),
>>> + cpu_to_be64(SHA384_H6), cpu_to_be64(SHA384_H7)
>>> +};
>>> +
>>> +static const __be64 sha512_iv[8] = {
>>> + cpu_to_be64(SHA512_H0), cpu_to_be64(SHA512_H1),
>>> + cpu_to_be64(SHA512_H2), cpu_to_be64(SHA512_H3),
>>> + cpu_to_be64(SHA512_H4), cpu_to_be64(SHA512_H5),
>>> + cpu_to_be64(SHA512_H6), cpu_to_be64(SHA512_H7)
>>> +};
>>> +
>>> +static const __be32 sha512_224_iv[16] = {
>>> + cpu_to_be32(0xC8373D8CUL), cpu_to_be32(0xA24D5419UL),
>>> + cpu_to_be32(0x6699E173UL), cpu_to_be32(0xD6D4DC89UL),
>>> + cpu_to_be32(0xAEB7FA1DUL), cpu_to_be32(0x829CFF32UL),
>>> + cpu_to_be32(0x14D59D67UL), cpu_to_be32(0xCF9F2F58UL),
>>> + cpu_to_be32(0x692B6D0FUL), cpu_to_be32(0xA84DD47BUL),
>>> + cpu_to_be32(0x736FE377UL), cpu_to_be32(0x4289C404UL),
>>> + cpu_to_be32(0xA8859D3FUL), cpu_to_be32(0xC8361D6AUL),
>>> + cpu_to_be32(0xADE61211UL), cpu_to_be32(0xA192D691UL)
>>> +};
>>> +
>>> +static const __be32 sha512_256_iv[16] = {
>>> + cpu_to_be32(0x94213122UL), cpu_to_be32(0x2CF72BFCUL),
>>> + cpu_to_be32(0xA35F559FUL), cpu_to_be32(0xC2644CC8UL),
>>> + cpu_to_be32(0x6BB89323UL), cpu_to_be32(0x51B1536FUL),
>>> + cpu_to_be32(0x19773896UL), cpu_to_be32(0xBDEA4059UL),
>>> + cpu_to_be32(0xE23E2896UL), cpu_to_be32(0xE3FF8EA8UL),
>>> + cpu_to_be32(0x251E5EBEUL), cpu_to_be32(0x92398653UL),
>>> + cpu_to_be32(0xFC99012BUL), cpu_to_be32(0xAAB8852CUL),
>>> + cpu_to_be32(0xDC2DB70EUL), cpu_to_be32(0xA22CC581UL)
>>> +};
>>> +
>>> +/* The purpose of this padding is to ensure that the padded message is a
>>> + * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
>>> + * The bit "1" is appended at the end of the message, followed by
>>> + * "padlen-1" zero bits. Then a 64-bit block (SHA1/SHA224/SHA256) or
>>> + * 128-bit block (SHA384/SHA512) equal to the message length in bits
>>> + * is appended.
>>> + *
>>> + * For SHA1/SHA224/SHA256, padlen is calculated as follows:
>>> + * - if message length < 56 bytes then padlen = 56 - message length
>>> + * - else padlen = 64 + 56 - message length
>>> + *
>>> + * For SHA384/SHA512, padlen is calculated as follows:
>>> + * - if message length < 112 bytes then padlen = 112 - message length
>>> + * - else padlen = 128 + 112 - message length
>>> + */
>>> +static void aspeed_ahash_fill_padding(struct aspeed_hace_dev *hace_dev,
>>> + struct aspeed_sham_reqctx *rctx)
>>> +{
>>> + unsigned int index, padlen;
>>> + __be64 bits[2];
>>> +
>>> + AHASH_DBG(hace_dev, "rctx flags:0x%x\n", (u32)rctx->flags);
>>> +
>>> + switch (rctx->flags & SHA_FLAGS_MASK) {
>>> + case SHA_FLAGS_SHA1:
>>> + case SHA_FLAGS_SHA224:
>>> + case SHA_FLAGS_SHA256:
>>> + bits[0] = cpu_to_be64(rctx->digcnt[0] << 3);
>>> + index = rctx->bufcnt & 0x3f;
>>> + padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
>>> + *(rctx->buffer + rctx->bufcnt) = 0x80;
>>> + memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
>>> + memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 8);
>>> + rctx->bufcnt += padlen + 8;
>>> + break;
>>> + default:
>>> + bits[1] = cpu_to_be64(rctx->digcnt[0] << 3);
>>> + bits[0] = cpu_to_be64(rctx->digcnt[1] << 3 |
>>> + rctx->digcnt[0] >> 61);
>>> + index = rctx->bufcnt & 0x7f;
>>> + padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
>>> + *(rctx->buffer + rctx->bufcnt) = 0x80;
>>> + memset(rctx->buffer + rctx->bufcnt + 1, 0, padlen - 1);
>>> + memcpy(rctx->buffer + rctx->bufcnt + padlen, bits, 16);
>>> + rctx->bufcnt += padlen + 16;
>>> + break;
>>> + }
>>> +}
>>> +
>>> +/*
>>> + * Prepare DMA buffer before hardware engine
>>> + * processing.
>>> + */
>>> +static int aspeed_ahash_dma_prepare(struct aspeed_hace_dev *hace_dev)
>>> +{
>>> + struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> + struct ahash_request *req = hash_engine->req;
>>> + struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> + int length, remain;
>>> +
>>> + length = rctx->total + rctx->bufcnt;
>>> + remain = length % rctx->block_size;
>>> +
>>> + AHASH_DBG(hace_dev, "length:0x%x, remain:0x%x\n", length, remain);
>>> +
>>> + if (rctx->bufcnt)
>>> + memcpy(hash_engine->ahash_src_addr, rctx->buffer, rctx->bufcnt);
>>> +
>>> + if (rctx->total + rctx->bufcnt < ASPEED_CRYPTO_SRC_DMA_BUF_LEN) {
>>> + scatterwalk_map_and_copy(hash_engine->ahash_src_addr +
>>> + rctx->bufcnt, rctx->src_sg,
>>> + rctx->offset, rctx->total - remain, 0);
>>> + rctx->offset += rctx->total - remain;
>>> +
>>> + } else {
>>> + dev_warn(hace_dev->dev, "Hash data length is too large\n");
>>> + return -EINVAL;
>>> + }
>>> +
>>> + scatterwalk_map_and_copy(rctx->buffer, rctx->src_sg,
>>> + rctx->offset, remain, 0);
>>> +
>>> + rctx->bufcnt = remain;
>>> + rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
>>> + SHA512_DIGEST_SIZE,
>>> + DMA_BIDIRECTIONAL);
>>> + if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
>>> + dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
>>> + return -ENOMEM;
>>> + }
>>> +
>>> + hash_engine->src_length = length - remain;
>>> + hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
>>> + hash_engine->digest_dma = rctx->digest_dma_addr;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +/*
>>> + * Prepare DMA buffer as SG list buffer before
>>> + * hardware engine processing.
>>> + */
>>> +static int aspeed_ahash_dma_prepare_sg(struct aspeed_hace_dev *hace_dev)
>>> +{
>>> + struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> + struct ahash_request *req = hash_engine->req;
>>> + struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> + struct aspeed_sg_list *src_list;
>>> + struct scatterlist *s;
>>> + int length, remain, sg_len, i;
>>> + int rc = 0;
>>> +
>>> + remain = (rctx->total + rctx->bufcnt) % rctx->block_size;
>>> + length = rctx->total + rctx->bufcnt - remain;
>>> +
>>> + AHASH_DBG(hace_dev, "%s:0x%x, %s:0x%x, %s:0x%x, %s:0x%x\n",
>>> + "rctx total", rctx->total, "bufcnt", rctx->bufcnt,
>>> + "length", length, "remain", remain);
>>> +
>>> + sg_len = dma_map_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
>>> + DMA_TO_DEVICE);
>>> + if (!sg_len) {
>>> + dev_warn(hace_dev->dev, "dma_map_sg() src error\n");
>>> + rc = -ENOMEM;
>>> + goto end;
>>> + }
>>> +
>>> + src_list = (struct aspeed_sg_list *)hash_engine->ahash_src_addr;
>>> + rctx->digest_dma_addr = dma_map_single(hace_dev->dev, rctx->digest,
>>> + SHA512_DIGEST_SIZE,
>>> + DMA_BIDIRECTIONAL);
>>> + if (dma_mapping_error(hace_dev->dev, rctx->digest_dma_addr)) {
>>> + dev_warn(hace_dev->dev, "dma_map() rctx digest error\n");
>>> + rc = -ENOMEM;
>>> + goto free_src_sg;
>>> + }
>>> +
>>> + if (rctx->bufcnt != 0) {
>>> + rctx->buffer_dma_addr = dma_map_single(hace_dev->dev,
>>> + rctx->buffer,
>>> + rctx->block_size * 2,
>>> + DMA_TO_DEVICE);
>>> + if (dma_mapping_error(hace_dev->dev, rctx->buffer_dma_addr)) {
>>> + dev_warn(hace_dev->dev, "dma_map() rctx buffer error\n");
>>> + rc = -ENOMEM;
>>> + goto free_rctx_digest;
>>> + }
>>> +
>>> + src_list[0].phy_addr = rctx->buffer_dma_addr;
>>> + src_list[0].len = rctx->bufcnt;
>>> + length -= src_list[0].len;
>>> +
>>> + /* Last sg list */
>>> + if (length == 0)
>>> + src_list[0].len |= HASH_SG_LAST_LIST;
>>> +
>>> + src_list[0].phy_addr = cpu_to_le32(src_list[0].phy_addr);
>>> + src_list[0].len = cpu_to_le32(src_list[0].len);
>>> + src_list++;
>>> + }
>>> +
>>> + if (length != 0) {
>>> + for_each_sg(rctx->src_sg, s, sg_len, i) {
>>> + src_list[i].phy_addr = sg_dma_address(s);
>>> +
>>> + if (length > sg_dma_len(s)) {
>>> + src_list[i].len = sg_dma_len(s);
>>> + length -= sg_dma_len(s);
>>> +
>>> + } else {
>>> + /* Last sg list */
>>> + src_list[i].len = length;
>>> + src_list[i].len |= HASH_SG_LAST_LIST;
>>> + length = 0;
>>> + }
>>> +
>>> + src_list[i].phy_addr = cpu_to_le32(src_list[i].phy_addr);
>>> + src_list[i].len = cpu_to_le32(src_list[i].len);
>>> + }
>>> + }
>>> +
>>> + if (length != 0) {
>>> + rc = -EINVAL;
>>> + goto free_rctx_buffer;
>>> + }
>>> +
>>> + rctx->offset = rctx->total - remain;
>>> + hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
>>> + hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
>>> + hash_engine->digest_dma = rctx->digest_dma_addr;
>>> +
>>> + goto end;
>> Exiting via "goto xx" is not recommended on the normal code path (it
>> requires two jumps); exiting via "return 0" directly is more efficient.
>> This pattern appears many times throughout the driver; please consider
>> reworking it.
>
> If we don't exit via "goto xx", how can the related resources be released safely?
> Is there a proper way to do this?
Maybe I didn't describe it clearly enough: "normal code logic" means the
success path (rc = 0). In that scenario (rc = 0), "goto end" is no longer
required; it can be replaced with a direct "return 0".
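
For example, a minimal sketch of the suggested change (not the actual patch),
applied to the success path of aspeed_ahash_dma_prepare_sg():

	rctx->offset = rctx->total - remain;
	hash_engine->src_length = rctx->total + rctx->bufcnt - remain;
	hash_engine->src_dma = hash_engine->ahash_src_dma_addr;
	hash_engine->digest_dma = rctx->digest_dma_addr;

	/* Success: return directly instead of jumping to the "end" label */
	return 0;

The error paths would still unwind through the existing "free_*" labels.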
>
>>> +
>>> +free_rctx_buffer:
>>> + if (rctx->bufcnt != 0)
>>> + dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
>>> + rctx->block_size * 2, DMA_TO_DEVICE);
>>> +free_rctx_digest:
>>> + dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
>>> + SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
>>> +free_src_sg:
>>> + dma_unmap_sg(hace_dev->dev, rctx->src_sg, rctx->src_nents,
>>> + DMA_TO_DEVICE);
>>> +end:
>>> + return rc;
>>> +}
>>> +
>>> +static int aspeed_ahash_complete(struct aspeed_hace_dev *hace_dev)
>>> +{
>>> + struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> + struct ahash_request *req = hash_engine->req;
>>> +
>>> + AHASH_DBG(hace_dev, "\n");
>>> +
>>> + hash_engine->flags &= ~CRYPTO_FLAGS_BUSY;
>>> +
>>> + crypto_finalize_hash_request(hace_dev->crypt_engine_hash, req, 0);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +/*
>>> + * Copy digest to the corresponding request result.
>>> + * This function will be called at final() stage.
>>> + */
>>> +static int aspeed_ahash_transfer(struct aspeed_hace_dev *hace_dev)
>>> +{
>>> + struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> + struct ahash_request *req = hash_engine->req;
>>> + struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> +
>>> + AHASH_DBG(hace_dev, "\n");
>>> +
>>> + dma_unmap_single(hace_dev->dev, rctx->digest_dma_addr,
>>> + SHA512_DIGEST_SIZE, DMA_BIDIRECTIONAL);
>>> +
>>> + dma_unmap_single(hace_dev->dev, rctx->buffer_dma_addr,
>>> + rctx->block_size * 2, DMA_TO_DEVICE);
>>> +
>>> + memcpy(req->result, rctx->digest, rctx->digsize);
>>> +
>>> + return aspeed_ahash_complete(hace_dev);
>>> +}
>>> +
>>> +/*
>>> + * Trigger hardware engines to do the math.
>>> + */
>>> +static int aspeed_hace_ahash_trigger(struct aspeed_hace_dev *hace_dev,
>>> + aspeed_hace_fn_t resume)
>>> +{
>>> + struct aspeed_engine_hash *hash_engine = &hace_dev->hash_engine;
>>> + struct ahash_request *req = hash_engine->req;
>>> + struct aspeed_sham_reqctx *rctx = ahash_request_ctx(req);
>>> +
>>> + AHASH_DBG(hace_dev, "src_dma:0x%x, digest_dma:0x%x,
>> length:0x%x\n",
>>> + hash_engine->src_dma, hash_engine->digest_dma,
>>> + hash_engine->src_length);
>>> +
>>> + rctx->cmd |= HASH_CMD_INT_ENABLE;
>>> + hash_engine->resume = resume;
>>> +
>>> + ast_hace_write(hace_dev, hash_engine->src_dma, ASPEED_HACE_HASH_SRC);
>>> + ast_hace_write(hace_dev, hash_engine->digest_dma,
>>> + ASPEED_HACE_HASH_DIGEST_BUFF);
>>> + ast_hace_write(hace_dev, hash_engine->digest_dma,
>>> + ASPEED_HACE_HASH_KEY_BUFF);
>>> + ast_hace_write(hace_dev, hash_engine->src_length,
>>> + ASPEED_HACE_HASH_DATA_LEN);
>>> +
>>> + /* Memory barrier to ensure all data setup before engine starts */
>>> + mb();
>>> +
>>> + ast_hace_write(hace_dev, rctx->cmd, ASPEED_HACE_HASH_CMD);
>> Sending one hardware request takes five register writes to complete.
>> In a concurrent scenario, how do you ensure the ordering of these writes?
>> (If two processes submit hardware tasks at the same time, how does the
>> hardware recognize which task a given write belongs to?)
>
> The Linux crypto engine guarantees that only one request at a time is dequeued from the engine queue for processing.
> There is a locking mechanism inside the Linux crypto engine that prevents the scenario you mentioned.
> So only one aspeed_hace_ahash_trigger() hardware service goes through at a time.
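>
> For illustration, a minimal sketch (not the exact driver code) of how every
> request is funneled through the engine queue; it assumes a tfm context
> (aspeed_sham_ctx) that stores the hace_dev pointer:
>
>	static int aspeed_sham_update(struct ahash_request *req)
>	{
>		struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
>		struct aspeed_sham_ctx *tctx = crypto_ahash_ctx(tfm);
>		struct aspeed_hace_dev *hace_dev = tctx->hace_dev;
>
>		/* Enqueue only; the hardware is never touched here */
>		return crypto_transfer_hash_request_to_engine(
>				hace_dev->crypt_engine_hash, req);
>	}
>
> The engine's worker dequeues one request at a time and calls the driver's
> do_one_request() callback, so the five register writes of one trigger always
> complete before the next request is started.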
>
> [...]
> .
>
You may have misunderstood what I mean. The command flow in a normal scenario is:
request_A: Acmd1-->Acmd2-->Acmd3-->Acmd4-->Acmd5
request_B: Bcmd1-->Bcmd2-->Bcmd3-->Bcmd4-->Bcmd5
In a multi-process concurrent scenario, multiple crypto engines can be enabled,
and each crypto engine sends a request. If multiple requests enter
aspeed_hace_ahash_trigger() at the same time, the command flows can be
interleaved like this:
request_A, request_B: Acmd1-->Bcmd1-->Acmd2-->Acmd3-->Bcmd2-->Acmd4-->Bcmd3-->Bcmd4-->Acmd5-->Bcmd5

In this command flow, how does your hardware identify whether these commands
belong to request_A or request_B?
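
To make this concrete, here is a hypothetical interleaving expressed as
register writes (A_src, B_src, A_digest and A_cmd are placeholder values for
the two requests, assuming no serialization above this function):

	/* CPU A, request_A */
	ast_hace_write(hace_dev, A_src, ASPEED_HACE_HASH_SRC);
	/* CPU B, request_B, overwrites the same register */
	ast_hace_write(hace_dev, B_src, ASPEED_HACE_HASH_SRC);
	/* CPU A continues, unaware of the overwrite */
	ast_hace_write(hace_dev, A_digest, ASPEED_HACE_HASH_DIGEST_BUFF);
	ast_hace_write(hace_dev, A_cmd, ASPEED_HACE_HASH_CMD);
	/* -> the engine starts with B's source and A's digest buffer */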
Thanks.
Longfang.