Re: [PATCH v3 3/4] iommu/arm-smmu: Add support to handle Qcom's wait-for-safe logic

From: Bjorn Andersson
Date: Fri Jun 14 2019 - 00:13:11 EST


On Wed 12 Jun 00:15 PDT 2019, Vivek Gautam wrote:

> Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic
> to address under-performance issues in real-time clients, such as
> Display, and Camera.
> On receiving an invalidation requests, the SMMU forwards SAFE request
> to these clients and waits for SAFE ack signal from real-time clients.
> The SAFE signal from such clients is used to qualify the start of
> invalidation.
> This logic is controlled by chicken bits, one for each - MDP (display),
> IFE0, and IFE1 (camera), that can be accessed only from secure software
> on sdm845.
>
> This configuration, however, degrades the performance of non-real time
> clients, such as USB, and UFS etc. This happens because, with wait-for-safe
> logic enabled the hardware tries to throttle non-real time clients while
> waiting for SAFE ack signals from real-time clients.
>
> On MTP sdm845 devices, with wait-for-safe logic enabled at the boot time
> by the bootloaders we see degraded performance of USB and UFS when kernel
> enables the smmu stage-1 translations for these clients.
> Turn off this wait-for-safe logic from the kernel gets us back the perf
> of USB and UFS devices until we re-visit this when we start seeing perf
> issues on display/camera on upstream supported SDM845 platforms.
>
> Now, different bootloaders with their access control policies allow this
> register access differently through secure monitor calls -
> 1) With one we can issue io-read/write secure monitor call (qcom-scm)
> to update the register, while,
> 2) With other, such as one on MTP sdm845 we should use the specific
> qcom-scm command to send request to do the complete register
> configuration.
> Adding a separate device tree flag for arm-smmu to identify which
> firmware configuration of the two mentioned above we use.
> Not adding code change to allow type-(1) bootloaders to toggle the
> safe using io-read/write qcom-scm call.
>
> This change is inspired by the downstream change from Patrick Daly
> to address performance issues with display and camera by handling
> this wait-for-safe within separte io-pagetable ops to do TLB
> maintenance. So a big thanks to him for the change.
>
> Without this change the UFS reads are pretty slow:
> $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
> 10+0 records in
> 10+0 records out
> 10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
> real 0m 22.39s
> user 0m 0.00s
> sys 0m 0.01s
>
> With this change they are back to rock!
> $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
> 300+0 records in
> 300+0 records out
> 314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
> real 0m 1.03s
> user 0m 0.00s
> sys 0m 0.54s
>
> Signed-off-by: Vivek Gautam <vivek.gautam@xxxxxxxxxxxxxx>
> ---
> drivers/iommu/arm-smmu.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 0ad086da399c..3c3ad43eda97 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -39,6 +39,7 @@
> #include <linux/pci.h>
> #include <linux/platform_device.h>
> #include <linux/pm_runtime.h>
> +#include <linux/qcom_scm.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
>
> @@ -177,6 +178,7 @@ struct arm_smmu_device {
> u32 features;
>
> #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> +#define ARM_SMMU_OPT_QCOM_FW_IMPL_SAFE_ERRATA (1 << 1)
> u32 options;
> enum arm_smmu_arch_version version;
> enum arm_smmu_implementation model;
> @@ -262,6 +264,7 @@ static bool using_legacy_binding, using_generic_binding;
>
> static struct arm_smmu_option_prop arm_smmu_options[] = {
> { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
> + { ARM_SMMU_OPT_QCOM_FW_IMPL_SAFE_ERRATA, "qcom,smmu-500-fw-impl-safe-errata" },

This should be added to the DT binding as well.

> { 0, NULL},
> };
>
> @@ -2292,6 +2295,19 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
> arm_smmu_device_reset(smmu);
> arm_smmu_test_smr_masks(smmu);
>
> + /*
> + * To address performance degradation in non-real time clients,
> + * such as USB and UFS, turn off wait-for-safe on sdm845 platforms,
> + * such as MTP, whose firmwares implement corresponding secure monitor
> + * call handlers.
> + */
> + if (of_device_is_compatible(smmu->dev->of_node, "qcom,sdm845-smmu-500") &&
> + smmu->options & ARM_SMMU_OPT_QCOM_FW_IMPL_SAFE_ERRATA) {
> + err = qcom_scm_qsmmu500_wait_safe_toggle(0);
> + if (err)
> + dev_warn(dev, "Failed to turn off SAFE logic\n");
> + }
> +

This looks good, I presume at some point we can profile things and
review if it's worth toggling this on the fly, but given that this is
conditioned on smmu->options that should be an implementation detail..

Reviewed-by: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>

Regards,
Bjorn

> /*
> * We want to avoid touching dev->power.lock in fastpaths unless
> * it's really going to do something useful - pm_runtime_enabled()
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
>