Re: [PATCH 2/3] arm64: dts: qcom: sa8155: Add gear and rate limit properties to UFS

From: Nitin Rawat
Date: Mon Aug 11 2025 - 17:46:15 EST




On 8/9/2025 4:43 PM, 'Manivannan Sadhasivam' wrote:
On Sat, Aug 09, 2025 at 06:30:29AM GMT, Alim Akhtar wrote:

[...]

I understand that this is a static configuration, where it is already known
that the board is broken for higher gears.
Can this be achieved by limiting the clock? If not, can we add a
board-specific _quirk_ and let the _quirk_ be enabled from
vendor-specific hooks?


How can we limit the clock without limiting the gears? When
we limit the gear/mode, both clock and power are implicitly
limited.

Possibly someone needs to check with the designer of the SoC whether
that is possible or not.

It's not just the clock. We also need to consider reducing the regulator
and interconnect votes. But as I said above, limiting the
gear/mode will take care of all these parameters.

Did we already try a _quirk_? If not, why not?
If the board is so poorly designed that it can't handle the
channel losses, heat dissipation, etc., then I would assume the gear
negotiation between host and device should fail for the higher
gear, and the driver can have retry logic to re-init / retry the
"power mode change" at the lower gear. Is that not possible / feasible?


I don't see why we need to add extra logic in the UFS driver if
we can extract that information from DT.

You didn’t answer my question entirely. I am still not able to
visualise how link-up is happening at a higher gear and then
suddenly failing, so that we need to reduce the gear to solve it?

Oh well, this is the source of confusion here. Neither I nor the
patch claimed that the link up will happen at the higher speed. It will
most likely fail if it can't operate at the higher speed, and
that's why we need to limit it to a lower gear/mode *before* bringing the
link up.

Right, that's why retry logic to negotiate a __working__ power mode
change can help, instead of introducing a new binding for this case.

Retry logic is already in place in the ufshcd core, but with this kind of signal
integrity issue, we cannot guarantee that it will gracefully fail and then we
could retry. The link up *may* succeed, then it could blow up later also
(when doing heavy I/O operations etc...). So with this non-deterministic
behavior, we cannot rely on this logic.

I would imagine that in that case the PHY tuning / programming is not proper.

I don't have the insight into the PHY tuning to avoid this issue. Maybe Nitin or
Ram can comment here. But PHY tuning is mostly SoC specific in the PHY driver.
We don't have a board-level tuning sequence, AFAIK.

Hi Alim and Mani,

Here's my take:

There can be multiple reasons for limiting the gear/rate on a customer board beyond PHY tuning issues:

1. Board-level signal integrity concerns
2. Channel or reference clock configuration issues
3. Customer board layout not meeting layout design guidelines

This becomes especially critical in automotive platforms like the SA8155, as mentioned by Ram. In such safety-critical applications, customers prioritize reliability over peak performance, so they are generally comfortable operating at lower gears if stability is ensured.

In the current case, the customer hit issue #1 (board-level signal integrity) on their end, though I don't have the complete details.

As Mani pointed out, issues are more likely to surface under stress conditions rather than during link startup. Therefore, IMHO if any limitations are known, it's advisable to restrict the gear/rate during initialization to avoid potential problems later.

Moreover, introducing quirks for such cases isn’t very effective, as it requires specifying the exact gear/rate to be limited, which can vary significantly across different targets.
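
To illustrate limiting the gear/rate during initialization, here is a minimal sketch in plain C. It is not the ufs-qcom driver code: the struct names, the function name and the rate constants are all hypothetical, and the limits are assumed to have been read from DT earlier. It only shows the idea that per-board limits are applied to the requested parameters before the link is brought up, so each target can carry its own values.

```c
#include <stdint.h>

/* Hypothetical rate encoding for this sketch only. */
#define UFS_HS_RATE_A 1u
#define UFS_HS_RATE_B 2u

/* Parameters the host would like to use for the link (hypothetical). */
struct ufs_link_params {
	uint32_t hs_gear;	/* requested HS gear, e.g. 1..4 */
	uint32_t hs_rate;	/* UFS_HS_RATE_A or UFS_HS_RATE_B */
};

/* Per-board limits, assumed to come from DT; 0 means "no limit". */
struct ufs_board_limits {
	uint32_t max_hs_gear;
	uint32_t max_hs_rate;
};

/*
 * Clamp the requested parameters to the board limits. This is meant to
 * run before link-up, so the link never operates above the limit.
 */
struct ufs_link_params
ufs_clamp_link_params(struct ufs_link_params want,
		      const struct ufs_board_limits *lim)
{
	if (lim->max_hs_gear && want.hs_gear > lim->max_hs_gear)
		want.hs_gear = lim->max_hs_gear;
	if (lim->max_hs_rate && want.hs_rate > lim->max_hs_rate)
		want.hs_rate = lim->max_hs_rate;
	return want;
}
```

With limits of gear 3 / rate A, a request for gear 4 / rate B comes back as gear 3 / rate A; with no limits set, the request is returned unchanged.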

Regards,
Nitin



And that approach can be useful for many platforms.

Other platforms could also reuse the same DT properties to work around
similar issues.

Anyway coming back with the same point again and again is not productive.
I gave my opinion and suggestions. Rest is on the maintainers.

Suggestions are always welcome. It is important to have comments to try
out different things instead of sticking to the proposed solution. But in my
opinion, the retry logic is not reliable in this case. Moreover, we do have
similar properties for other peripherals like PCIe and MMC, where vendors
use DT properties to limit the speed to work around board issues.
So we are not doing anything insane here.

If there are better solutions than what is proposed here, we would indeed
like to hear.

For that, more _technical_ things need to be discussed (e.g. is it the PHY that has the problem, or is the problem happening at the UniPro level or somewhere else?).
I didn't see any technical backing from the patch author/submitter
(I assume the author should know a bit more in depth than what we are assuming and discussing here).


Nitin/Ram, please share more details on the level at which the customer is
facing the issue.

- Mani