RE: [PATCH] nvme: Change our APST table to be no more aggressive than Intel RSTe

From: Mario.Limonciello
Date: Mon May 15 2017 - 12:11:54 EST


> -----Original Message-----
> From: Andy Lutomirski [mailto:luto@xxxxxxxxxx]
> Sent: Saturday, May 13, 2017 7:28 AM
> To: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Jens Axboe <axboe@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; Kai-Heng Feng
> <kai.heng.feng@xxxxxxxxxxxxx>; linux-nvme <linux-nvme@xxxxxxxxxxxxxxxxxxx>;
> Christoph Hellwig <hch@xxxxxx>; Sagi Grimberg <sagi@xxxxxxxxxxx>; Keith Busch
> <keith.busch@xxxxxxxxx>; Limonciello, Mario <Mario_Limonciello@xxxxxxxx>
> Subject: Re: [PATCH] nvme: Change our APST table to be no more aggressive than
> Intel RSTe
>
> On Thu, May 11, 2017 at 9:06 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > It seems like RSTe is much more conservative with transition timing
> > than we are. According to Mario, RSTe programs APST to transition from
> > active states to the first idle state after 60ms and, thereafter, to
> > 1000 * the exit latency of the target state.
>
> Bad news, folks: this appears to be merely more stable, not all the way stable:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184/comments/65
>
> I maintain my hypothesis that no one ever validated these disks and
> that the very conservative parameters set by RSTe merely make it rare
> to trigger the bug. But maybe something else is going on.

This is really unfortunate to hear. I think the conservative parameters set
by Intel are still best though.
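
For anyone skimming the thread, the RSTe-style timing quoted above (60ms
before the first idle state, then 1000 * the exit latency of the target
state) can be sketched roughly as below. This is only an illustration, not
the driver's code: the function name, the constants' names, and the
assumption that exit latency is given in microseconds are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants taken from the description in this thread. */
#define FIRST_IDLE_DELAY_MS      60u    /* active -> first idle state */
#define EXIT_LATENCY_MULTIPLIER  1000u  /* idle -> deeper idle state  */

/*
 * Hypothetical helper (not a kernel function): how long the drive must be
 * idle, in milliseconds, before APST moves it into the target state.
 * Assumes target_exit_latency_us is the target state's exit latency in
 * microseconds.
 */
static uint64_t idle_timeout_ms(int source_is_active,
                                uint64_t target_exit_latency_us)
{
    uint64_t timeout_us;

    if (source_is_active)
        return FIRST_IDLE_DELAY_MS;

    /* Guard against overflow before multiplying. */
    assert(target_exit_latency_us <= UINT64_MAX / EXIT_LATENCY_MULTIPLIER);
    timeout_us = EXIT_LATENCY_MULTIPLIER * target_exit_latency_us;

    /* Convert to milliseconds, rounding up. */
    return timeout_us / 1000u + (timeout_us % 1000u != 0);
}
```

With a made-up exit latency of 2000us for the target state, this gives an
idle timeout of 2000ms between idle states, and 60ms when coming from an
active state regardless of the target's latency.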

I've been talking to folks about this. There have been mentions of a possible
signal integrity issue specifically on the quirked Dell systems and how it
relates to this. The current (working) theory is that when the drive is in PS4
and is supposed to transition back, crosstalk causes problems with the link
negotiation, and the transition fails.
So there are two possible ways I see to approach solving this (from the Linux side):

1) Keep quirking those systems from going into PS4.
This isn't ideal, as the jump to PS4 gets you the most power savings, but of course
a stable system > power savings.

2) Quirk those systems to redo link negotiation a few times if it fails.
I don't know if this is actually possible. Where is link negotiation invoked?
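
To make option 1 concrete, here is a minimal sketch of what "don't go into
the deepest state on quirked systems" amounts to. The flag name and helper
are hypothetical and only for illustration; it assumes power states are
numbered from 0 and that npss is the zero-based index of the deepest one.

```c
#include <assert.h>

/* Hypothetical quirk bit, named here only for the sake of the example. */
#define QUIRK_SKIP_DEEPEST_PS (1u << 0)

/*
 * Hypothetical helper: returns the deepest power state APST is allowed to
 * target. On quirked systems the deepest state (PS4 on a drive that
 * reports PS0..PS4) is excluded, so the drive stops one state short.
 */
static int deepest_allowed_state(int npss, unsigned int quirks)
{
    assert(npss >= 0);

    if ((quirks & QUIRK_SKIP_DEEPEST_PS) && npss > 0)
        return npss - 1;

    return npss;
}
```

For a drive reporting PS0..PS4 (npss = 4), a quirked system would stop at
PS3 and an unquirked one would still be allowed to reach PS4.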

If our partners come up with a way to solve this from the drive firmware, though,
I'll let this group know.

Thanks,