On Fri, 2022-05-20 at 15:36 +0000, Chuck Lever III wrote:
> > > On May 11, 2022, at 10:36 AM, Chuck Lever III
> > > <chuck.lever@xxxxxxxxxx> wrote:
> > >
> > > > On May 11, 2022, at 10:23 AM, Greg KH
> > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > On Wed, May 11, 2022 at 02:16:19PM +0000, Chuck Lever III wrote:
> > > > >
> > > > > > On May 11, 2022, at 8:38 AM, Greg KH
> > > > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > > On Wed, May 11, 2022 at 12:03:13PM +0200, Wolfgang Walter wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > starting with 5.4.188 we see a massive performance regression
> > > > > > > > on our nfs-server. It basically is serving requests very very
> > > > > > > > slowly, with cpu utilization of 100% (with 5.4.187 and earlier
> > > > > > > > it is 10%), so that it is unusable as a fileserver.
> > > > > > > >
> > > > > > > > The culprits are these commits (or one of them):
> > > > > > > >
> > > > > > > > c32f1041382a88b17da5736886da4a492353a1bb "nfsd: cleanup
> > > > > > > > nfsd_file_lru_dispose()"
> > > > > > > > 628adfa21815f74c04724abc85847f24b5dd1645 "nfsd: Containerise
> > > > > > > > filecache laundrette"
> > > > > > > >
> > > > > > > > (upstream 36ebbdb96b694dd9c6b25ad98f2bbd263d022b63 and
> > > > > > > > 9542e6a643fc69d528dfb3303f145719c61d3050)
> > > > > > > >
> > > > > > > > If I revert them in v5.4.192 the kernel works as before and
> > > > > > > > performance is ok again.
> > > > > > > >
> > > > > > > > I did not try to revert them one by one, as any disruption of
> > > > > > > > our nfs-server is a severe problem for us and I'm not sure if
> > > > > > > > they are related.
> > > > > > > >
> > > > > > > > 5.10 and 5.15 both always performed very badly on our
> > > > > > > > nfs-server in a similar way, so we were stuck with 5.4. I now
> > > > > > > > think this is because of
> > > > > > > > 36ebbdb96b694dd9c6b25ad98f2bbd263d022b63 and/or
> > > > > > > > 9542e6a643fc69d528dfb3303f145719c61d3050, though I haven't
> > > > > > > > tried to revert them in 5.15 yet.
> > > > > > >
> > > > > > > Odds are 5.18-rc6 is also a problem?
> > > > > >
> > > > > > We believe that 6b8a94332ee4 ("nfsd: Fix a write performance
> > > > > > regression") addresses the performance regression. It was merged
> > > > > > into 5.18-rc.
> > > > > >
> > > > > > I welcome anyone who wants to apply that commit to their
> > > > > > favorite LTS kernel and test it for us.
> > > > >
> > > > > I don't have a lot of time to backport this one myself, so [...]
> > > > >
> > > > > And into 5.17.4 if someone wants to try that release.
> > > >
> > > > Unfortunately I've received a recent report that the fix introduces
> > > > a "sleep while spinlock is held" for NFSv4.0 in rare cases.
> > >
> > > Ick, not good, any potential fixes for that? If so, I'll just wait
> > > for the fix to get into Linus's tree, as this does not seem to be a
> > > stable-tree-only issue.
> >
> > Not yet. I was at LSF last week, so I've just started digging
> > into this one. I've confirmed that the report is a real bug,
> > but we still don't know how hard it is to hit it with real
> > workloads.
>
> We believe the following, which should be part of the first
> NFSD pull request for 5.19, will properly address the splat:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=for-next&id=556082f5e5d7ecfd0ee45c3641e2b364bff9ee44

Uh... What happens if you have 2 simultaneous calls to
nfsd4_release_lockowner() for the same file? i.e. 2 separate processes
owned by the same user, both locking the same file.
Can't that cause the 'putlist' to get corrupted when both callers add
the same nf->nf_putfile to two separate lists?
--
Chuck Lever