Re: TSC x86 fixes for LTS kernel 4.9.x

From: Greg KH
Date: Wed Dec 13 2017 - 04:57:58 EST


On Wed, Dec 13, 2017 at 11:45:20AM +0200, Dan Aloni wrote:
> On Wed, Dec 13, 2017 at 10:03:35AM +0100, Greg KH wrote:
> > On Wed, Dec 13, 2017 at 10:33:52AM +0200, Dan Aloni wrote:
> > > Hi all,
> > >
> > > I've tested the following changes, belonging to merge commit f7dd3b1734e,
> > > on top of 4.9.68 after a very easy backport from 4.10, and I think it
> > > may be worthwhile adding them to 4.9.x:
> > >
> [..]
> >
> > I need git commit ids to be able to do anything :)
>
> Sure, how about:
>
> # git log 8c9b9d87b855 --oneline -n 19 --reverse --pretty="%h # %s" | awk -F" " '{print "git cherry-pick -x " $0}'
>
> git cherry-pick -x 47c95a46d0fa # x86/tsc: Add X86_FEATURE_TSC_KNOWN_FREQ flag
> git cherry-pick -x 4ca4df0b7eb0 # x86/tsc: Mark TSC frequency determined by CPUID as known
> git cherry-pick -x 4635fdc696a8 # x86/tsc: Mark Intel ATOM_GOLDMONT TSC reliable
> git cherry-pick -x f3a02ecebed7 # x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCs
> git cherry-pick -x 984fecebda3b # x86/tsc: Finalize the split of the TSC_RELIABLE flag
> git cherry-pick -x 7b3d2f6e08ed # x86/tsc: Use X86_FEATURE_TSC_ADJUST in detect_art()
> git cherry-pick -x bec8520dca0d # x86/tsc: Detect random warps
> git cherry-pick -x 8b223bc7abe0 # x86/tsc: Store and check TSC ADJUST MSR
> git cherry-pick -x 1d0095feea59 # x86/tsc: Verify TSC_ADJUST from idle
> git cherry-pick -x a36f5136814b # x86/tsc: Sync test only for the first cpu in a package
> git cherry-pick -x 4c5e3c637521 # x86/tsc: Move sync cleanup to a safe place
> git cherry-pick -x 76d3b8515850 # x86/tsc: Prepare warp test for TSC adjustment
> git cherry-pick -x cc4db26899dc # x86/tsc: Try to adjust TSC if sync test fails
> git cherry-pick -x b836554386cc # x86/tsc: Fix broken CONFIG_X86_TSC=n build
> git cherry-pick -x 31f8a651fc57 # x86/tsc: Validate cpumask pointer before accessing it
> git cherry-pick -x 6a369583178d # x86/tsc: Validate TSC_ADJUST after resume
> git cherry-pick -x 5bae156241e0 # x86/tsc: Force TSC_ADJUST register to value >= zero
> git cherry-pick -x 16588f659257 # x86/tsc: Annotate printouts as firmware bug
> git cherry-pick -x 8c9b9d87b855 # x86/tsc: Limit the adjust value further
>
> There's a conflict in only one small place in the first few patches.

That's a lot of changes to be backported. I'm _really_ hesitant to do
this, unless the maintainer of the code agrees it is ok...

> > > These changes precisely fix an issue I am having with a relatively new
> > > 8-core Intel(R) Core(TM) i7-7820X with an updated ASUS BIOS (December 2017).
> > >
> > > Under v4.9.68, the kernel falls back to HPET as its clocksource, which
> > > just doesn't work - there is over a 200ms time drift that does not go
> > > away even after repeated ntpdate sync attempts.
> > >
> > > For further testing I've posted a branch for these changes here:
> > >
> > > https://github.com/kernelim/linux tsc-fix-for-4.9.x
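
The clocksource fallback described above can be confirmed from user space
before and after applying the backport. The following is only a minimal
sketch: the helper name clocksource_report and its optional directory
argument are hypothetical and exist only for illustration, and it assumes
the usual sysfs location for clocksource0.

```shell
#!/bin/sh
# Sketch: print the clocksource the kernel selected and the ones it offers.
# clocksource_report is a hypothetical helper, not part of any kernel tool.
# The optional first argument overrides the sysfs directory (assumed default).
clocksource_report() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"

    if [ ! -r "$dir/current_clocksource" ]; then
        echo "clocksource: no readable entry under $dir" >&2
        return 1
    fi

    current=$(cat "$dir/current_clocksource")
    available=$(cat "$dir/available_clocksource" 2>/dev/null)
    echo "current=$current available=$available"

    # Return 0 only when the TSC is in use; 2 signals a fallback (e.g. hpet).
    case "$current" in
        tsc) return 0 ;;
        *)   echo "clocksource: kernel is not using tsc" >&2
             return 2 ;;
    esac
}
```

Calling `clocksource_report || dmesg | grep -i -e tsc -e clocksource` would
then print the kernel's TSC messages only when the TSC is not the active
clocksource.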
> >
> > Why not just use 4.14 instead? That's much easier than trying to use an
> > old kernel like 4.9, right?
>
> Yes, however the mileage of 4.9.x seems somewhat more appealing.

Why? 4.14 should be much better, it's newer, has more hardware support,
more bugs fixed, and more new things left to debug :)

> I'll give 4.14.x a try mostly to see whether it solves hard locks that
> I've seen with 4.13.x (all Fedora-based stable kernels) on three of my
> machines -- an unrelated issue, and the main reason why I gave one of
> the LTS branches a try.

You really should report that. Without that, odds are it will not be
fixed.

thanks,

greg k-h