Re: [PATCH 0/4] nolibc: add support for the s390 platform

From: Paul E. McKenney
Date: Tue Jan 10 2023 - 16:30:44 EST


On Tue, Jan 10, 2023 at 06:53:47PM +0100, Willy Tarreau wrote:
> On Tue, Jan 10, 2023 at 08:32:10AM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 10, 2023 at 05:12:49PM +0100, Willy Tarreau wrote:
> > > On Tue, Jan 10, 2023 at 06:53:34AM -0800, Paul E. McKenney wrote:
> > > > Here is one of them, based on both the fixes and Sven's s390 support.
> > > > Please let me know if you need any other combination.
> > >
> > > Thanks, here's the problem:
> > >
> > > > 0 getpid = 1 [OK]
> > > > 1 getppid = 0 [OK]
> > > > 3 gettid = 1 [OK]
> > > > 5 getpgid_self = 0 [OK]
> > > > 6 getpgid_bad = -1 ESRCH [OK]
> > > > 7 kill_0[ 1.940442] tsc: Refined TSC clocksource calibration: 2399.981 MHz
> > > > [ 1.942334] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x229825a5278, max_idle_ns: 440795306804 ns
> > > > = 0 [OK]
> > > > 8 kill_CONT = 0 [ 1.944987] clocksource: Switched to clocksource tsc
> > > > [OK]
> > > > 9 kill_BADPID = -1 ESRCH [OK]
> > > (...)
> > >
> > > It's clear that "grep -c ^[0-9].*OK" will not count all of them (2 are
> > > indeed missing).
> > >
> > > We could probably start with "quiet" but that would be against the
> > > principle of using this to troubleshoot issues. I think we should just
> > > stick to the current search for "FAIL": as long as a success is
> > > reported and the number of successes is within the expected range,
> > > that could be OK. At least I guess :-/
> >
> > Huh. Would it make sense to delay the start of the nolibc testing by a
> > few seconds in order to avoid this sort of thing? Or would that cause
> > other problems?
>
> That would be quite annoying. Delaying is never long enough for some
> issues, and too long for the majority of cases where there is no issue.
> I'd suggest that we just rely on the fail count for now (as it is); that
> will allow us to collect a larger variety of discrepancies and probably
> figure out a better solution at some point. For example if we find that
> it's always the TSC that does this, maybe starting x86 with notsc will be
> a good fix.

Sounds good to me!
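
For reference, here is a rough sketch of the counting problem discussed
above, using the console excerpt quoted earlier in this thread as input.
It is only an illustration, not what the test scripts actually do: the
function names and the min_ok/max_ok bounds are made up for the example.
It compares the line-anchored count (the equivalent of
"grep -c ^[0-9].*OK") with a count of every "[OK]" marker, and shows a
pass check that relies on the FAIL count plus a success count within an
expected range.

```python
import re

# Console excerpt copied from the log quoted earlier in this thread, with
# kernel messages interleaved into the test results.
SAMPLE_LOG = """\
0 getpid = 1 [OK]
1 getppid = 0 [OK]
3 gettid = 1 [OK]
5 getpgid_self = 0 [OK]
6 getpgid_bad = -1 ESRCH [OK]
7 kill_0[ 1.940442] tsc: Refined TSC clocksource calibration: 2399.981 MHz
[ 1.942334] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x229825a5278, max_idle_ns: 440795306804 ns
= 0 [OK]
8 kill_CONT = 0 [ 1.944987] clocksource: Switched to clocksource tsc
[OK]
9 kill_BADPID = -1 ESRCH [OK]
"""


def count_ok_strict(log):
    """Like grep -c '^[0-9].*OK': lines that start with a digit and contain OK."""
    return sum(1 for line in log.splitlines() if re.search(r"^[0-9].*OK", line))


def count_ok_markers(log):
    """Count every "[OK]" marker, wherever it ends up on a line."""
    return len(re.findall(r"\[OK\]", log))


def count_fail(log):
    """Like grep -c FAIL: number of lines containing FAIL."""
    return sum(1 for line in log.splitlines() if "FAIL" in line)


def run_passed(log, min_ok, max_ok):
    """Hypothetical criterion: no FAIL, and the strict success count in range."""
    return count_fail(log) == 0 and min_ok <= count_ok_strict(log) <= max_ok


if __name__ == "__main__":
    print("strict OK lines:", count_ok_strict(SAMPLE_LOG))
    print("[OK] markers:", count_ok_markers(SAMPLE_LOG))
    print("FAIL lines:", count_fail(SAMPLE_LOG))
    print("passed:", run_passed(SAMPLE_LOG, 6, 8))
```

On this excerpt the strict count is 6 and the marker count is 8, which
matches the two results reported missing. Counting markers is not
bulletproof either, since a kernel message could presumably land in the
middle of "[OK]" itself, so the FAIL count remains the primary signal.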

Thanx, Paul