RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

From: Nic Percival
Date: Tue May 05 2015 - 05:56:30 EST


Michael is correct.
Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on.
The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed.
This is communicated to the debugger via pseudo-ttys.
The state of this user screen is checked as part of this (and other) tests.

The actual test failure is that some text is not displayed on the debuggee user screen even though we know, because a certain breakpoint has been hit, that the text has been written.

What is worse is that it's non-deterministic. Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass.
We wouldn't really want to do that anyway - the test is just fine on earlier SUSE releases, on Red Hat (Intel and 390), HP-UX, AIX and Solaris.

Thanks,
Nic

-----Original Message-----
From: Michael Matz [mailto:matz@xxxxxxx]
Sent: 04 May 2015 13:24
To: Peter Hurley
Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

Hi,

On Fri, 1 May 2015, Peter Hurley wrote:

> I don't think this is a real bug, in the sense that pty i/o is not
> synchronous, in the same way that tty i/o is not synchronous.

Here's what I wrote internally when speculating on whether this is a bug or not:

> > I also never hit it with pipes (remove the USEPTY define), also not
> > on sle12, so it must be some change specific to the pty implementation.
> >
> > Now, all of this is of course unspecified. There are two
> > asynchronous processes involved, and a buffered tube between them.
> > Just because one process filled one end of the tube (the breakpoint
> > was hit) doesn't mean the contents have to appear at that instant at
> > the other end. So the change in behaviour in sle12 is not a genuine
> > bug. It _might_ be an unintended change, though, that's why kernel
> > people should comment on this. If there are no terribly good
> > reasons for this change I'd consider it a quality-of-implementation
> > regression in sle12.

So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team.
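
To make the asynchrony point concrete, here is a minimal sketch of a reader that does not assume data written to one end of a pty is instantly readable at the other end, but instead waits for it with a timeout. It assumes Python's standard os.openpty() and select.select(); the helper name read_until and the sample text are hypothetical and are not taken from the debugger or its testsuite.

```python
import os
import select
import time


def read_until(fd, expected, timeout=2.0):
    """Read from fd until `expected` (bytes) is seen or the timeout expires.

    Returns everything read so far, which may lack `expected` on timeout.
    """
    deadline = time.monotonic() + timeout
    data = b""
    while expected not in data:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # gave up waiting
        # Block until the fd is readable or the remaining time runs out.
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            break  # timed out with nothing to read
        try:
            chunk = os.read(fd, 4096)
        except OSError:
            break  # e.g. the other end of the pty was closed
        if not chunk:
            break  # end of file
        data += chunk
    return data


# Hypothetical stand-in for the debuggee writing to its side of the pty.
master, slave = os.openpty()
try:
    os.write(slave, b"breakpoint text\n")
    # The write has returned, but the reader still waits instead of
    # assuming the bytes are already visible on the master side.
    out = read_until(master, b"breakpoint text")
finally:
    os.close(slave)
    os.close(master)
```

A reader written this way tolerates a delay between the write and the data becoming visible. It does not change the observation that the older kernels made the data visible without such a wait.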

> However, that said, if this is a regression (regression as in "it
> broke something that used to work", not regression as in "this new
> thing I'm writing doesn't behave the way I want it to" :) )
>
> Help me understand the use-case here: are you using pty i/o to debug
> the debugger?

Nic is working on the COBOL debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged COBOL process and the debugger; that's just a theory, Nic is authoritative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch.

(FWIW: I do think it's a better QoI factor if something returns data from a tube when we can know via side channels (breakpoints) that something must have been written locally to the other end of the tube, if that can be ensured without too much other work.)


Ciao,
Michael.

