Re: Debugging Thinkpad T430s occasional suspend failure.

From: Linus Torvalds
Date: Wed Feb 13 2013 - 14:56:56 EST

Next message: smakarov: "systemtap release 2.1"
Previous message: Paolo Bonzini: "Re: [PATCH] x86: Lock down MSR writing in secure boot"
In reply to: Dave Jones: "Re: Debugging Thinkpad T430s occasional suspend failure."
Next in thread: Dave Jones: "Re: Debugging Thinkpad T430s occasional suspend failure."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Feb 13, 2013 at 11:34 AM, Dave Jones <davej@xxxxxxxxxx> wrote:
>
> My test was a loop of 100 suspend/resume cycles before calling something
> 'good'. The 'bad' cases all failed within 10 cycles (usually 2-3).

Considering that you apparently already found one case where the BIOS
crapped out due to effectively unrelated timing details (ie timing
triggered a temperature issue that then triggered behavioral changes),
I wonder if your more occasional problem might not be a sign of
something similar.

But since you seem to be able to automate it well, maybe one thing to
try is to change the timing a bit while testing. Maybe some failures
were hidden by the timing just happening to work out.
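
One way to vary the timing could look roughly like the sketch below. It is only an illustration: the actual test harness is not shown in this thread, the function names are made up, and suspending through "rtcwake -m mem" is an assumption about how the cycles are driven. Each cycle waits a different, randomly chosen time before suspending, so a failure that depends on timing has more chances to show up.

```python
import random
import subprocess
import time


def jittered_delays(cycles, base=5.0, spread=4.0, seed=None):
    """Return one awake-time delay (seconds) per cycle, each in [base, base + spread]."""
    rng = random.Random(seed)  # pass a seed to replay the same sequence of delays
    return [base + rng.uniform(0.0, spread) for _ in range(cycles)]


def rtcwake_suspend(wake_after=20):
    """Assumed suspend method: suspend to RAM, wake via the RTC alarm (needs root)."""
    subprocess.run(["rtcwake", "-m", "mem", "-s", str(wake_after)], check=True)


def run_cycles(cycles=100, suspend=rtcwake_suspend, sleep=time.sleep, **jitter):
    """Run suspend/resume cycles with a different pause before each one.

    Returns the number of cycles completed. A failing suspend command
    raises CalledProcessError; a hard hang simply never returns.
    """
    completed = 0
    for delay in jittered_delays(cycles, **jitter):
        sleep(delay)  # vary how long the machine stays awake between cycles
        suspend()
        completed += 1
    return completed
```

Calling run_cycles(cycles=100, base=2.0, spread=30.0, seed=1) would repeat the 100-cycle test with pauses between 2 and 32 seconds, and reusing the seed replays the same sequence of pauses on the next kernel.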

Also, as I suspect you're aware: since the "bad" markings for "git
bisect" are presumably reliable (with the caveat that you need to
worry about the exact symptoms and not mix it up with some other
independent bug, as you already found out), you can usually speed up
repeated bisects by using the last bad information from the previous
bisect.

Note that only one "bad" commit ever matters - every commit you test
while bisecting is by definition reachable from the previous bad one,
so once a tested commit also turns out to contain the bug, it makes all
the earlier bad commits uninteresting. So you just need to look at the
last bad commit, not the whole set of bad commits. When re-doing the
bisect, if you trust that your bad kernels really were bad and had the
*right* badness, you can just start with "git bisect bad
<last-bad-commit>".

(good commits, on the other hand, are independent of each other: "not
containing the bug" is not some kind of exclusivity test, so finding
one good kernel doesn't make the information about other good kernels
irrelevant)
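
The asymmetry between bad and good marks can be checked on a toy history. The sketch below is a simplified model of the reasoning, not how git implements bisection, and the commit names and graph are invented for illustration. A commit stays a suspect only if it is reachable from every known-bad commit and from no known-good commit.

```python
def ancestors(parents, commit):
    """Return commit plus everything reachable from it by following parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def candidates(parents, bads, goods):
    """Commits that could still be the first bad one.

    A candidate must be reachable from every known-bad commit and
    must not be reachable from any known-good commit.
    """
    suspect = set.intersection(*(ancestors(parents, b) for b in bads))
    for g in goods:
        suspect -= ancestors(parents, g)
    return suspect


# Hypothetical history: A <- B, then branches C and D, merged in E, then F, G.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"],
           "E": ["C", "D"], "F": ["E"], "G": ["F"]}

# Earlier bad marks (G, F) add nothing once E is known to be bad.
assert candidates(parents, ["G", "F", "E"], ["C"]) == candidates(parents, ["E"], ["C"])

# Good marks are independent: C and D each rule out different commits.
assert candidates(parents, ["E"], ["C"]) == {"D", "E"}
assert candidates(parents, ["E"], ["D"]) == {"C", "E"}
assert candidates(parents, ["E"], ["C", "D"]) == {"E"}
```

In a real re-run this corresponds to giving "git bisect bad" only the last bad commit while passing every known-good commit to "git bisect good".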

Linus