Re: [RFC v4 08/17] kunit: test: add support for test abort

From: Knut Omang
Date: Fri Mar 22 2019 - 03:14:56 EST


On Thu, 2019-03-21 at 18:41 -0700, Brendan Higgins wrote:
> On Thu, Mar 21, 2019 at 6:10 PM Frank Rowand <frowand.list@xxxxxxxxx> wrote:
> > On 2/27/19 11:42 PM, Brendan Higgins wrote:
> > > On Tue, Feb 19, 2019 at 10:44 PM Frank Rowand <frowand.list@xxxxxxxxx>
> > > wrote:
> > > > On 2/19/19 7:39 PM, Brendan Higgins wrote:
> > > > > On Mon, Feb 18, 2019 at 11:52 AM Frank Rowand <frowand.list@xxxxxxxxx>
> > > > > wrote:
> > > > > > On 2/14/19 1:37 PM, Brendan Higgins wrote:
> > > > > > > Add support for aborting/bailing out of test cases. Needed for
> > > > > > > implementing assertions.
> > > > > > >
> > > > > > > Signed-off-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> > > > > > > ---
> > > > > > > Changes Since Last Version
> > > > > > > - This patch is new introducing a new cross-architecture way to
> > > > > > > abort
> > > > > > > out of a test case (needed for KUNIT_ASSERT_*, see next patch
> > > > > > > for
> > > > > > > details).
> > > > > > > - On a side note, this is not a complete replacement for the UML
> > > > > > > abort
> > > > > > > mechanism, but covers the majority of necessary functionality.
> > > > > > > UML
> > > > > > > architecture specific featurs have been dropped from the
> > > > > > > initial
> > > > > > > patchset.
> > > > > > > ---
> > > > > > > include/kunit/test.h | 24 +++++
> > > > > > > kunit/Makefile | 3 +-
> > > > > > > kunit/test-test.c | 127 ++++++++++++++++++++++++++
> > > > > > > kunit/test.c | 208
> > > > > > > +++++++++++++++++++++++++++++++++++++++++--
> > > > > > > 4 files changed, 353 insertions(+), 9 deletions(-)
> > > > > > > create mode 100644 kunit/test-test.c
> > > > > >
> > > > > > < snip >
> > > > > >
> > > > > > > diff --git a/kunit/test.c b/kunit/test.c
> > > > > > > index d18c50d5ed671..6e5244642ab07 100644
> > > > > > > --- a/kunit/test.c
> > > > > > > +++ b/kunit/test.c
> > > > > > > @@ -6,9 +6,9 @@
> > > > > > > * Author: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> > > > > > > */
> > > > > > >
> > > > > > > -#include <linux/sched.h>
> > > > > > > #include <linux/sched/debug.h>
> > > > > > > -#include <os.h>
> > > > > > > +#include <linux/completion.h>
> > > > > > > +#include <linux/kthread.h>
> > > > > > > #include <kunit/test.h>
> > > > > > >
> > > > > > > static bool kunit_get_success(struct kunit *test)
> > > > > > > @@ -32,6 +32,27 @@ static void kunit_set_success(struct kunit
> > > > > > > *test, bool success)
> > > > > > > spin_unlock_irqrestore(&test->lock, flags);
> > > > > > > }
> > > > > > >
> > > > > > > +static bool kunit_get_death_test(struct kunit *test)
> > > > > > > +{
> > > > > > > + unsigned long flags;
> > > > > > > + bool death_test;
> > > > > > > +
> > > > > > > + spin_lock_irqsave(&test->lock, flags);
> > > > > > > + death_test = test->death_test;
> > > > > > > + spin_unlock_irqrestore(&test->lock, flags);
> > > > > > > +
> > > > > > > + return death_test;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void kunit_set_death_test(struct kunit *test, bool
> > > > > > > death_test)
> > > > > > > +{
> > > > > > > + unsigned long flags;
> > > > > > > +
> > > > > > > + spin_lock_irqsave(&test->lock, flags);
> > > > > > > + test->death_test = death_test;
> > > > > > > + spin_unlock_irqrestore(&test->lock, flags);
> > > > > > > +}
> > > > > > > +
> > > > > > > static int kunit_vprintk_emit(const struct kunit *test,
> > > > > > > int level,
> > > > > > > const char *fmt,
> > > > > > > @@ -70,13 +91,29 @@ static void kunit_fail(struct kunit *test,
> > > > > > > struct kunit_stream *stream)
> > > > > > > stream->commit(stream);
> > > > > > > }
> > > > > > >
> > > > > > > +static void __noreturn kunit_abort(struct kunit *test)
> > > > > > > +{
> > > > > > > + kunit_set_death_test(test, true);
> > > > > > > +
> > > > > > > + test->try_catch.throw(&test->try_catch);
> > > > > > > +
> > > > > > > + /*
> > > > > > > + * Throw could not abort from test.
> > > > > > > + */
> > > > > > > + kunit_err(test, "Throw could not abort from test!");
> > > > > > > + show_stack(NULL, NULL);
> > > > > > > + BUG();
> > > > > >
> > > > > > kunit_abort() is what will be call as the result of an assert
> > > > > > failure.
> > > > >
> > > > > Yep. Does that need clarified somewhere.
> > > > > > BUG(), which is a panic, which is crashing the system is not
> > > > > > acceptable
> > > > > > in the Linux kernel. You will just annoy Linus if you submit this.
> > > > >
> > > > > Sorry, I thought this was an acceptable use case since, a) this should
> > > > > never be compiled in a production kernel, b) we are in a pretty bad,
> > > > > unpredictable state if we get here and keep going. I think you might
> > > > > have said elsewhere that you think "a" is not valid? In any case, I
> > > > > can replace this with a WARN, would that be acceptable?
> > > >
> > > > A WARN may or may not make sense, depending on the context. It may
> > > > be sufficient to simply report a test failure (as in the old version
> > > > of case (2) below.
> > > >
> > > > Answers to "a)" and "b)":
> > > >
> > > > a) it might be in a production kernel
> > >
> > > Sorry for a possibly stupid question, how might it be so? Why would
> > > someone intentionally build unit tests into a production kernel?
> >
> > People do things. Just expect it.
>
> Huh, alright. I will take your word for it then.

I have a better explanation: Production kernels have bugs, unfortunately.
And sometimes those need to be investigated on systems than cannot be
brought down or affected more than absolutely necessary, maybe via a third party
doing the execution. A light weight, precise test (well tested ahead :) ) might
be a way of proving or disproving assumptions that can lead to the development
and application of a fix.

IMHO you're confusing "building into" with temporary applying, then removing
again - like the difference between running a local user space program vs
installing it under /usr and have it in everyone's PATH.

> > > > a') it is not acceptable in my development kernel either

I think one of the fundamental properties of a good test framework is that it
should not require changes to the code under test by itself.

Knut

> > > Fair enough.
> > >
> > > > b) No. You don't crash a developer's kernel either unless it is
> > > > required to avoid data corruption.
> > > Alright, I thought that was one of those cases, but I am not going to
> > > push the point. Also, in case it wasn't clear, the path where BUG()
> > > gets called only happens if there is a bug in KUnit itself, not just
> > > because a test case fails catastrophically.
> >
> > Still not out of the woods. Still facing Lions and Tigers and Bears,
> > Oh my!
>
> Nope, I guess not :-)
>
> > So kunit_abort() is normally called as the result of an assert
> > failure (as written many lines further above).
> >
> > kunit_abort()
> > test->try_catch.throw(&test->try_catch)
> > // this is really kunit_generic_throw(), yes?
> > complete_and_exit()
> > if (comp)
> > // comp is test_case_completion?
> > complete(comp)
> > do_exit()
> > // void __noreturn do_exit(long code)
> > // depending on the task, either panic
> > // or the task dies
>
> You are right up until after it calls do_exit().
>
> KUnit actually spawns a thread for the test case to run in so that
> when exit is called, only the test case thread dies. The thread that
> started KUnit is never affected.
>
> > I did not read through enough of the code to understand what is going
> > on here. Is each kunit_module executed in a newly created thread?
> > And if kunit_abort() is called then that thread dies? Or something
> > else?
>
> Mostly right, each kunit_case (not kunit_module) gets executed in its
> own newly created thread. If kunit_abort() is called in that thread,
> the kunit_case thread dies. The parent thread keeps going, and other
> test cases are executed.
>
> >
> > > > b') And you can not do replacements like:
> > > >
> > > > (1) in of_unittest_check_tree_linkage()
> > > >
> > > > ----- old -----
> > > >
> > > > if (!of_root)
> > > > return;
> > > >
> > > > ----- new -----
> > > >
> > > > KUNIT_ASSERT_NOT_ERR_OR_NULL(test, of_root);
> > > >
> > > > (2) in of_unittest_property_string()
> > > >
> > > > ----- old -----
> > > >
> > > > /* of_property_read_string_index() tests */
> > > > rc = of_property_read_string_index(np, "string-property", 0,
> > > > strings);
> > > > unittest(rc == 0 && !strcmp(strings[0], "foobar"),
> > > > "of_property_read_string_index() failure; rc=%i\n", rc);
> > > >
> > > > ----- new -----
> > > >
> > > > /* of_property_read_string_index() tests */
> > > > rc = of_property_read_string_index(np, "string-property", 0,
> > > > strings);
> > > > KUNIT_ASSERT_EQ(test, rc, 0);
> > > > KUNIT_EXPECT_STREQ(test, strings[0], "foobar");
> > > >
> > > >
> > > > If a test fails, that is no reason to abort testing. The remainder of
> > > > the unit
> > > > tests can still run. There may be cascading failures, but that is ok.
> > >
> > > Sure, that's what I am trying to do. I don't see how (1) changes
> > > anything, a failed KUNIT_ASSERT_* only bails on the current test case,
> > > it does not quit the entire test suite let alone crash the kernel.
> >
> > This may be another case of whether a kunit_module is approximately a
> > single KUNIT_EXPECT_*() or a larger number of them.
> >
> > I still want, for example, of_unittest_property_string() to include a large
> > number of KUNIT_EXPECT_*() instances. In that case I still want the rest of
> > the tests in the kunit_module to be executed even after a KUNIT_ASSERT_*()
> > fails. The existing test code has that property.
>
> Sure, in the context of the reply you just sent me on the DT unittest
> thread, that makes sense. I can pull out all but the ones that would
> have terminated the collection of test cases (where you return early),
> if that makes it better.