Re: [PATCH v3 2/2] lkdtm: Add Shadow Call Stack tests

From: Dan Li
Date: Fri Mar 04 2022 - 09:34:45 EST




On 3/3/22 10:42, Kees Cook wrote:
On Wed, Mar 02, 2022 at 11:43:39PM -0800, Dan Li wrote:
Add tests for SCS (Shadow Call Stack) based
backward CFI (as implemented by Clang and GCC).

Cool; thanks for writing these!

+lkdtm-$(CONFIG_LKDTM) += scs.o

I'd expect these to be in cfi.c, rather than making a new source file.


Got it.

+static noinline void lkdtm_scs_clear_lr(void)
+{
+ unsigned long *lr = (unsigned long *)__builtin_frame_address(0) + 1;
+
+ asm volatile("str xzr, [%0]\n\t" : : "r"(lr) : "x30");

Is the asm needed here? Why not:

unsigned long *lr = (unsigned long *)__builtin_frame_address(0) + 1;

*lr = 0;


Yeah, with "volatile", this one looks better.
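
For illustration only, a minimal userspace sketch of the volatile variant. It does not touch a real stack frame: struct fake_frame is a hypothetical two-slot stand-in for the arm64 frame record (saved x29 at index 0, saved x30 at index 1), and clear_saved_lr is an assumed name. The only point is that a store through a volatile-qualified pointer has to be emitted even though nothing reads the slot afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an arm64 frame record: [0] = saved fp, [1] = saved lr. */
struct fake_frame {
	uintptr_t slot[2];
};

/*
 * Clear the saved-lr slot through a volatile-qualified pointer, so the
 * compiler must emit the store even if it looks dead.
 */
static void clear_saved_lr(struct fake_frame *frame)
{
	volatile uintptr_t *lr;

	assert(frame);
	/* Same "+ 1" indexing as the frame-address expression in the patch. */
	lr = &frame->slot[0] + 1;
	*lr = 0;
}
```

In the real test the pointer would come from __builtin_frame_address(0) rather than from a caller-supplied structure.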

+
+/*
+ * This tries to call a function protected by Shadow Call Stack,
+ * which corrupts its own return address during execution.
+ * Due to the protection, the corruption will not take effect
+ * when the function returns.
+ */
+void lkdtm_CFI_BACKWARD_SHADOW(void)

I think these two tests should be collapsed into a single one.


It seems that selftests/lkdtm/run.sh currently has no cross-line
matching. Suppose we put these two into one function and assume we
could make noscs_set_lr _survivable_ (as in your example).

Then we could only match "CFI_BACKWARD_SHADOW ok: scs takes effect."
in tests.txt

But if the test result is:
XPASS: Unexpectedly survived lr corruption without scs?
ok: scs takes effect.

That may not be a real pass: it may only mean that the xxx_set_lr
function doesn't work.
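
To make the concern concrete, here is a small sketch in C. It is a hypothetical model of a checker that looks for one expected string anywhere in the captured output, not the actual run.sh logic; log_has_expected, good_log and broken_log are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model of a single-string checker: report success if the
 * expected text appears anywhere in the log.
 */
static bool log_has_expected(const char *log, const char *expected)
{
	assert(log && expected);
	return strstr(log, expected) != NULL;
}

/* Output when both halves of the combined test behave as intended. */
static const char good_log[] =
	"ok: lr corruption redirected without scs.\n"
	"ok: scs takes effect.\n";

/* Output when the lr write did nothing, even in the __noscs helper. */
static const char broken_log[] =
	"XPASS: Unexpectedly survived lr corruption without scs?!\n"
	"ok: scs takes effect.\n";
```

Both logs contain "ok: scs takes effect.", so a match on that string alone cannot tell the two cases apart.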

+{
+#ifdef CONFIG_ARM64
+ if (!IS_ENABLED(CONFIG_SHADOW_CALL_STACK)) {
+ pr_err("FAIL: kernel not built with CONFIG_SHADOW_CALL_STACK\n");
+ return;
+ }
+
+ pr_info("Trying to corrupt lr in a function with scs protection ...\n");
+ lkdtm_scs_clear_lr();
+
+ pr_err("ok: scs takes effect.\n");
+#else
+ pr_err("XFAIL: this test is arm64-only\n");
+#endif

This is slightly surprising -- we have no detection when a function has
its non-shadow-stack return address corrupted: it just _ignores_ the
value stored there. That seems like a missed opportunity for warning
about an unexpected state.


Yes.
Actually, I once tried adding a check in the plugin before the function
returns, calling a callback when a mismatch is found. But since almost
every function has to be instrumented, the performance penalty
increased from <3% to ~20% (a rough calculation; it could still be optimized).
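
A rough userspace sketch of what such an epilogue check amounts to. This is not the plugin's code: the shadow stack is simulated with a plain array, and scs_enter, scs_exit_checked and scs_mismatch are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 16

static uintptr_t shadow_stack[SHADOW_DEPTH];	/* simulated shadow call stack */
static size_t shadow_top;
static unsigned int mismatch_count;		/* bumped by the callback below */

/* Hypothetical callback invoked when the two return addresses disagree. */
static void scs_mismatch(uintptr_t expected, uintptr_t found)
{
	(void)expected;
	(void)found;
	mismatch_count++;
}

/* Prologue: save the return address on the shadow stack. */
static void scs_enter(uintptr_t ret_addr)
{
	assert(shadow_top < SHADOW_DEPTH);
	shadow_stack[shadow_top++] = ret_addr;
}

/*
 * Epilogue with the extra check: compare the on-stack copy against the
 * shadow copy, report a mismatch, and always return the shadow copy.
 */
static uintptr_t scs_exit_checked(uintptr_t on_stack_ret)
{
	uintptr_t shadow_ret;

	assert(shadow_top > 0);
	shadow_ret = shadow_stack[--shadow_top];
	if (shadow_ret != on_stack_ret)
		scs_mismatch(shadow_ret, on_stack_ret);
	return shadow_ret;
}
```

The compare and the conditional call sit in the epilogue of every instrumented function, which is where the extra cost would come from.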

+}
+
+/*
+ * This tries to call a function not protected by Shadow Call Stack,
+ * which corrupts its own return address during execution.
+ */
+void lkdtm_CFI_BACKWARD_SHADOW_WITH_NOSCS(void)
+{
+#ifdef CONFIG_ARM64
+ if (!IS_ENABLED(CONFIG_SHADOW_CALL_STACK)) {
+ pr_err("FAIL: kernel not built with CONFIG_SHADOW_CALL_STACK\n");
+ return;

Other tests try to give some hints about failures, e.g.:

pr_err("FAIL: cannot change for SCS\n");
pr_expected_config(CONFIG_SHADOW_CALL_STACK);

Though, having the IS_ENABLED in there makes me wonder if this test
should instead be made _survivable_ on failure. Something like this,
completely untested:


#ifdef CONFIG_ARM64
static noinline void lkdtm_scs_set_lr(unsigned long *addr)
{
unsigned long **lr = (unsigned long **)__builtin_frame_address(0) + 1;
*lr = addr;
}

/* Function with __noscs attribute overwrites its return address. */
static noinline void __noscs lkdtm_noscs_set_lr(unsigned long *addr)
{
unsigned long **lr = (unsigned long **)__builtin_frame_address(0) + 1;
*lr = addr;
}
#endif


void lkdtm_CFI_BACKWARD_SHADOW(void)
{
#ifdef CONFIG_ARM64

/* Verify the "normal" condition of LR corruption working. */
do {
/* Keep label in scope to avoid compiler warning. */
if ((volatile int)0)
goto unexpected;

pr_info("Trying to corrupt lr in a function without scs protection ...\n");
lkdtm_noscs_set_lr(&&expected);

unexpected:
pr_err("XPASS: Unexpectedly survived lr corruption without scs?!\n");
break;

expected:
pr_err("ok: lr corruption redirected without scs.\n");
} while (0);


do {
/* Keep label in scope to avoid compiler warning. */
if ((volatile int)0)
goto good_scs;

pr_info("Trying to corrupt lr in a function with scs protection ...\n");
lkdtm_scs_set_lr(&&bad_scs);

good_scs:
pr_info("ok: scs takes effect.\n");
break;

bad_scs:
pr_err("FAIL: return address rewritten!\n");
pr_expected_config(CONFIG_SHADOW_CALL_STACK);
} while (0);
#else
pr_err("XFAIL: this test is arm64-only\n");
#endif
}


Thanks for the example, Kees :)
This code (with a little modification) works correctly with clang 12,
but to make sure it's always correct, I think we might need to add the
__attribute__((optnone)) attribute to it, because under -O2 the result
doesn't seem to be "very stable" (as in your example in the next email).

And we should, actually, be able to make the "set_lr" functions be
arch-specific, leaving the test itself arch-agnostic....


I'm not sure my understanding is correct: does this mean we should
remove the "#ifdef CONFIG_ARM64" in lkdtm_CFI_BACKWARD_SHADOW?

Then we may not be able to distinguish between failures caused by
an unsupported platform (XFAIL) and failures caused by the feature
not being enabled (or not working properly).
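
As a sketch of the distinction I mean (all names are hypothetical, and arch_has_set_lr is an assumed capability flag, not an existing kernel symbol), the arch-agnostic test would still need some input that tells it whether a set_lr helper exists at all:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome classes, mirroring the message prefixes used above. */
enum scs_outcome {
	SCS_OK,		/* "ok: scs takes effect." */
	SCS_FAIL,	/* "FAIL: return address rewritten!" */
	SCS_XFAIL,	/* "XFAIL: ..." on an unsupported platform */
};

/*
 * arch_has_set_lr: whether this architecture provides a set_lr helper.
 * redirected:      whether the corrupted return address was actually used.
 */
static enum scs_outcome scs_classify(bool arch_has_set_lr, bool redirected)
{
	if (!arch_has_set_lr)
		return SCS_XFAIL;
	return redirected ? SCS_FAIL : SCS_OK;
}

/* Map an outcome to the prefix the test would print. */
static const char *scs_prefix(enum scs_outcome outcome)
{
	switch (outcome) {
	case SCS_OK:
		return "ok";
	case SCS_FAIL:
		return "FAIL";
	case SCS_XFAIL:
		return "XFAIL";
	}
	assert(0 && "unknown outcome");
	return "";
}
```

Without the first argument, an unsupported platform and a working SCS would both end up as "not redirected".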

Thanks,
Dan.