Re: [RFC PATCH] UML: add support for KASAN under x86_64

From: Patricia Alfonso
Date: Thu Feb 06 2020 - 13:33:24 EST


On Fri, Jan 17, 2020 at 2:05 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Fri, Jan 17, 2020 at 11:03 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > On Fri, Jan 17, 2020 at 10:59 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Jan 16, 2020 at 10:39 PM Patricia Alfonso
> > > <trishalfonso@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Jan 16, 2020 at 1:23 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Jan 16, 2020 at 10:20 AM Johannes Berg
> > > > > <johannes@xxxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, 2020-01-16 at 10:18 +0100, Dmitry Vyukov wrote:
> > > > > > >
> > > > > > > This should resolve the problem with constructors (after they
> > > > > > > initialize KASAN, they can proceed to do anything they need) and it
> > > > > > > should get rid of most KASAN_SANITIZE (in particular, all of
> > > > > > > lib/Makefile and kernel/Makefile) and should fix stack instrumentation
> > > > > > > (in case it does not work now). The only tiny bit we should not
> > > > > > > instrument is the path from constructor up to mmap call.
> > > >

By initializing KASAN as the first thing that executes, I have been
able to get rid of most of the "KASAN_SANITIZE := n" lines and I am
very happy about that. Thanks for the suggestions!
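
For anyone following along, the constructor variant suggested above can
be sketched in plain userspace C. This is only an illustration under
assumptions, not the code from the patch: the demo_* names, the 1 MiB
size, and constructor priority 101 are all hypothetical. A real
implementation would need the mapping at the fixed address that the
compiler-generated checks expect; this sketch lets the kernel choose
one.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical size of the demo shadow region (1 MiB). */
#define DEMO_SHADOW_SIZE (1UL << 20)

static void *demo_shadow_start;
static int demo_shadow_ready;

/*
 * Runs before main(). Priority 101 is the earliest priority available
 * to user code (0-100 are reserved for the implementation). The
 * function itself is excluded from ASan instrumentation, because the
 * shadow does not exist yet while it runs.
 */
__attribute__((constructor(101), no_sanitize_address))
static void demo_shadow_init(void)
{
	void *p = mmap(NULL, DEMO_SHADOW_SIZE, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

	if (p == MAP_FAILED)
		return;

	demo_shadow_start = p;
	demo_shadow_ready = 1;
}

/* Lets later code check that the shadow was mapped. */
int demo_shadow_is_ready(void)
{
	return demo_shadow_ready;
}
```

Only the short path from the constructor to the mmap() call has to stay
uninstrumented; everything that runs afterwards can be instrumented.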

> > > If that part of the code I mentioned is instrumented, manifestation
> > > would be different -- stack instrumentation will try to access shadow,
> > > shadow is not mapped yet, so it would crash on the shadow access.
> > >
> > > What you are seeing looks like, well, a kernel bug where it does a bad
> > > stack access. Maybe it's KASAN actually _working_? :)
> >
> > Though, stack instrumentation may have issues with longjmp-like things.
> > I would suggest first turning off stack instrumentation and getting
> > that to work. Solving problems one-by-one is always easier.
> > If you need help debugging this, please post more info: patch, what
> > you are doing, full kernel output (preferably from start, if it's not
> > too lengthy).
>
> I see syscall_stub_data does some weird things with stack (stack
> copy?). Maybe we just need to ignore accesses there: individual
> accesses, or whole function/file.

It is still not clear whether the syscall_stub_data errors are false
positives, but while moving kasan_init() to run as early as possible
in main(), I ran into a few more stack-related errors like this one
(in show_stack, dump_trace, and get_wchan). I will take your advice to
focus on one thing at a time and temporarily disable stack
instrumentation wherever possible.
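
As a rough illustration of the "ignore a whole function" option
mentioned above, this is what excluding one function from
instrumentation looks like at the compiler level. The function name is
hypothetical and this is not taken from the patch. In the kernel tree
the same attribute is spelled __no_sanitize_address, and whole files
are excluded with "KASAN_SANITIZE_<file>.o := n".

```c
#include <stddef.h>

/*
 * Hypothetical helper: copies n bytes without compiler-inserted ASan
 * checks on the loads and stores in this function. The attribute has
 * no effect when the file is built without -fsanitize=address.
 */
__attribute__((no_sanitize_address))
void demo_copy_uninstrumented(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	/* Plain byte loop; behaves like a non-overlapping memcpy(). */
	for (i = 0; i < n; i++)
		d[i] = s[i];
}
```

Note that this hides real bugs in the annotated function as well as
false positives, so it is best kept to the smallest scope possible.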

--
Patricia Alfonso