Re: [PATCH 1/5] include/linux: Add instrumented.h infrastructure

From: Marco Elver
Date: Tue Jan 21 2020 - 11:14:22 EST


On Tue, 21 Jan 2020 at 14:01, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Mon, Jan 20, 2020 at 3:45 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > On Mon, Jan 20, 2020 at 3:19 PM Marco Elver <elver@xxxxxxxxxx> wrote:
> > >
> > > This adds instrumented.h, which provides generic wrappers for memory
> > > access instrumentation that the compiler cannot emit for various
> > > sanitizers. Currently this unifies KASAN and KCSAN instrumentation. In
> > > future this will also include KMSAN instrumentation.
> > >
> > > Note that copy_{to,from}_user require special instrumentation,
> > > providing hooks before and after the access, since we may need to know
> > > the actual bytes accessed (currently this is relevant for KCSAN, and is
> > > also relevant in future for KMSAN).
> > >
> > > Suggested-by: Arnd Bergmann <arnd@xxxxxxxx>
> > > Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> > > ---
> > > include/linux/instrumented.h | 153 +++++++++++++++++++++++++++++++++++
> > > 1 file changed, 153 insertions(+)
> > > create mode 100644 include/linux/instrumented.h
> > >
> > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h
> > > new file mode 100644
> > > index 000000000000..9f83c8520223
> > > --- /dev/null
> > > +++ b/include/linux/instrumented.h
> > > @@ -0,0 +1,153 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +
> > > +/*
> > > + * This header provides generic wrappers for memory access instrumentation that
> > > + * the compiler cannot emit for: KASAN, KCSAN.
> > > + */
> > > +#ifndef _LINUX_INSTRUMENTED_H
> > > +#define _LINUX_INSTRUMENTED_H
> > > +
> > > +#include <linux/compiler.h>
> > > +#include <linux/kasan-checks.h>
> > > +#include <linux/kcsan-checks.h>
> > > +#include <linux/types.h>
> > > +
> > > +/**
> > > + * instrument_read - instrument regular read access
> > > + *
> > > + * Instrument a regular read access. The instrumentation should be inserted
> > > + * before the actual read happens.
> > > + *
> > > + * @v address of access
> > > + * @size size of access
> > > + */
> >
> > Based on offline discussion, that's what we add for KMSAN:
> >
> > > +static __always_inline void instrument_read(const volatile void *v, size_t size)
> > > +{
> > > + kasan_check_read(v, size);
> > > + kcsan_check_read(v, size);
> >
> > KMSAN: nothing
> >
> > > +}
> > > +
> > > +/**
> > > + * instrument_write - instrument regular write access
> > > + *
> > > + * Instrument a regular write access. The instrumentation should be inserted
> > > + * before the actual write happens.
> > > + *
> > > + * @v address of access
> > > + * @size size of access
> > > + */
> > > +static __always_inline void instrument_write(const volatile void *v, size_t size)
> > > +{
> > > + kasan_check_write(v, size);
> > > + kcsan_check_write(v, size);
> >
> > KMSAN: nothing
> >
> > > +}
> > > +
> > > +/**
> > > + * instrument_atomic_read - instrument atomic read access
> > > + *
> > > + * Instrument an atomic read access. The instrumentation should be inserted
> > > + * before the actual read happens.
> > > + *
> > > + * @v address of access
> > > + * @size size of access
> > > + */
> > > +static __always_inline void instrument_atomic_read(const volatile void *v, size_t size)
> > > +{
> > > + kasan_check_read(v, size);
> > > + kcsan_check_atomic_read(v, size);
> >
> > KMSAN: nothing
> >
> > > +}
> > > +
> > > +/**
> > > + * instrument_atomic_write - instrument atomic write access
> > > + *
> > > + * Instrument an atomic write access. The instrumentation should be inserted
> > > + * before the actual write happens.
> > > + *
> > > + * @v address of access
> > > + * @size size of access
> > > + */
> > > +static __always_inline void instrument_atomic_write(const volatile void *v, size_t size)
> > > +{
> > > + kasan_check_write(v, size);
> > > + kcsan_check_atomic_write(v, size);
> >
> > KMSAN: nothing
> >
> > > +}
> > > +
> > > +/**
> > > + * instrument_copy_to_user_pre - instrument reads of copy_to_user
> > > + *
> > > + * Instrument reads from kernel memory, that are due to copy_to_user (and
> > > + * variants).
> > > + *
> > > + * The instrumentation must be inserted before the accesses. At this point the
> > > + * actual number of bytes accessed is not yet known.
> > > + *
> > > + * @src source address
> > > + * @size maximum access size
> > > + */
> > > +static __always_inline void
> > > +instrument_copy_to_user_pre(const volatile void *src, size_t size)
> > > +{
> > > + /* Check before, to warn before potential memory corruption. */
> > > + kasan_check_read(src, size);
> >
> > KMSAN: check that (src,size) is initialized
> >
> > > +}
> > > +
> > > +/**
> > > + * instrument_copy_to_user_post - instrument reads of copy_to_user
> > > + *
> > > + * Instrument reads from kernel memory, that are due to copy_to_user (and
> > > + * variants).
> > > + *
> > > + * The instrumentation must be inserted after the accesses. At this point the
> > > + * actual number of bytes accessed should be known.
> > > + *
> > > + * @src source address
> > > + * @size maximum access size
> > > + * @left number of bytes left that were not copied
> > > + */
> > > +static __always_inline void
> > > +instrument_copy_to_user_post(const volatile void *src, size_t size, size_t left)
> > > +{
> > > + /* Check after, to avoid false positive if memory was not accessed. */
> > > + kcsan_check_read(src, size - left);
> >
> > KMSAN: nothing
>
> One detail I noticed for KMSAN is that kmsan_copy_to_user has a
> special case when @to address is in kernel-space (compat syscalls
> doing tricky things), in that case it only copies metadata. We can't
> handle this with existing annotations.
>
>
> * actually copied to ensure there was no information leak. If @to belongs to
> * the kernel space (which is possible for compat syscalls), KMSAN just copies
> * the metadata.
> */
> void kmsan_copy_to_user(const void *to, const void *from,
>                         size_t to_copy, size_t left);

Sent v2: http://lkml.kernel.org/r/20200121160512.70887-1-elver@xxxxxxxxxx
I hope it'll satisfy our various constraints for now.

Thanks,
-- Marco
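
The pre/post split for copy_to_user in the quoted patch can be illustrated
with a small userspace sketch. This is not kernel code: the two check
functions are hypothetical stand-ins that only record the size they were
asked to check, and raw_copy() and demo_copy_to_user() are made-up helpers
that simulate a copy faulting part-way through. It only shows why the KASAN
check uses the maximum size up front, while the KCSAN check uses
size - left once the number of bytes actually copied is known.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the sanitizer checks: record the checked size. */
static size_t kasan_checked, kcsan_checked;

static void kasan_check_read(const volatile void *p, size_t size)
{
	(void)p;
	kasan_checked = size;
}

static void kcsan_check_read(const volatile void *p, size_t size)
{
	(void)p;
	kcsan_checked = size;
}

/* Before the copy: only the maximum size is known. */
static inline void
instrument_copy_to_user_pre(const volatile void *src, size_t size)
{
	kasan_check_read(src, size);
}

/* After the copy: 'left' bytes were not copied, so check only the rest. */
static inline void
instrument_copy_to_user_post(const volatile void *src, size_t size, size_t left)
{
	assert(left <= size);
	kcsan_check_read(src, size - left);
}

/*
 * Hypothetical raw copy that "faults" after 'limit' bytes.
 * Returns the number of bytes that were not copied.
 */
static size_t raw_copy(void *to, const void *from, size_t n, size_t limit)
{
	size_t done = n < limit ? n : limit;

	memcpy(to, from, done);
	return n - done;
}

/* Hypothetical wrapper showing where the two hooks sit around the copy. */
static size_t demo_copy_to_user(void *to, const void *from, size_t n,
				size_t limit)
{
	size_t left;

	instrument_copy_to_user_pre(from, n);
	left = raw_copy(to, from, n, limit);
	instrument_copy_to_user_post(from, n, left);
	return left;
}
```

With an 8-byte request that stops after 5 bytes, the pre hook checks all 8
bytes and the post hook checks only the 5 bytes that were read.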