Re: Q: why didn't GCC warn about this uninitialized variable? (was: Re: [PATCH] perf tests: initialize sa.sa_flags)

From: Ingo Molnar
Date: Thu Mar 03 2016 - 08:24:43 EST



* Jakub Jelinek <jakub@xxxxxxxxxx> wrote:

> On Thu, Mar 03, 2016 at 01:19:44PM +0100, Ingo Molnar wrote:
> > struct sigaction sa;
> >
> > ...
> >
> > sigfillset(&sa.sa_mask);
> > sa.sa_sigaction = segfault_handler;
> > sigaction(SIGSEGV, &sa, NULL);
> >
> > ... which uninitialized sa.sa_flags field GCC merrily accepted as proper C code,
> > despite us turning on essentially _all_ GCC warnings for the perf build that exist
> > under the sun:
>
> GCC -W{,maybe-}uninitialized warning works only on SSA form, so in order to
> get that warning, either it needs to be some scalar that is uninitialized,
> or Scalar Replacement of Aggregates needs to be able to turn the structure
> into independent scalars. Neither is the case here, as you take address of
> the struct when passing its address to sigaction, and that call can't be
> inlined nor is in any other way visible to the compiler, so that it could
> optimize both the caller and sigaction itself at the same time.
>
> Even if GCC added infrastructure for tracking which bits/bytes in
> aggregates are or might be uninitialized at which point, generally,
> struct XYZ abc;
> foo (&abc);
> is so common pattern that warning here that the fields are uninitialized
> would have extremely high false positive ratio.

So at least for the kernel, people rely on external tools that do something like
this anyway, and which are essentially annotated manually that duplicates much of
the effort it would take to make a simple GCC solution work.

So in the aggregate, we already have this overhead _anyway_, except that:

- some of the best tools are closed (so the techniques never enter the free
software world)

- the fashion we get the feedback is per tool and inefficient

- that there's also an inevitable lag between code added upstream and tools
finding uninitialized variables bugs.

So it's all highly inefficient and fragile.

There's also another cost, the cost of finding the bugs themselves - for example
here's a recent upstream kernel fix:

commit e01d8718de4170373cd7fbf5cf6f9cb61cebb1e9
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Wed Jan 27 23:24:29 2016 +0100

perf/x86: Fix uninitialized value usage

When calling intel_alt_er() with .idx != EXTRA_REG_RSP_* we will not
initialize alt_idx and then use this uninitialized value to index an
array.

When that is not fatal, it can result in an infinite loop in its
caller __intel_shared_reg_get_constraints(), with IRQs disabled.

Alternative error modes are random memory corruption due to the
cpuc->shared_regs->regs[] array overrun, which manifest in either
get_constraints or put_constraints doing weird stuff.

Only took 6 hours of painful debugging to find this. Neither GCC nor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Smatch warnings flagged this bug.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1960,7 +1960,8 @@ intel_bts_constraints(struct perf_event *event)

static int intel_alt_er(int idx, u64 config)
{
- int alt_idx;
+ int alt_idx = idx;
+
if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
return idx;

6 hours of PeterZ time translates to quite a bit of code restructuring overhead to
eliminate false positive warnings...

So it would scale far better if we could do this kind of checking and annotation
in the kernel code, C module by C module, interface by interface. We could also
push the detection to the stage where such bugs are introduced: when new code is
written - which scales a lot better than the current method of a handful of people
looking at static analysis tools.

If GCC could warn in some really simplistic fashion (accepting tons of false
positives), I'd definitely try to wade through the warnings, eliminate them step
by step and make it all work for a couple of key subsystems I maintain. Most
on-stack structures in the kernel are small, so there's very little reason to be
overly clever with not initializing them.

Thanks,

Ingo