Re: [PATCH v3 2/6] exit: Put an upper limit on how often we can oops

From: Seth Jenkins
Date: Thu Jan 19 2023 - 15:19:43 EST


> Do you have a plan to backport this into upstream LTS kernels?

As I understand, the answer is "hopefully yes" with the big
presumption that all stakeholders are on board for the change. There
is *definitely* a plan to *submit* backports to the stable trees, but
ofc it will require some approvals.


On Thu, Jan 19, 2023 at 3:10 PM SeongJae Park <sj@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Thu, 17 Nov 2022 15:43:22 -0800 Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> > From: Jann Horn <jannh@xxxxxxxxxx>
> >
> > Many Linux systems are configured to not panic on oops; but allowing an
> > attacker to oops the system **really** often can make even bugs that look
> > completely unexploitable exploitable (like NULL dereferences and such) if
> > each crash elevates a refcount by one or a lock is taken in read mode, and
> > this causes a counter to eventually overflow.
> >
> > The most interesting counters for this are 32 bits wide (like open-coded
> > refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
> > platforms is just 16 bits, but probably nobody cares about 32-bit platforms
> > that much nowadays.)
> >
> > So let's panic the system if the kernel is constantly oopsing.
> >
> > The speed of oopsing 2^32 times probably depends on several factors, like
> > how long the stack trace is and which unwinder you're using; an empirically
> > important one is whether your console is showing a graphical environment or
> > a text console that oopses will be printed to.
> > In a quick single-threaded benchmark, it looks like oopsing in a vfork()
> > child with a very short stack trace only takes ~510 microseconds per run
> > when a graphical console is active; but switching to a text console that
> > oopses are printed to slows it down around 87x, to ~45 milliseconds per
> > run.
> > (Adding more threads makes this faster, but the actual oops printing
> > happens under &die_lock on x86, so you can maybe speed this up by a factor
> > of around 2 and then any further improvement gets eaten up by lock
> > contention.)
> >
> > It looks like it would take around 8-12 days to overflow a 32-bit counter
> > with repeated oopsing on a multi-core X86 system running a graphical
> > environment; both me (in an X86 VM) and Seth (with a distro kernel on
> > normal hardware in a standard configuration) got numbers in that ballpark.
> >
> > 12 days aren't *that* short on a desktop system, and you'd likely need much
> > longer on a typical server system (assuming that people don't run graphical
> > desktop environments on their servers), and this is a *very* noisy and
> > violent approach to exploiting the kernel; and it also seems to take orders
> > of magnitude longer on some machines, probably because stuff like EFI
> > pstore will slow it down a ton if that's active.
>
> I found a blog article[1] recommending LTS kernels to backport this as below.
>
> While this patch is already upstream, it is important that distributed
> kernels also inherit this oops limit and backport it to LTS releases if we
> want to avoid treating such null-dereference bugs as full-fledged security
> issues in the future.
>
> Do you have a plan to backport this into upstream LTS kernels?
>
> [1] https://googleprojectzero.blogspot.com/2023/01/exploiting-null-dereferences-in-linux.html
>
>
> Thanks,
> SJ
>
> >
> > Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
> > Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@xxxxxxxxxx
> > Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>