Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

From: Thomas Gleixner
Date: Tue Jul 23 2019 - 18:59:28 EST


On Tue, 23 Jul 2019, Kees Cook wrote:
> On Mon, Jul 22, 2019 at 04:47:36PM -0700, Andy Lutomirski wrote:
> > I don't love this whole concept, but I also don't have a better idea.
>
> How about we revert the vDSO change? :P

Sigh. Add more special case code to the VDSO again?

> I keep coming back to using the vDSO return address as an indicator.
> Most vDSO calls don't make syscalls, yes? So they're normally
> unfilterable by seccomp.
>
> What was the prior vDSO behavior?

The behaviour is pretty much the same as before:

If the requested clock id is supported by the VDSO and the platform has a
VDSO-capable clocksource, everything is handled in user space.

If either of the conditions is false, fall back to a syscall.
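
A minimal sketch of that decision, for illustration only. All names
here (vdso_supports_clock(), vdso_clock_gettime_path(), the enum) are
made up and are not the kernel's identifiers, and the set of supported
clock ids is an assumption chosen to keep the example short:

```c
#include <stdbool.h>

/* Hypothetical result type: where the request ends up being served. */
enum vdso_path { PATH_USERSPACE, PATH_SYSCALL };

/*
 * Assumption for this sketch: only clock ids 0 and 1 are handled by
 * the VDSO. The real set of supported clocks is larger.
 */
static bool vdso_supports_clock(int clock_id)
{
	return clock_id == 0 || clock_id == 1;
}

/*
 * Both conditions must hold to stay in user space. If either one is
 * false, the call falls back to a syscall.
 */
static enum vdso_path vdso_clock_gettime_path(int clock_id,
					      bool clocksource_vdso_capable)
{
	if (vdso_supports_clock(clock_id) && clocksource_vdso_capable)
		return PATH_USERSPACE;

	return PATH_SYSCALL;
}
```

Only the PATH_SYSCALL case is visible to seccomp; the user space path
never enters the kernel.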

The implementation detail changed for 32bit (native and compat):

  - The original VDSO used sys_clock_gettime() as the fallback; the new
    one uses sys_clock_gettime64().

The reason is that we need to support a 2038-safe vdso_clock_gettime64()
for 32bit, which requires using sys_clock_gettime64() as the fallback. So
we use the same fallback for the non-2038-safe vdso_clock_gettime() variant
as well, to avoid having different implementations of the fallback code.
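
Roughly, the shared fallback looks like the sketch below. This is a
simplified user space model, not the VDSO source: struct ts64 and
struct ts32 stand in for the 64bit and legacy 32bit timespec layouts,
fallback_gettime64() is a stub standing in for the
sys_clock_gettime64() fallback, and the values it returns are
placeholders.

```c
#include <stdint.h>

/* Stand-ins for the 64bit and the legacy 32bit timespec layouts. */
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };

/*
 * Stub for the single 64bit syscall fallback. Returns 0 on success
 * and -1 for an invalid clock id; the time values are placeholders.
 */
static int fallback_gettime64(int clock_id, struct ts64 *ts)
{
	if (clock_id < 0)
		return -1;

	ts->tv_sec = 1000;
	ts->tv_nsec = 500;
	return 0;
}

/* 2038-safe variant: hands the 64bit result straight back. */
static int vdso_gettime64(int clock_id, struct ts64 *ts)
{
	return fallback_gettime64(clock_id, ts);
}

/*
 * Legacy variant: uses the same 64bit fallback and narrows the result
 * on success, so there is only one fallback path to maintain.
 */
static int vdso_gettime32(int clock_id, struct ts32 *res)
{
	struct ts64 ts;
	int ret = fallback_gettime64(clock_id, &ts);

	if (!ret) {
		res->tv_sec = (int32_t)ts.tv_sec;
		res->tv_nsec = (int32_t)ts.tv_nsec;
	}
	return ret;
}
```

The point of the sketch is that the legacy entry point issues the 64bit
syscall as well, which is why a filter that only knows about
sys_clock_gettime() now sees an unexpected syscall number.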

And as we have sys_clock_gettime64() exposed for 32bit anyway, you need to
deal with that in seccomp independently of the VDSO. It does not make sense
to treat sys_clock_gettime() differently from sys_clock_gettime64(). They
both expose the same information, but the latter is y2038 safe.

So changing the VDSO back to the original fallback for 32bit (native and
compat) is just a temporary bandaid, as seccomp needs to deal with the
y2038 safe variant anyway.
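
For illustration, a seccomp filter fragment that treats both variants
alike could look like the sketch below. It is not taken from any
existing filter. The syscall numbers are the i386 ones (265 and 403),
hardcoded so the fragment also builds on other architectures; the
architecture check a real filter needs is left out for brevity, and
run_clock_filter() is a toy evaluator added only so the fragment can be
exercised without installing it.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* i386 syscall numbers, hardcoded for the sketch. */
#define I386_NR_clock_gettime	265
#define I386_NR_clock_gettime64	403

/*
 * Allow both clock_gettime variants, fail everything else with EPERM.
 * A real filter must check seccomp_data.arch first; omitted here.
 */
static const struct sock_filter clock_filter[] = {
	/* Load the syscall number into the accumulator. */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
	/* Either variant jumps forward to the ALLOW return. */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, I386_NR_clock_gettime, 2, 0),
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, I386_NR_clock_gettime64, 1, 0),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/*
 * Toy evaluator covering only the three opcodes used above. It is not
 * a general BPF interpreter.
 */
static unsigned int run_clock_filter(const struct seccomp_data *sd)
{
	unsigned int acc = 0;
	size_t pc;

	for (pc = 0; pc < sizeof(clock_filter) / sizeof(clock_filter[0]); pc++) {
		const struct sock_filter *f = &clock_filter[pc];

		switch (f->code) {
		case BPF_LD | BPF_W | BPF_ABS:
			memcpy(&acc, (const char *)sd + f->k, sizeof(acc));
			break;
		case BPF_JMP | BPF_JEQ | BPF_K:
			/* Relative jump; the loop increment adds the +1. */
			pc += (acc == f->k) ? f->jt : f->jf;
			break;
		case BPF_RET | BPF_K:
			return f->k;
		}
	}
	return SECCOMP_RET_KILL;
}
```

With that, both 265 and 403 evaluate to SECCOMP_RET_ALLOW, and any
other syscall number gets SECCOMP_RET_ERRNO with EPERM.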

Thanks,

tglx