Re: futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op

From: Michael Kerrisk (man-pages)
Date: Thu Jun 23 2016 - 06:53:08 EST


On 06/23/2016 09:18 AM, Thomas Gleixner wrote:
On Wed, 22 Jun 2016, Darren Hart wrote:
However, I don't think the patch below is correct. The existing logic
determines the type of timeout based on the futex_op when it should instead
determine the type of timeout based on the FUTEX_CLOCK_REALTIME flag.

No.

My reading of the man page is that FUTEX_WAIT_BITSET abides by the timeout
interpretation defined by the FUTEX_CLOCK_REALTIME attribute, so
SYSCALL_DEFINE6 was misbehaving for FUTEX_WAIT|FUTEX_CLOCK_REALTIME (where the
timeout should have been treated as absolute) as well as for
FUTEX_WAIT_BITSET|FUTEX_CLOCK_MONOTONIC (where the timeout should have been
treated as relative).

Consider the following:

diff --git a/kernel/futex.c b/kernel/futex.c
index 33664f7..fa2af29 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3230,7 +3230,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
return -EINVAL;

t = timespec_to_ktime(ts);
- if (cmd == FUTEX_WAIT)
+ if (!(cmd & FUTEX_CLOCK_REALTIME))
t = ktime_add_safe(ktime_get(), t);

That breaks LOCK_PI, REQUEUE_PI and FUTEX_WAIT_BITSET

The concern for me is whether the code is incorrect, or if the man page is
incorrect. Does existing userspace code expect the FUTEX_WAIT_BITSET op to
always use an absolute timeout, regardless of the CLOCK used?

FUTEX_WAIT_BITSET, LOCK_PI and REQUEUE_PI always expect absolute time in
CLOCK_REALTIME independent of the CLOCK_REALTIME flag.

Once upon a time, you told me the following:

On 15 May 2014 at 16:14, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
And that universe would love to have your documentation of
FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),

I give you almost the full treatment, but I leave REQUEUE_PI to Darren
and FUTEX_WAKE_OP to Jakub. :)
[...]
FUTEX_CLOCK_REALTIME

This option bit can be ored on the futex ops FUTEX_WAIT_BITSET
and FUTEX_WAIT_REQUEUE_PI

If set the kernel treats the user space supplied timeout as
absolute time based on CLOCK_REALTIME.

If not set the kernel treats the user space supplied timeout
as relative time.

Unfortunately, I should have checked the code more carefully...

Looking more carefully at the code, I see understand the situation
is the following:

FUTEX_LOCK_PI
Always uses CLOCK_REALTIME
'timeout' is absolute

FUTEX_WAIT_REQUEUE_PI
Choice of clock (CLOCK_REALTIME vs CLOCK_MONOTONIC) is
determined by presence or absence of
FUTEX_CLOCK_REALTIME flag
'timeout' is absolute

FUTEX_WAIT_BITSET
Choice of clock (CLOCK_REALTIME vs CLOCK_MONOTONIC) is
determined by presence or absence of
FUTEX_CLOCK_REALTIME flag
'timeout' is absolute

FUTEX_WAIT
Choice of clock (CLOCK_REALTIME vs CLOCK_MONOTONIC) is
determined by presence or absence of
FUTEX_CLOCK_REALTIME flag
'timeout' is relative

Right?

I've amended the man page to describe those details.

The flag was explicitely added to allow FUTEX_WAIT to hand in absolute time.

When you say that the "flag was added", which flag do you mean? Or, did you mean:
"applying Matthieu's patch will allow FUTEX_WAIT to hand in absolute time".

diff --git a/kernel/futex.c b/kernel/futex.c
index 33664f7..4bee915 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3230,7 +3230,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
return -EINVAL;

t = timespec_to_ktime(ts);
- if (cmd == FUTEX_WAIT)
+ if (cmd == FUTEX_WAIT && !(op & FUTEX_CLOCK_REALTIME))
t = ktime_add_safe(ktime_get(), t);
tp = &t;
}

So this patch is correct and if the man page is unclear about it then we need
to fix that.

So, my fixes to the man page just now are at
http://git.kernel.org/cgit/docs/man-pages/man-pages.git/commit/?id=8064bfa5369c6856f606004d02e48ab275e05bed

If Matthieu's patch is applied, obviously a further fix will
be needed needed in the description of FUTEX_WAIT.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/