Re: [PATCH] poll: allow f_op->poll to sleep, take#5

From: Davide Libenzi
Date: Wed Nov 26 2008 - 01:28:14 EST


On Wed, 26 Nov 2008, Tejun Heo wrote:

> +static int pollwake(wait_queue_t *wait, unsigned mode, int sync, void *key)
> +{
> + struct poll_wqueues *pwq = wait->private;
> + DECLARE_WAITQUEUE(dummy_wait, pwq->polling_task);
> +
> + /*
> + * Wake up functions have full barrier semantics, no need for
> + * barrier here.
> + */
> + pwq->triggered = 1;
> +
> + /*
> + * Perform the default wake up operation using a dummy
> + * waitqueue.
> + *
> + * TODO: This is hacky but there currently is no interface to
> + * pass in @sync. @sync is scheduled to be removed and once
> + * that happens, wake_up_process() can be used directly.
> + */
> + return default_wake_function(&dummy_wait, mode, sync, key);
> +}
> +int poll_schedule_timeout(struct poll_wqueues *pwq, int state,
> + ktime_t *expires, unsigned long slack)
> +{
> + int rc = -EINTR;
> +
> + set_current_state(state);
> + if (!pwq->triggered)
> + rc = schedule_hrtimeout_range(expires, slack, HRTIMER_MODE_ABS);
> + __set_current_state(TASK_RUNNING);
> +
> + /*
> + * Prepare for the next iteration. ->poll() might not have
> + * enough barrier semantics from the second round as waits are
> + * registered only during the first one. Use set_mb().
> + */
> + set_mb(pwq->triggered, 0);
> +
> + return rc;
> +}
> +EXPORT_SYMBOL(poll_schedule_timeout);

Look, pollwake() does:

w1) WR triggered (1)
w2) WMB
w3) WR task->state (RUNNING)

While poll_schedule_timeout() does:

s1) WR task->state (TASK_INTERRUPTIBLE)
s2) MB
s3) RD triggered
s4) IF0 => RD task->state (if !RUNNING -> sleep)


The only risk is that w3 precedes s1, so that we go to sleep even though a
wakeup has been issued. But if w3 is visible, w1 is visible too, which
means that 'triggered' is visible in s3 (there's an MB in s2). So we skip
the schedule_hrtimeout_range(). So IMO you need no barriers on 'triggered'.
If you feel you need barriers, do you mind explaining a sequence of events
that makes a barrier-free version break?
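
To make the interleaving concrete, here is a minimal userspace sketch of
the two sequences, not the kernel code. Everything in it is an assumption
made for illustration: struct model and the model_*() / replay_*() names are
hypothetical, a C11 release fence stands in for WMB, and a seq_cst fence
stands in for MB. It replays the "w3 before s1" ordering on a single
thread, so it only illustrates the logic of the argument and proves
nothing about ordering on real hardware.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel task states. */
enum { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1 };

/* Hypothetical model of the two shared words in the discussion. */
struct model {
	atomic_int state;	/* models task->state */
	atomic_int triggered;	/* models pwq->triggered */
};

static void model_init(struct model *m)
{
	atomic_init(&m->state, TASK_RUNNING);
	atomic_init(&m->triggered, 0);
}

/* Waker side: steps w1..w3. */
static void model_pollwake(struct model *m)
{
	/* w1) WR triggered (1) */
	atomic_store_explicit(&m->triggered, 1, memory_order_relaxed);
	/* w2) WMB, approximated here by a release fence */
	atomic_thread_fence(memory_order_release);
	/* w3) WR task->state (RUNNING) */
	atomic_store_explicit(&m->state, TASK_RUNNING, memory_order_relaxed);
}

/* Sleeper side: steps s1..s4. Returns true if it would go to sleep. */
static bool model_would_sleep(struct model *m)
{
	/* s1) WR task->state (TASK_INTERRUPTIBLE) */
	atomic_store_explicit(&m->state, TASK_INTERRUPTIBLE,
			      memory_order_relaxed);
	/* s2) MB, approximated here by a seq_cst fence */
	atomic_thread_fence(memory_order_seq_cst);
	/* s3) RD triggered: a pending wakeup skips the sleep */
	if (atomic_load_explicit(&m->triggered, memory_order_relaxed))
		return false;
	/* s4) RD task->state (if !RUNNING -> sleep) */
	return atomic_load_explicit(&m->state, memory_order_relaxed)
		!= TASK_RUNNING;
}

/* Replays the risky ordering: the whole waker runs before s1. */
static void replay_wake_first(void)
{
	struct model m;

	model_init(&m);
	model_pollwake(&m);
	/* 'triggered' is already set, so the sleeper must not sleep. */
	assert(!model_would_sleep(&m));
}
```

With no wakeup at all, model_would_sleep() returns true, which is the
ordinary case where the task blocks until the timeout or a later wakeup.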



- Davide

