Re: [PATCH 2/2] printk: use the lockless ringbuffer

From: John Ogness
Date: Thu Feb 13 2020 - 04:42:42 EST


On 2020-02-13, Sergey Senozhatsky <sergey.senozhatsky.work@xxxxxxxxx> wrote:
>> - while (user->seq == log_next_seq) {
>> + if (!prb_read_valid(prb, user->seq, r)) {
>> if (file->f_flags & O_NONBLOCK) {
>> ret = -EAGAIN;
>> logbuf_unlock_irq();
>> @@ -890,30 +758,26 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
>>
>> logbuf_unlock_irq();
>> ret = wait_event_interruptible(log_wait,
>> - user->seq != log_next_seq);
>> + prb_read_valid(prb, user->seq, r));
>> if (ret)
>> goto out;
>> logbuf_lock_irq();
>> }
>>
>> - if (user->seq < log_first_seq) {
>> - /* our last seen message is gone, return error and reset */
>> - user->idx = log_first_idx;
>> - user->seq = log_first_seq;
>> + if (user->seq < r->info->seq) {
>> + /* the expected message is gone, return error and reset */
>> + user->seq = r->info->seq;
>> ret = -EPIPE;
>> logbuf_unlock_irq();
>> goto out;
>> }
>
> Sorry, why doesn't this do something like
>
> if (user->seq < prb_first_seq(prb)) {
> /* the expected message is gone, return error and reset */
> user->seq = prb_first_seq(prb);
> ret = -EPIPE;
> ...
> }

Here prb_read_valid() was successful, so a record _was_ read. The
kerneldoc for the prb_read_valid() says:

* On success, the reader must check r->info.seq to see which record was
* actually read.

The value will either be the requested user->seq or some higher value
because user->seq is not available.

There are 2 reasons why user->seq is not available (and a later record
_is_ available):

1. The ringbuffer overtook user->seq. In this case, comparing and then
setting using prb_first_seq() could be appropriate. And r->info->seq
might even already be what prb_first_seq() would return. (More on
this below.)

2. The record with user->seq has no data because the writer failed to
allocate dataring space. In this case, resetting back to
prb_first_seq() would be incorrect. And since r->info->seq is the
next valid record, it is appropriate that the next devkmsg_read()
starts there.

Rather than checking these cases separately, it is enough just to check
for the 2nd case. For the 1st case, prb_first_seq() could be less than
r->info->seq if all the preceeding records have no data. But this just
means the whole set of records with missing data are skipped, which
matches existing behavior. (For example, currently when devkmsg is
behind 10 messages, there are not 10 -EPIPE returns. Instead it
immediately catches up to the next available record.)

Perhaps the new comment should be:

/*
* The expected message is gone, return error and
* reset to the next available message.
*/

John Ogness