Re: [PATCH 02/14] add blockconsole version 1.1

From: Andrew Morton
Date: Wed May 22 2013 - 18:43:38 EST


On Wed, 22 May 2013 17:04:16 -0400 J__rn Engel <joern@xxxxxxxxx> wrote:

> > > Documentation/block/blockconsole.txt | 94 ++++
> > > Documentation/block/blockconsole/bcon_tail | 62 +++
> > > Documentation/block/blockconsole/mkblockconsole | 29 ++
> >
> > We really need somewhere better to put userspace tools.
>
> Agreed. It started as "use this horrible code as a bad example" and
> turned into a repository for userspace code, which it should have
> been.
>
> Do you have an opinion on tools/ vs. a standalong git tree for the
> tools?

./tools/blockconsole/ sounds nice. I like maintaining simple userspace
utils in the main kernel tree - it's very easy and efficient for us.

> > > +#define BCON_MAX_ERRORS 10
> > > +static void bcon_end_io(struct bio *bio, int err)
> > > +{
> > > + struct bcon_bio *bcon_bio = container_of(bio, struct bcon_bio, bio);
> > > + struct blockconsole *bc = bio->bi_private;
> > > + unsigned long flags;
> > > +
> > > + /*
> > > + * We want to assume the device broken and free this console if
> > > + * we accumulate too many errors. But if errors are transient,
> > > + * we also want to forget about them once writes succeed again.
> > > + * Oh, and we only want to reset the counter if it hasn't reached
> > > + * the limit yet, so we don't bcon_put() twice from here.
> > > + */
> > > + spin_lock_irqsave(&bc->end_io_lock, flags);
> > > + if (err) {
> > > + if (bc->error_count++ == BCON_MAX_ERRORS) {
> > > + pr_info("no longer logging to %s\n", bc->devname);
> >
> > pr_info() may be too low a severity level?
>
> There is no distinction between "someone pulled the usb device" and
> "broken cable". Blockconsole will continue writing and retire the
> device if errors accumulate. The code is deliberately stupid, because
> stupid code tends to be more robust and devoid of strange
> corner-cases.

I meant should we use pr_err() or pr_emerg() rather than pr_info().
Chances are that pr_info() is being hidden.

> > How does the code handle the obvious recursion concern here?
>
> I think your next comment answers that.

hm, that was clever of me. I meant more generally, if we do a printk()
from within this code how do we prevent arbitrarily deep recursion?

> > > + memcpy(bc->bio_array[i].sector +
> > > + __bcon_console_ofs(console_bytes), msg, n);
> > > + len -= n;
> > > + msg += n;
> > > + bcon_advance_console_bytes(bc, n);
> > > + }
> > > + wake_up_process(bc->writeback_thread);
> > > + mod_timer(&bc->pad_timer, jiffies + HZ);
> > > +}
> > > +
> > > +static void bcon_init_bios(struct blockconsole *bc)
> >
> > This code really needs some comments :(
> >
> > Oh I see. It's BIOs, not BIOS. Had me fooled there.
>
> But, but, but isn't this obvious to someone who has just written said
> function? It has been a year now, so my own definition of obvious may
> have changed as well.

heh. I blame Jens for calling it a "bio".

> > > +void bcon_add(const char *name)
> > > +{
> > > + /*
> > > + * We add each name to a small static buffer and ask for a workqueue
> > > + * to go pick it up asap. Once it is picked up, the buffer is empty
> > > + * again, so hopefully it will suffice for all sane users.
> > > + */
> > > + spin_lock(&device_lock);
> > > + if (scanned_devices[0])
> > > + strncat(scanned_devices, ",", sizeof(scanned_devices));
> > > + strncat(scanned_devices, name, sizeof(scanned_devices));
> > > + spin_unlock(&device_lock);
> > > + schedule_work(&bcon_add_work);
> > > +}
> >
> > What's all this do?
>
> Work around init ordering problems. Quite likely there is a much
> nicer way to this that I should know about and don't.

Well, without adequate description of the problem, nobody can help
solve it!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/