Re: [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled

From: Fabio M. De Francesco
Date: Fri Dec 03 2021 - 06:00:39 EST


On Thursday, December 2, 2021 7:35:16 PM CET Linus Torvalds wrote:
> On Thu, Dec 2, 2021 at 7:41 AM Tetsuo Handa
> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > Looking at the backtrace, I see
> > >
> > > n_hdlc_send_frames+0x24b/0x490 drivers/tty/n_hdlc.c:290
> > > tty_wakeup+0xe1/0x120 drivers/tty/tty_io.c:534
> > > __start_tty drivers/tty/tty_io.c:806 [inline]
> > > __start_tty+0xfb/0x130 drivers/tty/tty_io.c:799
> > >
> > > and apparently it's that hdlc line discipline (and
> > > n_hdlc_send_frames() in particular) that is the problem here.
> > >
> > > I think that's where the fix should be.
> >
> > Do you mean that we should change the behavior of n_hdlc_send_frames()
> > rather than trying to make __start_tty() schedulable again?
>
> I wouldn't change n_hdlc_send_frames() itself. It does what it says it does.
>
> But n_hdlc_tty_wakeup() probably shouldn't call it directly. Other tty
> line disciplines don't do that kind of thing - although I only looked
> at a couple. They all seem to just set bits and prepare things. Like a
> wakeup function should do.
>
> So I think n_hdlc_tty_wakeup() should perhaps only do a
> "schedule_work()" or similar to get that n_hdlc_send_frames() started,
> rather than doing it itself.
>
> Example: net/nfc/nci/uart.c. It does that
>
> schedule_work(&nu->write_work);
>
> instead of actually trying to do a write from a wakeup routine
> (similar examples in ppp - "tasklet_schedule(&ap->tsk)" etc).
>
> I mean, it's called "wakeup", not "write". So I think the fundamental
> confusion here is in hdlc, not the tty layer.
>
> Linus
>
This is what I understand from the above argument: n_hdlc_tty_wakeup() should
only call schedule_work() to queue n_hdlc_send_frames(). That way,
n_hdlc_tty_wakeup() returns to its caller immediately and
n_hdlc_send_frames() runs later, asynchronously, in process context
(i.e., no longer in atomic context).
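
To check that I have the shape of this right, here is a small userspace
model of the idea. It is only a sketch and not kernel code: every model_*
name is made up for illustration, the "in_atomic" flag merely stands in for
"spinlock held, IRQs disabled", and the assert() plays the role that a
might_sleep()-style check would play in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of the model_* names are kernel APIs. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_work {
	void (*fn)(struct model_work *work);
	bool pending;
};

struct model_ldisc {
	struct model_work write_work;
	bool in_atomic;		/* stands in for "spinlock held, IRQs off" */
	int frames_queued;
	int frames_sent;
};

/* Stand-in for n_hdlc_send_frames(): may sleep, so never call it atomically. */
static void model_send_frames(struct model_ldisc *ld)
{
	assert(!ld->in_atomic);
	ld->frames_sent += ld->frames_queued;
	ld->frames_queued = 0;
}

/* Work handler: recovers the line discipline and does the real write. */
static void model_write_work_fn(struct model_work *work)
{
	struct model_ldisc *ld =
		model_container_of(work, struct model_ldisc, write_work);

	model_send_frames(ld);
}

/* Stand-in for schedule_work(): only marks the work as pending. */
static void model_schedule_work(struct model_work *work)
{
	work->pending = true;
}

/* Stand-in for n_hdlc_tty_wakeup(): defers the write instead of doing it. */
static void model_wakeup(struct model_ldisc *ld)
{
	model_schedule_work(&ld->write_work);
}

/* Stand-in for __start_tty() called with the flow lock held. */
static void model_start_tty(struct model_ldisc *ld)
{
	ld->in_atomic = true;
	model_wakeup(ld);
	ld->in_atomic = false;
}

/* Stand-in for the worker running later, outside the atomic section. */
static void model_run_pending(struct model_ldisc *ld)
{
	if (ld->write_work.pending) {
		ld->write_work.pending = false;
		ld->write_work.fn(&ld->write_work);
	}
}

static void model_ldisc_init(struct model_ldisc *ld, int frames_queued)
{
	ld->write_work.fn = model_write_work_fn;
	ld->write_work.pending = false;
	ld->in_atomic = false;
	ld->frames_queued = frames_queued;
	ld->frames_sent = 0;
}
```

In this model, model_start_tty() only leaves the work pending; no frame is
sent until model_run_pending() is called after the "atomic" section has
ended, so the assert() in model_send_frames() never fires.
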

I hope that I'm not missing something. If the above summary is correct,
please forgive a newbie for the following questions...

Commit f9e053dcfc02 ("tty: Serialize tty flow control changes with flow_lock")
introduced a spinlock to serialize flow control changes and to avoid
concurrent execution of __start_tty() and __stop_tty().

This is an excerpt from the above-mentioned commit:

->
Introduce tty->flow_lock spinlock to serialize tty flow control changes.
Split out unlocked __start_tty()/__stop_tty() flavors for use by
ioctl(TCXONC) in follow-on patch.
<-

This is the reason why we are dealing with this bug. Currently, __start_tty()
is called with the tty->flow_lock spinlock held and IRQs disabled, and the
call chain leads to console_lock() while in atomic context.

In summary, my questions are:

1) Why do we still need to protect __start_tty() and __stop_tty() with spin_lock_irq()
if the solution to the bug is to execute n_hdlc_send_frames() asynchronously?

2) If it is true that we need to avoid concurrent execution of __start_tty()
and __stop_tty(), could we just use a mutex in the ioctl helper instead?

Thanks,

Fabio M. De Francesco