Re: arm64 syzbot instances

From: Arnd Bergmann
Date: Fri Mar 12 2021 - 05:11:55 EST


On Fri, Mar 12, 2021 at 10:21 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Fri, Mar 12, 2021 at 10:16 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> >
> > On Fri, Mar 12, 2021 at 9:46 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > > On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > > On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > > > a) accessing a legacy ISA/LPC port should not result in an oops,
> > > > but should instead return values with all bits set. There could
> > > > be a ratelimited console warning about broken drivers, but we
> > > > can't assume that all drivers work correctly, as some ancient
> > > > PC style drivers still rely on this.
> > > > John Garry has recently worked on a related bugfix, so maybe
> > > > either this is the same bug he encountered (and hasn't merged
> > > > yet), or if his fix got merged there is still a remaining problem.
> > > >
> > > > b) It should not be possible to open /dev/ttyS3 if the device is
> > > > not initialized. What is the output of 'cat /proc/tty/driver/serial'
> > > > on this machine? Do you see any messages from the serial
> > > > driver in the boot log?
> > > > Unfortunately there are so many different ways to probe devices
> > > > in the 8250 driver that I don't know where this comes from.
> > > > Your config file has
> > > > CONFIG_SERIAL_8250_PNP=y
> > > > CONFIG_SERIAL_8250_NR_UARTS=32
> > > > CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> > > > CONFIG_SERIAL_8250_EXTENDED=y
> > > > I guess it's probably the preconfigured uarts that somehow
> > > > become probed without initialization, but it could also be
> > > > an explicit device incorrectly described by qemu.
> > >
> > >
> > > Here is the full boot log, /proc/tty/driver/serial and the crash:
> > > https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
> >
> > Ok, so there are four 8250 ports, and none of them are initialized,
> > while the console is on /dev/ttyAMA0 using a different driver.
> >
> > I'm fairly sure this is a bug in the kernel then, not in qemu.
> >
> >
> > I also see that the PCI I/O space gets mapped to a physical address:
> > [ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO 0x003eff0000..0x003effffff -> 0x0000000000
> >
> > So it's probably qemu that triggers the 'synchronous external
> > abort' when accessing the PCI I/O space, which in turn hints
> > towards a bug in qemu. Presumably it only returns data from
> > I/O ports that are actually mapped to a device, whereas real hardware
> > is supposed to return 0xffffffff when reading from unused I/O ports.
> > This would be separate from the work that John did, which only
> > fixed the kernel for accessing I/O port ranges that do not have
> > a corresponding MMU mapping to hardware ports.
>
> Will John's patch fix this crash w/o any changes in qemu? That would
> be good enough for syzbot. Otherwise we need to report the issue to
> qemu.

No, John's patch addresses a third issue. As far as I remember, that
one results in a similar problem in the case where there is no PCI bus
at all, or where no PCI host has an I/O port range, so the inb() from
the serial driver causes a page fault. The problem you ran into happens
in qemu when the PCI I/O ports are mapped to hardware registers
that cause an exception when accessed.

If you just want to work around the problem for now, it should
go away if you set CONFIG_SERIAL_8250_RUNTIME_UARTS
to zero.
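
A minimal sketch of that workaround as a config fragment, assuming the
other 8250 options quoted above stay unchanged:

```
# Do not register any preconfigured 8250 ports at boot.
CONFIG_SERIAL_8250_RUNTIME_UARTS=0
```

The Kconfig help text for this option also describes a boot-time
override, 8250.nr_uarts=, so passing 8250.nr_uarts=0 on the kernel
command line should have the same effect without a rebuild. That
variant is untested on this setup.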

Arnd