Re: arm64 syzbot instances

From: Dmitry Vyukov
Date: Fri Mar 12 2021 - 04:22:26 EST


On Fri, Mar 12, 2021 at 10:16 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> On Fri, Mar 12, 2021 at 9:46 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > > a) accessing a legacy ISA/LPC port should not result in an oops,
> > > but should instead return values with all bits set. There could
> > > be a ratelimited console warning about broken drivers, but we
> > > can't assume that all drivers work correctly, as some ancient
> > > PC style drivers still rely on this.
> > > John Garry has recently worked on a related bugfix, so maybe
> > > either this is the same bug he encountered (and hasn't merged
> > > yet), or if his fix got merged there is still a remaining problem.
>
> > > b) It should not be possible to open /dev/ttyS3 if the device is
> > > not initialized. What is the output of 'cat /proc/tty/driver/serial'
> > > on this machine? Do you see any messages from the serial
> > > driver in the boot log?
> > > Unfortunately there are so many different ways to probe devices
> > > in the 8250 driver that I don't know where this comes from.
> > > Your config file has
> > > CONFIG_SERIAL_8250_PNP=y
> > > CONFIG_SERIAL_8250_NR_UARTS=32
> > > CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> > > CONFIG_SERIAL_8250_EXTENDED=y
> > > I guess it's probably the preconfigured uarts that somehow
> > > become probed without initialization, but it could also be
> > > an explicit device incorrectly described by qemu.
> >
> >
> > Here is the full boot log, /proc/tty/driver/serial and the crash:
> > https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
>
> Ok, so there are four 8250 ports, and none of them are initialized,
> while the console is on /dev/ttyAMA0 using a different driver.
>
> I'm fairly sure this is a bug in the kernel then, not in qemu.
>
>
> I also see that the PCI I/O space gets mapped to a physical address:
> [ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO 0x003eff0000..0x003effffff -> 0x0000000000
>
> So it's probably qemu that triggers the 'synchronous external
> abort' when accessing the PCI I/O space, which in turn hints
> towards a bug in qemu. Presumably it only returns data from
> I/O ports that are actually mapped to a device when real hardware
> is supposed to return 0xffffffff when reading from unused I/O ports.
> This would be separate from the work that John did, which only
> fixed the kernel for accessing I/O port ranges that do not have
> a corresponding MMU mapping to hardware ports.

Will John's patch fix this crash without any changes in qemu? That would
be good enough for syzbot. Otherwise we need to report the issue to
qemu.
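
For reference, the read behaviour Arnd describes for real hardware (unused
I/O ports read back as all ones rather than faulting) can be sketched as a
small userspace model. This is only an illustration under stated
assumptions: struct port_range, model_inb() and model_inl() are hypothetical
names, the single device at 0x3f8 and its 0x60 read value are made up, and
none of this is kernel or qemu code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one device claiming a range of legacy I/O ports. */
struct port_range {
    uint16_t base;   /* first port claimed by the device */
    uint16_t len;    /* number of consecutive ports claimed */
    uint8_t  value;  /* value the pretend device returns on every read */
};

/* Illustrative table (an assumption): one 8-port device at 0x3f8. */
static const struct port_range mapped[] = {
    { 0x3f8, 8, 0x60 },
};

/* Byte read: ports not claimed by any device read as all bits set. */
uint8_t model_inb(uint16_t port)
{
    for (size_t i = 0; i < sizeof(mapped) / sizeof(mapped[0]); i++) {
        if (port >= mapped[i].base && port - mapped[i].base < mapped[i].len)
            return mapped[i].value;
    }
    return 0xff;
}

/*
 * 32-bit read composed from four byte reads, little-endian, so a read
 * from a completely unused range yields 0xffffffff.
 */
uint32_t model_inl(uint16_t port)
{
    uint32_t v = 0;

    assert(port <= 0xffff - 3); /* all four bytes must be valid ports */
    for (unsigned int i = 0; i < 4; i++)
        v |= (uint32_t)model_inb((uint16_t)(port + i)) << (8 * i);
    return v;
}
```

In this model, a read from an unclaimed port such as 0x2e8 returns 0xff
from model_inb() and 0xffffffff from model_inl(), instead of the
synchronous external abort seen in the crash.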