On Wed, Jun 14, 2023 at 10:55:12AM +0800, Hongyu Xie wrote:Nobody review v4 for over a week.
It was inspired by kgdboc.
This is a debug module that allows you to get all kernel logs
after panic.
Normally you need to attach a USB-to-UART tool or enable kdump
before panic happens to get log from kernel after panic. If you
didn't do that and kdump is not working, you can't get any log to
know what happened before panic. If you have a USB-to-UART tool
and the UART port on your computer is working. This module helps
you to get all kernel log after panic() is called.
To use this, see Documentation/dev-tools/panic_serial_helper.rst.
Tested on an arm64 device.
Signed-off-by: Hongyu Xie <xiehongyu1@xxxxxxxxxx>
---
Why is this a RESEND? What's wrong with the previous version?
What is wrong with MODULE_NAME? Over 80 module file use MODULE_NAME.
v4: fix some syntax problems in Documentation.
v3: fix problems in Documentation reported by
kernel test robot <lkp@xxxxxxxxx>.
v2: replace uart with UART for consistency.
Documentation/dev-tools/index.rst | 1 +
.../dev-tools/panic_serial_helper.rst | 138 ++++
MAINTAINERS | 5 +
drivers/tty/serial/Kconfig | 25 +
drivers/tty/serial/Makefile | 1 +
drivers/tty/serial/panic_serial_helper.c | 619 ++++++++++++++++++
include/linux/panic.h | 1 +
kernel/panic.c | 12 +
8 files changed, 802 insertions(+)
create mode 100644 Documentation/dev-tools/panic_serial_helper.rst
create mode 100644 drivers/tty/serial/panic_serial_helper.c
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 6b0663075dc0..0655528e5a83 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -34,6 +34,7 @@ Documentation/dev-tools/testing-overview.rst
kselftest
kunit/index
ktap
+ panic_serial_helper
.. only:: subproject and html
diff --git a/Documentation/dev-tools/panic_serial_helper.rst b/Documentation/dev-tools/panic_serial_helper.rst
new file mode 100644
index 000000000000..d3c177f56bc4
--- /dev/null
+++ b/Documentation/dev-tools/panic_serial_helper.rst
@@ -0,0 +1,138 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================================================
+Using panic serial helper to get kernel logs after panic
+========================================================
+
+:Author: Hongyu Xie <xiehongyu1@xxxxxxxxxx>
+
+What is this?
+=============
+
+A debug module inspired by kgdboc that allows you to get all kernel logs
+after panic.
+
+Why do you need it?
+===================
+
+There are many debugging methods to know what was going on before panic.
+
+The firs is Kdump. When it is enabled, you can get a core image after
+panic. Then use GDB or Crash to debug that core image to know what happened
+before panic (see ``Documentation/admin-guide/kdump/kdump.rst`` for Kdump
+documentation).
+
+Another way is to connect the UART side of a USB-to-UART tool to the
+debugging UART port (normally a 3 pin slot on the motherborad or a RS232
+port on the back panel of your PC) before panic happens. Then connect the
+USB side of a USB-to-UART tool to another PC. You can read all the kernel
+logs coming from that UART port through apps like minicom on another PC.
+So when panic happens you'll know what was going on.
+
+What if Kdump hasn't been enabled? And people don't always connect an
+USB-to-UART device while he/she is using the PC. When panic occurs, it's
+too late to connect the USB-to-UART device.
+
+For both situations, you can use panic_serial_helper module to get all
+necessary kernel logs once it is loaded.
+
+How to use it?
+==============
+
+Prerequisites
+--------------
+
+1. Same as kgdboc, the UART driver must implement two callbacks in the
+ struct uart_ops. See Documentation/dev-tools/kgdb.rst for details.
+
+2. Your PC has an UART port and it's working.
+
+How
+--------------
+
+First you need to enable ``CONFIG_PANIC_SERIAL_HELPER`` in your
+config. To enable it go to
+:menuselection:`Device Drivers-->Character devices-->Enable TTY (TTY [=y])-->Serial drivers`
+and select :menuselection:`debug through UART after panic`.
+
+Then build and deploy the kernel as usual.
+
+When the panic occurs, you need to do the following:
+
+1. connect the UART side of an USB-to-UART tool to any UART
+ port on your device (PC, server, Laptop, etc...).
+ Connect the USB side of that tool to another PC. Open
+ minicom (or other app) on that PC, and set "/dev/ttyUSB0"(or
+ "/dev/ttyUSB1 if there is already another USB-to-UART tool
+ connected to your device) with "115200 8N1".
+
+ It automatically selects the port where you first press the
+ "Enter" key (some keyboard labeled this with "Return").
+
+2. Press Enter and the help menu will appear::
+
+ help:
+
+ -a show all kernel msg
+
+ -3 show S3 msg
+
+ -4 show S4 msg
+
+ -filter-[string] show msg containing [string]
+
+ -q- quit
+
+see ``Help menu options`` for details.
+
+3. Select one of above options and happy hacking!
+
+Help menu options
+-----------------
+Available options:
+
+ - a
+
+ Show all the messages starting from ``Booting Linux on ...``
+
+ - 3
+
+ If STR happened before panic, this will show messages starting from
+ ``PM: suspend entry...``
+
+ - 4
+
+ If STD happened before panic, this will show messages starting from
+ ``PM: hibernation entry...``
+
+ - filter-[string]
+
+ Provide case-ignored filter matching. For example, if you'd like to see
+ message lines that contain ``CPU`` or ``cpu``, you can pass either
+ ``filter-CPU`` or ``filter-cpu``. The corresponding output would be
+ like::
+
+ <6>[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x701f6633
+ <6>[ 0.000000] Detected PIPT I-cache on CPU0
+ <6>[ 0.000000] CPU features: detected: Kernel page table isolation (K
+
+ ...
+
+ <6>[ 0.000000] GICv3: CPU0: using allocated LPI pending table @0x0000
+ <6>[ 0.002411] smp: Bringing up secondary CPUs ...
+ <6>[ 0.039105] Detected PIPT I-cache on CPU1
+
+ ...
+
+ <4>[ 6.432129] CPU: 3 PID: 392 Comm: (crub_all) Tainted: G W
+ <4>[ 6.560279] CPU: 2 PID: 478 Comm: (ostnamed) Tainted: G W
+
+ ...
+
+ <4>[ 225.297828] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G W
+ <2>[ 225.297909] SMP: stopping secondary CPUs
+ <0>[ 225.297919] CPU features: 0x000000,02000800,0400421b
+
+ - q
+
+ Return to help menu.
diff --git a/MAINTAINERS b/MAINTAINERS
index 74aeae10a151..73fcae8c2f39 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15972,6 +15972,11 @@ L: platform-driver-x86@xxxxxxxxxxxxxxx
S: Maintained
F: drivers/platform/x86/panasonic-laptop.c
+PANIC SERIAL CONSOLE
+M: Hongyu Xie <xiehongyu1@xxxxxxxxxx>
+F: drivers/tty/serial/panic_serial_helper.c
+F: drivers/tty/serial/panic_serial_helper.h
+
PARALLAX PING IIO SENSOR DRIVER
M: Andreas Klinger <ak@xxxxxxxxxxxxx>
L: linux-iio@xxxxxxxxxxxxxxx
diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index 3e3fb377d90d..796441b58498 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -198,6 +198,31 @@ config SERIAL_KGDB_NMI
If unsure, say N.
+config PANIC_SERIAL_HELPER
+ tristate "debug through UART after panic"
+ depends on PANIC_TIMEOUT=0
+ select CONSOLE_POLL
+ help
+ This is a debug module that allows you to get all kernel logs
+ after panic.
+
+ Normally you need to attach a USB-to-UART tool or enable kdump
+ before panic happens to get log from kernel after panic. If you
+ didn't do that and kdump is not working, you can't get any log to
+ know what happened before panic. If you have a USB-to-UART tool
+ and the UART port on your computer is working, this module helps
+ you to get all kernel log after panic() is called.
+
+ This module uses serial port in poll mode, so it's more stable
+ than other debugging methods.
+
+ Read <file:Documentation/dev-tools/panic_serial_helper.rst> for
+ usage.
+
+ Say Y if you have a working UART port and you want to gather
+ kernel logs. To compile this as module (which will be called
+ panic_serial_helper), say M. If unsure, say N.
+
config SERIAL_MESON
tristate "Meson serial port support"
depends on ARCH_MESON || COMPILE_TEST
diff --git a/drivers/tty/serial/Makefile b/drivers/tty/serial/Makefile
index 531ec3a19dae..d7f6fdc8913c 100644
--- a/drivers/tty/serial/Makefile
+++ b/drivers/tty/serial/Makefile
@@ -93,3 +93,4 @@ obj-$(CONFIG_SERIAL_MCTRL_GPIO) += serial_mctrl_gpio.o
obj-$(CONFIG_SERIAL_KGDB_NMI) += kgdb_nmi.o
obj-$(CONFIG_KGDB_SERIAL_CONSOLE) += kgdboc.o
+obj-$(CONFIG_PANIC_SERIAL_HELPER) += panic_serial_helper.o
diff --git a/drivers/tty/serial/panic_serial_helper.c b/drivers/tty/serial/panic_serial_helper.c
new file mode 100644
index 000000000000..1d3d57bc6340
--- /dev/null
+++ b/drivers/tty/serial/panic_serial_helper.c
@@ -0,0 +1,619 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * panic_serial_helper.c Debug through UART when panic.
+ *
+ * Copyright (C) 2023 Xie Hongyu <xiehongyu1@xxxxxxxxxx>
+ *
+ * Inspired by kgdboc.
+ *
+ */
+
+#define MODULE_NAME "panic_serial_helper"
+#define pr_fmt(fmt) MODULE_NAME ": " fmt
KBUILD_NAME is all you need, right?
It's shorter in a line.
+
+#include <linux/kmsg_dump.h>
+#include <linux/bsearch.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/tty_driver.h>
+#include <linux/serial_core.h>
+#include <linux/string.h>
+
+#define S3_ENTRY "PM: suspend entry"
+#define S3_EXIT "PM: suspend exit"
+#define S4_ENTRY "PM: hibernation entry"
+#define S4_EXIT "PM: hibernation exit"
Why are these needed? Why not just use strings where they are used as
you are only using these once.
Let's say n is the number of msg lines from while(kmsg_dump_get_line).
+
+/* list to store msg lines */
+static LIST_HEAD(psh_list);
That's going to get HUGE, right? How much memory does this module take
up?
bool kmsg_dump_get_line(struct kmsg_dump_iter *iter, bool syslog, char *line, size_t size, size_t *len).
+
+/* msg line prototype */
+struct dmesg_lines {
+ struct list_head entry;
+ char *buf;
+ int size;
size of what?
Indicates the status of this module.
+};
+
+/* panic serial helper status*/
Odd formatting :(
+enum PSHS {
+ PSHS_INIT,
+ PSHS_WAIT_HELP_INPUT,
+};
What do these mean?
To choose the msg type you need.
+
+/* panic serial helper msg type */
+enum PSHM_TYPE {
+ PSHM_TYPE_ALL,
+ PSHM_TYPE_S3,
+ PSHM_TYPE_S4,
+ PSHM_TYPE_STRINGS,
+ PSHM_TYPE_QUIT,
Why do you have message types?
pstore infrastructure needs a backend, ram, efi, mtd, etc. Which needs a _panic_write implementation. Some of the implementation works in a interrupt context. What if panic happens in a none interrupt context? And what if _panic_write fails? This module use UART in poll mode and it's reliable.
But most importantly, why all of this at all? Why not just tie into the
pstore infrastructure? Wouldn't that handle the majority of this for
you?
tty_find_polling_driver use the name to find a tty driver. What's wrong with it? kgdboc does the same thing.
+};
+
+/* whether UART is dumping msg */
+static bool dumping_msg;
+
+/* to filter msg */
+static char filter[256] = {0};
+
+struct psh_buf {
+#define PSH_BUF_SIZE 256
+ char buf[PSH_BUF_SIZE];
+ int cur;
+};
+
+static const char psh_tty_types[][32] = {
+ {"ttyAMA"},
+ {"ttyS"},
+ {"ttyPS"},
+ {"ttyLP"},
+ {"ttyARC"},
+ {"ttyAL"},
+ {"ttyUL"},
Sorry, but hard-coded tty port names are not going to go well. Why did
you pick this tiny subset of valid tty names?
For convenience. So you don't have to try every baud rate when it happens.
+};
+
+#define TTY_OPS "115200n8n"
Why this default?
There are two, one for '\b', one for '\r'. And maybe more in the future.
+
+struct psh_serial_dev {
+ struct list_head entry;
+ struct tty_driver *drv;
+ struct psh_buf buf;
+ enum PSHS psh_status;
+ enum PSHM_TYPE psh_msg_type;
+ int line;
+};
+
+struct psh_serial_dev *psh_dev;
+
+/* char handler prototype */
+struct c_handler {
+ char c;
+ int (*handler)(struct psh_serial_dev *dev, void *d);
Why do you need a handler when you only have one handler?
Like I said above.
Anyway, please look at pstore and tie into that if you really want
something like this.
thanks,
thanks,
greg k-h