Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

From: Heinrich Schuchardt
Date: Fri Apr 08 2022 - 12:38:39 EST

Next message: Michael Straube: "[PATCH] staging: r8188eu: convert else if to else in rtw_led.c"
Previous message: James Morse: "Re: arm64 spectre-bhb backports break boot on stable kernels <= v5.4"
Next in thread: Anup Patel: "Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 4/6/22 12:10, Anup Patel wrote:

On Wed, Apr 6, 2022 at 3:25 PM Heinrich Schuchardt
<heinrich.schuchardt@xxxxxxxxxxxxx> wrote:

On 3/31/22 21:42, Palmer Dabbelt wrote:

On Sat, 19 Mar 2022 05:12:06 PDT (-0700), apatel@xxxxxxxxxxxxxxxx wrote:

Currently, the range and default value of NR_CPUS is too restrictive
for high-end RISC-V systems with a large number of HARTs. The latest
QEMU virt machine supports up to 512 CPUs so the current NR_CPUS is
restrictive for QEMU as well. Other major architectures (such as
ARM64, x86_64, MIPS, etc) have a much higher range and default
value of NR_CPUS.

This patch increases NR_CPUS range to 2-512 and default value to
XLEN (i.e. 32 for RV32 and 64 for RV64).

Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx>
---
Changes since v1:
- Updated NR_CPUS range to 2-512 which reflects maximum number of
CPUs supported by QEMU virt machine.
---
arch/riscv/Kconfig | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5adcbd9b5e88..423ac17f598c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -274,10 +274,11 @@ config SMP
If you don't know what to do here, say N.

config NR_CPUS
- int "Maximum number of CPUs (2-32)"
- range 2 32
+ int "Maximum number of CPUs (2-512)"
+ range 2 512

For SBI_V01=y there seems to be a hard constraint to XLEN bits.
See __sbi_v01_cpumask_to_hartmask() in arch/riscv/kernel/sbi.c.

So shouldn't this be something like:

range 2 512 if !SBI_V01
range 2 32 if SBI_V01 && 32BIT
range 2 64 if SBI_V01 && 64BIT
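
To illustrate the constraint mentioned above, here is a minimal sketch of why a hart mask held in a single machine word cannot address hart IDs at or above XLEN. It is not the kernel's __sbi_v01_cpumask_to_hartmask(); the helper name hartids_to_single_word_mask() and the WORD_BITS macro are hypothetical and exist only for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Bits in one unsigned long; on RISC-V Linux this matches XLEN. */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper, for illustration only: fold a list of hart IDs
 * into one word-sized mask. Returns false as soon as an ID does not
 * fit, which is the limitation a single-word interface imposes.
 */
static bool hartids_to_single_word_mask(const unsigned int *hartids,
                                        size_t n, unsigned long *mask)
{
    unsigned long m = 0;

    for (size_t i = 0; i < n; i++) {
        if (hartids[i] >= WORD_BITS)
            return false;          /* ID has no bit in a single word */
        m |= 1UL << hartids[i];
    }
    *mask = m;
    return true;
}
```

With a 64-bit unsigned long, hart IDs 0 to 63 are representable and any higher ID makes the helper fail, which is the reasoning behind tying the upper bound to XLEN when SBI v0.1 is enabled.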

This is just making it unnecessarily complicated for supporting
SBI v0.1

How about removing SBI v0.1 support and the spin-wait CPU
operations from arch/riscv ?

The SBI v0.1 specification was only a draft. Only the v1.0 version has ever been ratified.

It would be good to remove this legacy code from Linux and U-Boot.

By the way, why does upstream OpenSBI claim to be conformant to SBI v0.3 and not to v1.0?

include/sbi/sbi_ecall.h:16:

#define SBI_ECALL_VERSION_MAJOR 0
#define SBI_ECALL_VERSION_MINOR 3
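
For reference, a small sketch of how such a major/minor pair maps to the value reported by the SBI base extension, assuming the layout from the SBI specification (bit 31 zero, major in bits 30:24, minor in bits 23:0). The helper names and SPEC_* macros are made up for this example and are not OpenSBI's.

```c
#include <stdint.h>

#define SBI_ECALL_VERSION_MAJOR 0
#define SBI_ECALL_VERSION_MINOR 3

/* Assumed layout: bits 30:24 hold the major, bits 23:0 the minor. */
#define SPEC_MAJOR_SHIFT 24
#define SPEC_MAJOR_MASK  0x7fU
#define SPEC_MINOR_MASK  0xffffffU

/* Hypothetical helper: pack major/minor into one 32-bit value. */
static uint32_t sbi_spec_version_encode(uint32_t major, uint32_t minor)
{
    return ((major & SPEC_MAJOR_MASK) << SPEC_MAJOR_SHIFT) |
           (minor & SPEC_MINOR_MASK);
}

/* Hypothetical helpers: unpack the two fields again. */
static uint32_t sbi_spec_version_major(uint32_t v)
{
    return (v >> SPEC_MAJOR_SHIFT) & SPEC_MAJOR_MASK;
}

static uint32_t sbi_spec_version_minor(uint32_t v)
{
    return v & SPEC_MINOR_MASK;
}
```

Under that layout, the 0/3 pair above encodes to 0x3, while a v1.0 implementation would report 0x01000000.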

Best regards

Heinrich

depends on SMP
- default "8"
+ default "32" if 32BIT
+ default "64" if 64BIT

config HOTPLUG_CPU
bool "Support for hot-pluggable CPUs"

I'm getting all sorts of boot issues with more than 32 CPUs, even on the
latest QEMU master. I'm not opposed to increasing the CPU count in
theory, but if we're going to have a setting that goes up to a huge
number it needs to at least boot. I've got 64 host threads, so it
shouldn't just be a scheduling thing.

Currently high performing hardware for RISC-V is missing. So it makes
sense to build software via QEMU on x86_64 or arm64 with as many
hardware threads as available (128 is not uncommon).

OpenSBI currently is limited to 128 threads:
include/sbi/sbi_hartmask.h:22:
#define SBI_HARTMASK_MAX_BITS 128
This is just an arbitrary value that can be modified.
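
As a rough sketch of what such a compile-time limit implies, the following shows a multi-word bitmap sized from SBI_HARTMASK_MAX_BITS. The struct and helper names are hypothetical and this is not OpenSBI's implementation; it only illustrates that raising the constant grows the storage and the range of accepted hart indices.

```c
#include <limits.h>
#include <stdbool.h>

#define SBI_HARTMASK_MAX_BITS 128

#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
/* Words needed to hold SBI_HARTMASK_MAX_BITS bits, rounded up. */
#define HARTMASK_WORDS \
    ((SBI_HARTMASK_MAX_BITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical bitmap type, for illustration only. */
struct hartmask_sketch {
    unsigned long bits[HARTMASK_WORDS];
};

/* Set the bit for 'hart'; reject indices beyond the compile-time limit. */
static bool hartmask_sketch_set(struct hartmask_sketch *m, unsigned int hart)
{
    if (hart >= SBI_HARTMASK_MAX_BITS)
        return false;
    m->bits[hart / BITS_PER_WORD] |= 1UL << (hart % BITS_PER_WORD);
    return true;
}

/* Report whether the bit for 'hart' is set. */
static bool hartmask_sketch_test(const struct hartmask_sketch *m,
                                 unsigned int hart)
{
    if (hart >= SBI_HARTMASK_MAX_BITS)
        return false;
    return (m->bits[hart / BITS_PER_WORD] >> (hart % BITS_PER_WORD)) & 1UL;
}
```

In this sketch, index 127 is the highest one accepted; changing the constant to 512 would admit indices up to 511 at the cost of a larger array per mask.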

Yes, this limit will be gradually increased with some improvements
to optimize runtime memory used by OpenSBI.

U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
CONFIG_SYS_MALLOC_F_LEN that is too low. This leads to a boot failure for
more than 16 harts. A patch to correct this is pending:
[PATCH v2 1/1] riscv: alloc space exhausted
https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@xxxxxxxxxxxxxx/T/#t

With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
with 64 virtual cores worked fine for me.

Thanks for trying this patch.

Regards,
Anup

Best regards

Heinrich

If there was some hardware that actually boots on these I'd be happy to
take it, but given that it's just QEMU I'd prefer to sort out the bugs
first. It's probably just latent bugs somewhere, but allowing users to
turn on configs we know don't work just seems like the wrong way to go.
