Andrew Jones [Wed, 17 Jan 2024 13:09:34 +0000 (14:09 +0100)]
RISC-V: selftests: cbo: Ensure asm operands match constraints
The 'i' constraint expects a constant operand, which fn and its
constant derivative MK_CBO(fn) are, but passing fn through a function
as a parameter and using a local variable for MK_CBO(fn) allow the
compiler to lose sight of that when no optimization is done. Use
a macro instead of a function and skip the local variable to ensure
the compiler uses constants, matching the asm constraints.
Reported-by: Yunhui Cui <cuiyunhui@bytedance.com> Closes: https://lore.kernel.org/all/20240117082514.42967-1-cuiyunhui@bytedance.com Fixes: a29e2a48afe3 ("RISC-V: selftests: Add CBO tests") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240117130933.57514-2-ajones@ventanamicro.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Tue, 16 Jan 2024 15:14:04 +0000 (07:14 -0800)]
Merge patch series "riscv: support kernel-mode Vector"
Andy Chiu <andy.chiu@sifive.com> says:
This series provides support running Vector in kernel mode.
Additionally, kernel-mode Vector can be configured to run without
turnning off preemption on a CONFIG_PREEMPT kernel. Along with the
suport, we add Vector optimized copy_{to,from}_user. And provide a
simple threshold to decide when to run the vectorized functions.
We decided to drop vectorized memcpy/memset/memmove for the moment due
to the concern of memory side-effect in kernel_vector_begin(). The
detailed description can be found at v9[0]
This series is composed by 4 parts:
patch 1-4: adds basic support for kernel-mode Vector
patch 5: includes vectorized copy_{to,from}_user into the kernel
patch 6: refactor context switch code in fpu [1]
patch 7-10: provides some code refactors and support for preemptible
kernel-mode Vector.
This series can be merged if we feel any part of {1~4, 5, 6, 7~10} is
mature enough.
This patch is tested on a QEMU with V and verified that booting, normal
userspace operations all work as usual with thresholds set to 0. Also,
we test by launching multiple kernel threads which continuously executes
and verifies Vector operations in the background. The module that tests
these operation is expected to be upstream later.
* b4-shazam-merge:
riscv: vector: allow kernel-mode Vector with preemption
riscv: vector: use kmem_cache to manage vector context
riscv: vector: use a mask to write vstate_ctrl
riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}()
riscv: fpu: drop SR_SD bit checking
riscv: lib: vectorize copy_to_user/copy_from_user
riscv: sched: defer restoring Vector context for user
riscv: Add vector extension XOR implementation
riscv: vector: make Vector always available for softirq context
riscv: Add support for kernel mode vector
Andy Chiu [Mon, 15 Jan 2024 05:59:29 +0000 (05:59 +0000)]
riscv: vector: allow kernel-mode Vector with preemption
Add kernel_vstate to keep track of kernel-mode Vector registers when
trap introduced context switch happens. Also, provide riscv_v_flags to
let context save/restore routine track context status. Context tracking
happens whenever the core starts its in-kernel Vector executions. An
active (dirty) kernel task's V contexts will be saved to memory whenever
a trap-introduced context switch happens. Or, when a softirq, which
happens to nest on top of it, uses Vector. Context retoring happens when
the execution transfer back to the original Kernel context where it
first enable preempt_v.
Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
option to disable preemptible kernel-mode Vector at build time. Users
with constraint memory may want to disable this config as preemptible
kernel-mode Vector needs extra space for tracking of per thread's
kernel-mode V context. Or, users might as well want to disable it if all
kernel-mode Vector code is time sensitive and cannot tolerate context
switch overhead.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-11-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Andy Chiu [Mon, 15 Jan 2024 05:59:28 +0000 (05:59 +0000)]
riscv: vector: use kmem_cache to manage vector context
The allocation size of thread.vstate.datap is always riscv_v_vsize. So
it is possbile to use kmem_cache_* to manage the allocation. This gives
users more information regarding allocation of vector context via
/proc/slabinfo. And it potentially reduces the latency of the first-use
trap because of the allocation caches.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-10-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Andy Chiu [Mon, 15 Jan 2024 05:59:26 +0000 (05:59 +0000)]
riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}()
riscv_v_vstate_{save,restore}() can operate only on the knowlege of
struct __riscv_v_ext_state, and struct pt_regs. Let the caller decides
which should be passed into the function. Meanwhile, the kernel-mode
Vector is going to introduce another vstate, so this also makes functions
potentially able to be reused.
Andy Chiu [Mon, 15 Jan 2024 05:59:25 +0000 (05:59 +0000)]
riscv: fpu: drop SR_SD bit checking
SR_SD summarizes the dirty status of FS/VS/XS. However, the current code
structure does not fully utilize it because each extension specific code
is divided into an individual segment. So remove the SR_SD check for
now.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Song Shuai <songshuaishuai@tinylab.org> Reviewed-by: Guo Ren <guoren@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-7-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Andy Chiu [Mon, 15 Jan 2024 05:59:24 +0000 (05:59 +0000)]
riscv: lib: vectorize copy_to_user/copy_from_user
This patch utilizes Vector to perform copy_to_user/copy_from_user. If
Vector is available and the size of copy is large enough for Vector to
perform better than scalar, then direct the kernel to do Vector copies
for userspace. Though the best programming practice for users is to
reduce the copy, this provides a faster variant when copies are
inevitable.
The optimal size for using Vector, copy_to_user_thres, is only a
heuristic for now. We can add DT parsing if people feel the need of
customizing it.
The exception fixup code of the __asm_vector_usercopy must fallback to
the scalar one because accessing user pages might fault, and must be
sleepable. Current kernel-mode Vector does not allow tasks to be
preemptible, so we must disactivate Vector and perform a scalar fallback
in such case.
The original implementation of Vector operations comes from
https://github.com/sifive/sifive-libc, which we agree to contribute to
Linux kernel.
Co-developed-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Co-developed-by: Nick Knight <nick.knight@sifive.com> Signed-off-by: Nick Knight <nick.knight@sifive.com> Suggested-by: Guo Ren <guoren@kernel.org> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-6-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Andy Chiu [Mon, 15 Jan 2024 05:59:23 +0000 (05:59 +0000)]
riscv: sched: defer restoring Vector context for user
User will use its Vector registers only after the kernel really returns
to the userspace. So we can delay restoring Vector registers as long as
we are still running in kernel mode. So, add a thread flag to indicates
the need of restoring Vector and do the restore at the last
arch-specific exit-to-user hook. This save the context restoring cost
when we switch over multiple processes that run V in kernel mode. For
example, if the kernel performs a context swicth from A->B->C, and
returns to C's userspace, then there is no need to restore B's
V-register.
Besides, this also prevents us from repeatedly restoring V context when
executing kernel-mode Vector multiple times.
The cost of this is that we must disable preemption and mark vector as
busy during vstate_{save,restore}. Because then the V context will not
get restored back immediately when a trap-causing context switch happens
in the middle of vstate_{save,restore}.
Andy Chiu [Mon, 15 Jan 2024 05:59:21 +0000 (05:59 +0000)]
riscv: vector: make Vector always available for softirq context
The goal of this patch is to provide full support of Vector in kernel
softirq context. So that some of the crypto alogrithms won't need scalar
fallbacks.
By disabling bottom halves in active kernel-mode Vector, softirq will
not be able to nest on top of any kernel-mode Vector. So, softirq
context is able to use Vector whenever it runs.
After this patch, Vector context cannot start with irqs disabled.
Otherwise local_bh_enable() may run in a wrong context.
Disabling bh is not enough for RT-kernel to prevent preeemption. So
we must disable preemption, which also implies disabling bh on RT.
Related-to: commit 696207d4258b ("arm64/sve: Make kernel FPU protection RT friendly") Related-to: commit 66c3ec5a7120 ("arm64: neon: Forbid when irqs are disabled") Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-3-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Thu, 11 Jan 2024 16:04:38 +0000 (08:04 -0800)]
Merge patch series "riscv: mm: Fixup & Optimize COMPAT code"
guoren@kernel.org <guoren@kernel.org> says:
From: Guo Ren <guoren@linux.alibaba.com>
When the task is in COMPAT mode, the TASK_SIZE should be 2GB, so
STACK_TOP_MAX and arch_get_mmap_end must be limited to 2 GB. This series
fixes the problem made by commit: add2cc6b6515 ("RISC-V: mm: Restrict
address space for sv39,sv48,sv57") and optimizes the related coding
convention of TASK_SIZE.
Guo Ren [Fri, 22 Dec 2023 11:57:01 +0000 (06:57 -0500)]
riscv: mm: Fixup compat arch_get_mmap_end
When the task is in COMPAT mode, the arch_get_mmap_end should be 2GB,
not TASK_SIZE_64. The TASK_SIZE has contained is_compat_mode()
detection, so change the definition of STACK_TOP_MAX to TASK_SIZE
directly.
Cc: stable@vger.kernel.org Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231222115703.2404036-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Guo Ren [Fri, 22 Dec 2023 11:57:00 +0000 (06:57 -0500)]
riscv: mm: Fixup compat mode boot failure
In COMPAT mode, the STACK_TOP is DEFAULT_MAP_WINDOW (0x80000000), but
the TASK_SIZE is 0x7fff000. When the user stack is upon 0x7fff000, it
will cause a user segment fault. Sometimes, it would cause boot
failure when the whole rootfs is rv32.
Freeing unused kernel image (initmem) memory: 2236K
Run /sbin/init as init process
Starting init: /sbin/init exists but couldn't execute it (error -14)
Run /etc/init as init process
...
Increase the TASK_SIZE to cover STACK_TOP.
Cc: stable@vger.kernel.org Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231222115703.2404036-2-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The ending NULL is not taken into account by strncat(), so switch to
strlcat() to correctly compute the size of the available memory when
appending CONFIG_CMDLINE to 'early_cmdline'.
Palmer Dabbelt [Thu, 11 Jan 2024 16:02:55 +0000 (08:02 -0800)]
Merge patch series "tools: selftests: riscv: Fix compiler warnings"
Christoph Muellner <christoph.muellner@vrull.eu> says:
From: Christoph Müllner <christoph.muellner@vrull.eu>
When building the RISC-V selftests with a riscv32 compiler I ran into
a couple of compiler warnings. While riscv32 support for these tests is
questionable, the fixes are so trivial that it is probably best to simply
apply them.
Note that the missing-include patch and some format string warnings
are also relevant for riscv64.
* b4-shazam-merge:
tools: selftests: riscv: Fix compile warnings in mm tests
tools: selftests: riscv: Fix compile warnings in vector tests
tools: selftests: riscv: Add missing include for vector test
tools: selftests: riscv: Fix compile warnings in cbo
tools: selftests: riscv: Fix compile warnings in hwprobe
tools: selftests: riscv: Fix compile warnings in mm tests
When building the mm tests with a riscv32 compiler, we see a range
of shift-count-overflow errors from shifting 1UL by more than 32 bits
in do_mmaps(). Since, the relevant code is only called from code that
is gated by `__riscv_xlen == 64`, we can just apply the same gating
to do_mmaps().
tools: selftests: riscv: Fix compile warnings in vector tests
GCC prints a couple of format string warnings when compiling
the vector tests. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.
tools: selftests: riscv: Add missing include for vector test
GCC raises the following warning:
warning: 'status' may be used uninitialized
The warning comes from the fact, that the signature of waitpid() is
unknown and therefore the initialization of GCC cannot be guessed.
Let's add the relevant header to address this warning.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20231123185821.2272504-4-christoph.muellner@vrull.eu Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
tools: selftests: riscv: Fix compile warnings in cbo
GCC prints a couple of format string warnings when compiling
the cbo test. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.
tools: selftests: riscv: Fix compile warnings in hwprobe
GCC prints a couple of format string warnings when compiling
the hwprobe test. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.
Alexandre Ghiti [Mon, 8 Jan 2024 19:36:40 +0000 (20:36 +0100)]
riscv: Add support for BATCHED_UNMAP_TLB_FLUSH
Allow to defer the flushing of the TLB when unmapping pages, which allows
to reduce the numbers of IPI and the number of sfence.vma.
The ubenchmarch used in commit 43b3dfdd0455 ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration") that
was multithreaded to force the usage of IPI shows good performance
improvement on all platforms:
* Unmatched: ~34%
* TH1520 : ~78%
* Qemu : ~81%
In addition, perf on qemu reports an important decrease in time spent
dealing with IPIs:
Before: 68.17% main [kernel.kallsyms] [k] __sbi_rfence_v02_call
After : 8.64% main [kernel.kallsyms] [k] __sbi_rfence_v02_call
* Benchmark:
int stick_this_thread_to_core(int core_id) {
int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
if (core_id < 0 || core_id >= num_cores)
return EINVAL;
Following the examples of cbom-block-size and cboz-block-size,
cbop-block-size is the cache size of Zicbop (cbo.prefetch) operations.
The most common case is to have all cache block sizes to be the same
size (e.g. profiles such as rva22u64 mandates a 64 bytes size for all
cache operations), but there's no specification requirement for that,
and an implementation can have different cache sizes for each operation.
Cc: Rob Herring <robh@kernel.org> Cc: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20231029123500.739409-1-dbarboza@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Wed, 10 Jan 2024 17:54:29 +0000 (09:54 -0800)]
Merge patch series "riscv: errata: thead: use riscv_nonstd_cache_ops for CMO"
Jisheng Zhang <jszhang@kernel.org> says:
Previously, we use alternative mechanism to dynamically patch
the CMO operations for THEAD C906/C910 during boot for performance
reason. But as pointed out by Arnd, "there is already a significant
cost in accessing the invalidated cache lines afterwards, which is
likely going to be much higher than the cost of an indirect branch".
And indeed, there's no performance difference with GMAC and EMMC per
my test on Sipeed Lichee Pi 4A board.
Use riscv_nonstd_cache_ops for THEAD C906/C910 CMO to simplify
the alternative code, and to acchieve Arnd's goal -- "I think
moving the THEAD ops at the same level as all nonstandard operations
makes sense, but I'd still leave CMO as an explicit fast path that
avoids the indirect branch. This seems like the right thing to do both
for readability and for platforms on which the indirect branch has a
noticeable overhead."
To make bisect easy, I use two patches here: patch1 does the conversion
which just mimics current CMO behavior via. riscv_nonstd_cache_ops, I
assume no functionalities changes. patch2 uses T-HEAD PA based CMO
instructions so that we don't need to covert PA to VA.
* b4-shazam-merge:
riscv: errata: thead: use pa based instructions for CMO
riscv: errata: thead: use riscv_nonstd_cache_ops for CMO
The current description implies that only a single address translation
mode is available to the operating system. However, some implementations
support multiple address translation modes, and the operating system is
free to choose between them.
Per the RISC-V privileged specification, Sv48 implementations must also
implement Sv39, and likewise Sv57 implies support for Sv48. This means
it is possible to describe all supported address translation modes using
a single value, by naming the largest supported mode. This appears to
have been the intended usage of the property, so note it explicitly.
Fixes: 4fd669a8c487 ("dt-bindings: riscv: convert cpu binding to json-schema") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231227175739.1453782-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Wed, 10 Jan 2024 15:04:08 +0000 (07:04 -0800)]
Merge patch series "RISC-V SBI debug console extension support"
Anup Patel <apatel@ventanamicro.com> says:
The SBI v2.0 specification is now frozen. The SBI v2.0 specification defines
SBI debug console (DBCN) extension which replaces the legacy SBI v0.1
functions sbi_console_putchar() and sbi_console_getchar().
(Refer v2.0-rc5 at https://github.com/riscv-non-isa/riscv-sbi-doc/releases)
This series adds support for SBI debug console (DBCN) extension in
Linux RISC-V.
To try these patches with KVM RISC-V, use KVMTOOL from the
riscv_zbx_zicntr_smstateen_condops_v1 branch at:
https://github.com/avpatel/kvmtool.git
* b4-shazam-merge:
RISC-V: Enable SBI based earlycon support
tty: Add SBI debug console support to HVC SBI driver
tty/serial: Add RISC-V SBI debug console based earlycon
RISC-V: Add SBI debug console helper routines
RISC-V: Add stubs for sbi_console_putchar/getchar()
Andrew Jones [Wed, 6 Dec 2023 11:08:09 +0000 (12:08 +0100)]
riscv: sbi: Introduce system suspend support
When the SUSP SBI extension is present it implies that the standard
"suspend to RAM" type is available. Wire it up to the generic
platform suspend support, also applying the already present support
for non-retentive CPU suspend. When the kernel is built with
CONFIG_SUSPEND, one can do 'echo mem > /sys/power/state' to suspend.
Resumption will occur when a platform-specific wake-up event arrives.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231206110807.35882-4-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Wed, 10 Jan 2024 14:48:16 +0000 (06:48 -0800)]
Merge patch series "riscv: modules: Fix module loading error handling"
Charlie Jenkins <charlie@rivosinc.com> says:
When modules are loaded while there is not ample allocatable memory,
there was previously not proper error handling. This series fixes a
use-after-free error and a different issue that caused a non graceful
exit after memory was not properly allocated.
* b4-shazam-merge:
riscv: Fix relocation_hashtable size
riscv: Correctly free relocation hashtable on error
riscv: Fix module loading free order
Palmer Dabbelt [Wed, 10 Jan 2024 04:18:23 +0000 (20:18 -0800)]
Merge patch series "riscv: enable EFFICIENT_UNALIGNED_ACCESS and DCACHE_WORD_ACCESS"
Jisheng Zhang <jszhang@kernel.org> says:
Some riscv implementations such as T-HEAD's C906, C908, C910 and C920
support efficient unaligned access, for performance reason we want
to enable HAVE_EFFICIENT_UNALIGNED_ACCESS on these platforms. To
avoid performance regressions on non efficient unaligned access
platforms, HAVE_EFFICIENT_UNALIGNED_ACCESS can't be globally selected.
To solve this problem, runtime code patching based on the detected
speed is a good solution. But that's not easy, it involves lots of
work to modify vairous subsystems such as net, mm, lib and so on.
This can be done step by step.
So let's take an easier solution: add support to efficient unaligned
access and hide the support under NONPORTABLE.
patch1 introduces RISCV_EFFICIENT_UNALIGNED_ACCESS which depends on
NONPORTABLE, if users know during config time that the kernel will be
only run on those efficient unaligned access hw platforms, they can
enable it. Obviously, generic unified kernel Image shouldn't enable it.
patch2 adds support DCACHE_WORD_ACCESS when MMU and
RISCV_EFFICIENT_UNALIGNED_ACCESS.
Below test program and step shows how much performance can be improved:
Jisheng Zhang [Tue, 14 Nov 2023 14:33:37 +0000 (22:33 +0800)]
riscv: errata: thead: use riscv_nonstd_cache_ops for CMO
Previously, we use alternative mechanism to dynamically patch
the CMO operations for THEAD C906/C910 during boot for performance
reason. But as pointed out by Arnd, "there is already a significant
cost in accessing the invalidated cache lines afterwards, which is
likely going to be much higher than the cost of an indirect branch".
And indeed, there's no performance difference with GMAC and EMMC per
my test on Sipeed Lichee Pi 4A board.
Use riscv_nonstd_cache_ops for THEAD C906/C910 CMO to simplify
the alternative code, and to acchieve Arnd's goal -- "I think
moving the THEAD ops at the same level as all nonstandard operations
makes sense, but I'd still leave CMO as an explicit fast path that
avoids the indirect branch. This seems like the right thing to do both
for readability and for platforms on which the indirect branch has a
noticeable overhead."
Anup Patel [Fri, 24 Nov 2023 07:09:01 +0000 (12:39 +0530)]
RISC-V: Add stubs for sbi_console_putchar/getchar()
The functions sbi_console_putchar() and sbi_console_getchar() are
not defined when CONFIG_RISCV_SBI_V01 is disabled so let us add
stub of these functions to avoid "#ifdef" on user side.
Charlie Jenkins [Thu, 4 Jan 2024 19:42:48 +0000 (11:42 -0800)]
riscv: Correctly free relocation hashtable on error
When there is not enough allocatable memory for the relocation
hashtable, module loading should exit gracefully. Previously, this was
attempted to be accomplished by checking if an unsigned number is less
than zero which does not work. Instead have the caller check if the
hashtable was correctly allocated and add a comment explaining that
hashtable_bits that is 0 is valid.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Fixes: d8792a5734b0 ("riscv: Safely remove entries from relocation list") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202312132019.iYGTwW0L-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202312120044.wTI1Uyaa-lkp@intel.com/ Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20240104-module_loading_fix-v3-2-a71f8de6ce0f@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Jisheng Zhang [Mon, 25 Dec 2023 04:42:06 +0000 (12:42 +0800)]
riscv: introduce RISCV_EFFICIENT_UNALIGNED_ACCESS
Some riscv implementations such as T-HEAD's C906, C908, C910 and C920
support efficient unaligned access, for performance reason we want
to enable HAVE_EFFICIENT_UNALIGNED_ACCESS on these platforms. To
avoid performance regressions on other non efficient unaligned access
platforms, HAVE_EFFICIENT_UNALIGNED_ACCESS can't be globally selected.
To solve this problem, runtime code patching based on the detected
speed is a good solution. But that's not easy, it involves lots of
work to modify vairous subsystems such as net, mm, lib and so on.
This can be done step by step.
So let's take an easier solution: add support to efficient unaligned
access and hide the support under NONPORTABLE.
Now let's introduce RISCV_EFFICIENT_UNALIGNED_ACCESS which depends on
NONPORTABLE, if users know during config time that the kernel will be
only run on those efficient unaligned access hw platforms, they can
enable it. Obviously, generic unified kernel Image shouldn't enable it.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20231225044207.3821-2-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Wed, 10 Jan 2024 04:14:51 +0000 (20:14 -0800)]
Merge patch series "riscv: hwprobe: add Zicond, Zacas and Ztso support"
Clément Léger <cleger@rivosinc.com> says:
This series add support for a few more extensions that are present in
the RVA22U64/RVA23U64 (either mandatory or optional) and that are useful
for userspace:
- Zicond
- Zacas
- Ztso
Series currently based on riscv/for-next.
* b4-shazam-lts:
riscv: hwprobe: export Zicond extension
riscv: hwprobe: export Zacas ISA extension
riscv: add ISA extension parsing for Zacas
dt-bindings: riscv: add Zacas ISA extension description
riscv: hwprobe: export Ztso ISA extension
riscv: add ISA extension parsing for Ztso
Palmer Dabbelt [Wed, 10 Jan 2024 03:33:25 +0000 (19:33 -0800)]
Merge patch series "Fix XIP boot and make XIP testable in QEMU"
Frederik Haxel <haxel@fzi.de> says:
XIP boot seems to be broken for some time now. A likely reason why no one
seems to have noticed this is that XIP is more difficult to test, as it is
currently not easily testable with QEMU.
These patches fix the XIP boot and allow an XIP build without BUILTIN_DTB,
which in turn makes it easier to test an image with the QEMU virt machine.
* b4-shazam-merge:
riscv: Allow disabling of BUILTIN_DTB for XIP
riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro
riscv: Make XIP bootable again
Song Shuai [Mon, 11 Dec 2023 11:03:31 +0000 (19:03 +0800)]
riscv: Remove SHADOW_OVERFLOW_STACK_SIZE macro
The commit be97d0db5f44 ("riscv: VMAP_STACK overflow
detection thread-safe") got rid of `shadow_stack`,
so SHADOW_OVERFLOW_STACK_SIZE should be removed too.
Palmer Dabbelt [Wed, 10 Jan 2024 04:10:32 +0000 (20:10 -0800)]
Merge remote-tracking branch 'palmer/fixes' into for-next
I don't usually merge these in, but I missed sending a PR due to the
holidays.
* palmer/fixes:
riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC
riscv: Fix module_alloc() that did not reset the linear mapping permissions
riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping
riscv: Check if the code to patch lies in the exit section
riscv: errata: andes: Probe for IOCP only once in boot stage
riscv: Fix SMP when shadow call stacks are enabled
dt-bindings: perf: riscv,pmu: drop unneeded quotes
riscv: fix misaligned access handling of C.SWSP and C.SDSP
RISC-V: hwprobe: Always use u64 for extension bits
Support rv32 ULEB128 test
riscv: Correct type casting in module loading
riscv: Safely remove entries from relocation list
Ben Dooks [Thu, 23 Nov 2023 14:27:08 +0000 (14:27 +0000)]
riscv; fix __user annotation in save_v_state()
The save_v_state() is technically sending a __user pointer through
__put_user() and thus is generating a sparse warning so force the
value to be "void *" to fix:
arch/riscv/kernel/signal.c:94:16: warning: incorrect type in initializer (different address spaces)
arch/riscv/kernel/signal.c:94:16: expected void *__val
arch/riscv/kernel/signal.c:94:16: got void [noderef] __user *[assigned] datap
Ben Dooks [Thu, 23 Nov 2023 14:16:17 +0000 (14:16 +0000)]
riscv: fix __user annotation in traps_misaligned.c
The instruction reading code can read from either user or kernel addresses
and thus the use of __user on pointers to instructions depends on which
context. Fix a few sparse warnings by using __user for user-accesses and
remove it when not.
Fixes:
arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:322:24: warning: incorrect type in initializer (different address spaces)
arch/riscv/kernel/traps_misaligned.c:322:24: expected unsigned char const [noderef] __user *__gu_ptr
arch/riscv/kernel/traps_misaligned.c:322:24: got unsigned char const [usertype] *addr
arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:332:24: warning: incorrect type in initializer (different address spaces)
arch/riscv/kernel/traps_misaligned.c:332:24: expected unsigned char [noderef] __user *__gu_ptr
arch/riscv/kernel/traps_misaligned.c:332:24: got unsigned char [usertype] *addr
Jisheng Zhang [Thu, 23 Nov 2023 14:22:23 +0000 (22:22 +0800)]
riscv: Select ARCH_WANTS_NO_INSTR
As said in the help of ARCH_WANTS_NO_INSTR entry in arch/Kconfig:
"An architecture should select this if the noinstr macro is being used on
functions to denote that the toolchain should avoid instrumenting such
functions and is required for correctness."
Select ARCH_WANTS_NO_INSTR for correctness.
PS: The reason we didn't find any issue so far is that the
CC_HAS_NO_PROFILE_FN_ATTR is true.
Palmer Dabbelt [Thu, 4 Jan 2024 23:03:09 +0000 (15:03 -0800)]
Merge patch series "riscv: CPU operations cleanup"
Samuel Holland <samuel.holland@sifive.com> says:
This series cleans up some duplicated and dead code around the RISC-V
CPU operations, that was copied from arm64 but is not needed here. The
result is a bit of memory savings and removal of a few SBI calls during
boot, with no functional change.
* b4-shazam-merge:
riscv: Use the same CPU operations for all CPUs
riscv: Remove unused members from struct cpu_operations
riscv: Deduplicate code in setup_smp()
Samuel Holland [Tue, 21 Nov 2023 22:53:18 +0000 (14:53 -0800)]
riscv: Remove obsolete rv32_defconfig file
This file is not used since commit 72f045d19f25 ("riscv: Fixup
difference with defconfig"), where it was replaced by the
32-bit.config fragment. Delete the old file to avoid any confusion.
Palmer Dabbelt [Wed, 3 Jan 2024 12:09:39 +0000 (04:09 -0800)]
Merge patch series "RISC-V: hwprobe: Introduce which-cpus"
Andrew Jones <ajones@ventanamicro.com> says:
This series introduces a flag for the hwprobe syscall which effectively
reverses its behavior from getting the values of keys for a set of cpus
to getting the cpus for a set of key-value pairs.
* b4-shazam-merge:
RISC-V: selftests: Add which-cpus hwprobe test
RISC-V: hwprobe: Introduce which-cpus flag
RISC-V: Move the hwprobe syscall to its own file
RISC-V: hwprobe: Clarify cpus size parameter
Frederik Haxel [Tue, 12 Dec 2023 13:01:14 +0000 (14:01 +0100)]
riscv: Allow disabling of BUILTIN_DTB for XIP
This enables, among other things, testing with the QEMU virt machine.
To build an XIP kernel for the QEMU virt machine, configure the
the kernel as desired and apply the following configuration
```
CONFIG_NONPORTABLE=y
CONFIG_XIP_KERNEL=y
CONFIG_XIP_PHYS_ADDR=0x20000000
CONFIG_PHYS_RAM_BASE=0x80200000
CONFIG_BUILTIN_DTB=n
```
Since the QEMU virt flash memory expects a 32 MB file, the built image
must be padded. For example, with
`truncate -s 32M arch/riscv/boot/xipImage`
The kernel can be started using the following command in QEMU (v8+)
```
qemu-system-riscv64 -M virt,pflash0=pflash0 \
-blockdev node-name=pflash0,driver=file,read-only=on,\
filename=arch/riscv/boot/xipImage <optional parameters>
```
Frederik Haxel [Tue, 12 Dec 2023 13:01:13 +0000 (14:01 +0100)]
riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro
During the refactoring, a bug was introduced in the rarly used
XIP_FIXUP_FLASH_OFFSET macro.
Fixes: bee7fbc38579 ("RISC-V CPU Idle Support") Fixes: e7681beba992 ("RISC-V: Split out the XIP fixups into their own file") Signed-off-by: Frederik Haxel <haxel@fzi.de> Link: https://lore.kernel.org/r/20231212130116.848530-3-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Alexandre Ghiti [Wed, 13 Dec 2023 13:40:27 +0000 (14:40 +0100)]
riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC
When resetting the linear mapping permissions, we must make sure that we
clear the X bit so that do not end up with WX mappings (since we set
PAGE_KERNEL).
Alexandre Ghiti [Wed, 13 Dec 2023 13:40:26 +0000 (14:40 +0100)]
riscv: Fix module_alloc() that did not reset the linear mapping permissions
After unloading a module, we must reset the linear mapping permissions,
see the example below:
Before unloading a module:
0xffffaf809d65d000-0xffffaf809d6dc000 0x000000011d65d000 508K PTE . .. .. D A G . . W R V
0xffffaf809d6dc000-0xffffaf809d6dd000 0x000000011d6dc000 4K PTE . .. .. D A G . . . R V
0xffffaf809d6dd000-0xffffaf809d6e1000 0x000000011d6dd000 16K PTE . .. .. D A G . . W R V
0xffffaf809d6e1000-0xffffaf809d6e7000 0x000000011d6e1000 24K PTE . .. .. D A G . X . R V
After unloading a module:
0xffffaf809d65d000-0xffffaf809d6e1000 0x000000011d65d000 528K PTE . .. .. D A G . . W R V
0xffffaf809d6e1000-0xffffaf809d6e7000 0x000000011d6e1000 24K PTE . .. .. D A G . X W R V
The last mapping is not reset and we end up with WX mappings in the linear
mapping.
So add VM_FLUSH_RESET_PERMS to our module_alloc() definition.
Alexandre Ghiti [Tue, 12 Dec 2023 19:54:00 +0000 (20:54 +0100)]
riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping
lm_alias() can only be used on kernel mappings since it explicitly uses
__pa_symbol(), so simply fix this by checking where the address belongs
to before.
Fixes: 311cd2f6e253 ("riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings") Reported-by: syzbot+afb726d49f84c8d95ee1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-riscv/000000000000620dd0060c02c5e1@google.com/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231212195400.128457-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Samuel Holland [Tue, 21 Nov 2023 23:47:26 +0000 (15:47 -0800)]
riscv: Use the same CPU operations for all CPUs
RISC-V provides no binding (ACPI or DT) to describe per-cpu start/stop
operations, so cpu_set_ops() will always detect the same operations for
every CPU. Replace the cpu_ops array with a single pointer to save space
and reduce boot time.
Andrew Jones [Wed, 22 Nov 2023 16:47:05 +0000 (17:47 +0100)]
RISC-V: selftests: Add which-cpus hwprobe test
Test the RISCV_HWPROBE_WHICH_CPUS flag of hwprobe. The test also
has a command line interface in order to get the cpu list for
arbitrary hwprobe pairs.
Andrew Jones [Wed, 22 Nov 2023 16:47:04 +0000 (17:47 +0100)]
RISC-V: hwprobe: Introduce which-cpus flag
Introduce the first flag for the hwprobe syscall. The flag basically
reverses its behavior, i.e. instead of populating the values of keys
for a given set of cpus, the set of cpus after the call is the result
of finding a set which supports the values of the keys. In order to
do this, we implement a pair compare function which takes the type of
value (a single value vs. a bitmask of booleans) into consideration.
We also implement vdso support for the new flag.
Andrew Jones [Wed, 22 Nov 2023 16:47:02 +0000 (17:47 +0100)]
RISC-V: hwprobe: Clarify cpus size parameter
The "count" parameter associated with the 'cpus' parameter of the
hwprobe syscall is the size in bytes of 'cpus'. Naming it 'cpu_count'
may mislead users (it did me) to think it's the number of CPUs that
are or can be represented by 'cpus' instead. This is particularly
easy (IMO) to get wrong since 'cpus' is documented to be defined by
CPU_SET(3) and CPU_SET(3) also documents a CPU_COUNT() (the number
of CPUs in set) macro. CPU_SET(3) refers to the size of cpu sets
with 'setsize'. Adopt 'cpusetsize' for the hwprobe parameter and
specifically state it is in bytes in Documentation/riscv/hwprobe.rst
to clarify.
Palmer Dabbelt [Fri, 10 Nov 2023 17:59:03 +0000 (09:59 -0800)]
RISC-V: Remove the removed single-letter extensions
There were a few single-letter extensions that we had references to
floating around in the kernel, but that never ended up as actual ISA
specs and have mostly been replaced by multi-letter extensions. This
removes the references to those extensions.
Palmer Dabbelt [Wed, 20 Dec 2023 18:48:17 +0000 (10:48 -0800)]
Merge patch series "riscv: Use READ_ONCE()/WRITE_ONCE() for pte accesses"
Alexandre Ghiti <alexghiti@rivosinc.com> says:
This series is a follow-up for riscv of a recent series from Ryan [1] which
converts all direct dereferences of pte_t into a ptet_get() access.
The goal here for riscv is to use READ_ONCE()/WRITE_ONCE() for all page
table entries accesses to avoid any compiler transformation when the
hardware can concurrently modify the page tables entries (A/D bits for
example).
I went a bit further and added pud/p4d/pgd_get() helpers as such concurrent
modifications can happen too at those levels.
* b4-shazam-merge:
riscv: Use accessors to page table entries instead of direct dereference
riscv: mm: Only compile pgtable.c if MMU
mm: Introduce pudp/p4dp/pgdp_get() functions
riscv: Use WRITE_ONCE() when setting page table entries
Alexandre Ghiti [Wed, 13 Dec 2023 20:30:01 +0000 (21:30 +0100)]
riscv: Use accessors to page table entries instead of direct dereference
As very well explained in commit 20a004e7b017 ("arm64: mm: Use
READ_ONCE/WRITE_ONCE when accessing page tables"), an architecture whose
page table walker can modify the PTE in parallel must use
READ_ONCE()/WRITE_ONCE() macro to avoid any compiler transformation.
So apply that to riscv which is such architecture.
Alexandre Ghiti [Wed, 13 Dec 2023 20:29:59 +0000 (21:29 +0100)]
mm: Introduce pudp/p4dp/pgdp_get() functions
Instead of directly dereferencing page tables entries, which can cause
issues (see commit 20a004e7b017 ("arm64: mm: Use READ_ONCE/WRITE_ONCE when
accessing page tables"), let's introduce new functions to get the
pud/p4d/pgd entries (the pte and pmd versions already exist).
Note that arm pgd_t is actually an array so pgdp_get() is defined as a
macro to avoid a build error.
Those new functions will be used in subsequent commits by the riscv
architecture.
Alexandre Ghiti [Wed, 13 Dec 2023 20:29:58 +0000 (21:29 +0100)]
riscv: Use WRITE_ONCE() when setting page table entries
To avoid any compiler "weirdness" when accessing page table entries which
are concurrently modified by the HW, let's use WRITE_ONCE() macro
(commit 20a004e7b017 ("arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing
page tables") gives a great explanation with more details).
Palmer Dabbelt [Thu, 7 Dec 2023 15:33:36 +0000 (07:33 -0800)]
Merge patch series "riscv: report more ISA extensions through hwprobe"
Clément Léger <cleger@rivosinc.com> says:
In order to be able to gather more information about the supported ISA
extensions from userspace using the hwprobe syscall, add more ISA
extensions report. This series adds the following ISA extensions parsing
support:
Some of these extensions are actually shorthands for other "sub"
extensions. This series includes a patch from Conor/Evan that adds a way
to specify such "bundled" extensions. When exposing these bundled
extensions to userspace through hwprobe, only the "sub" extensions are
exposed.
In order to test it, one can use qemu and the small hwprobe utility
provided[1]. Run qemu by specifying additional ISA extensions, for
instance:
$ qemu-system-riscv64 -cpu rv64,v=true,zk=true,zvksh=true,zvkned=true
<whatever options you want>
* b4-shazam-merge:
dt-bindings: riscv: add Zfa ISA extension description
riscv: hwprobe: export Zfa ISA extension
riscv: add ISA extension parsing for Zfa
dt-bindings: riscv: add Zvfh[min] ISA extension description
riscv: hwprobe: export Zvfh[min] ISA extensions
riscv: add ISA extension parsing for Zvfh[min]
dt-bindings: riscv: add Zihintntl ISA extension description
riscv: hwprobe: export Zhintntl ISA extension
riscv: add ISA extension parsing for Zihintntl
dt-bindings: riscv: add Zfh[min] ISA extensions description
riscv: hwprobe: export Zfh[min] ISA extensions
riscv: add ISA extension parsing for Zfh/Zfh[min]
dt-bindings: riscv: add vector crypto ISA extensions description
riscv: hwprobe: export vector crypto ISA extensions
riscv: add ISA extension parsing for vector crypto
dt-bindings: riscv: add scalar crypto ISA extensions description
riscv: hwprobe: add support for scalar crypto ISA extensions
riscv: add ISA extension parsing for scalar crypto
riscv: hwprobe: export missing Zbc ISA extension
riscv: add ISA extension parsing for Zbc
Clément Léger [Tue, 14 Nov 2023 14:12:51 +0000 (09:12 -0500)]
riscv: add ISA extension parsing for Zvfh[min]
Add parsing for Zvfh[min] ISA extension[1] which were ratified in
june 2023 around commit e2ccd0548d6c ("Remove draft warnings from
Zvfh[min]") in riscv-v-spec[2].
Clément Léger [Tue, 14 Nov 2023 14:12:43 +0000 (09:12 -0500)]
riscv: hwprobe: export vector crypto ISA extensions
Export Zv* vector crypto ISA extensions that were added in "RISC-V
Cryptography Extensions Volume II" specification[1] through hwprobe.
This adds support for the following instructions:
Clément Léger [Tue, 14 Nov 2023 14:12:42 +0000 (09:12 -0500)]
riscv: add ISA extension parsing for vector crypto
Add parsing of some Zv* vector crypto ISA extensions that are mentioned
in "RISC-V Cryptography Extensions Volume II" [1]. These ISA extensions
are the following:
- Zvbb: Vector Basic Bit-manipulation
- Zvbc: Vector Carryless Multiplication
- Zvkb: Vector Cryptography Bit-manipulation
- Zvkg: Vector GCM/GMAC.
- Zvkned: NIST Suite: Vector AES Block Cipher
- Zvknh[ab]: NIST Suite: Vector SHA-2 Secure Hash
- Zvksed: ShangMi Suite: SM4 Block Cipher
- Zvksh: ShangMi Suite: SM3 Secure Hash
- Zvkn: NIST Algorithm Suite
- Zvknc: NIST Algorithm Suite with carryless multiply
- Zvkng: NIST Algorithm Suite with GCM.
- Zvks: ShangMi Algorithm Suite
- Zvksc: ShangMi Algorithm Suite with carryless multiplication
- Zvksg: ShangMi Algorithm Suite with GCM.
- Zvkt: Vector Data-Independent Execution Latency.