]> git.dujemihanovic.xyz Git - linux.git/commit
arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrs
authorPuranjay Mohan <puranjay12@gmail.com>
Thu, 2 May 2024 15:18:53 +0000 (15:18 +0000)
committerAlexei Starovoitov <ast@kernel.org>
Sun, 12 May 2024 23:54:34 +0000 (16:54 -0700)
commit7a4c32222b0e14349a6311e72bf6ebd3e1d1064b
tree9ddaf85ad93083fce7617d27bdae282c52eb2e16
parent2ddec2c80b4402c293c7e6e0881cecaaf77e8cec
arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrs

Support an instruction for resolving absolute addresses of per-CPU
data from their per-CPU offsets. This instruction is internal-only and
users are not allowed to use them directly. They will only be used for
internal inlining optimizations for now between BPF verifier and BPF
JITs.

Since commit 7158627686f0 ("arm64: percpu: implement optimised pcpu
access using tpidr_el1"), the per-cpu offset for the CPU is stored in
the tpidr_el1/2 register of that CPU.

To support this BPF instruction in the ARM64 JIT, the following ARM64
instructions are emitted:

mov dst, src // Move src to dst, if src != dst
mrs tmp, tpidr_el1/2 // Move per-cpu offset of the current cpu in tmp.
add dst, dst, tmp // Add the per cpu offset to the dst.

To measure the performance improvement provided by this change, the
benchmark in [1] was used:

Before:
glob-arr-inc   :   23.597 ± 0.012M/s
arr-inc        :   23.173 ± 0.019M/s
hash-inc       :   12.186 ± 0.028M/s

After:
glob-arr-inc   :   23.819 ± 0.034M/s
arr-inc        :   23.285 ± 0.017M/s
hash-inc       :   12.419 ± 0.011M/s

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-4-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
arch/arm64/include/asm/insn.h
arch/arm64/lib/insn.c
arch/arm64/net/bpf_jit.h
arch/arm64/net/bpf_jit_comp.c