ghidra/Ghidra/Processors
Nicolas Iooss 24d19f6e8c Add eBPF ISA v4 instructions
In 2023, the eBPF instruction set gained several instructions for
signed operations (sign-extending loads, signed division, etc.), a jump
instruction with a 32-bit offset and some byte-swap instructions. This
became version 4 of the eBPF ISA.

Here are some references about this change:

- https://pchaigno.github.io/bpf/2021/10/20/ebpf-instruction-sets.html
  (a blog post about eBPF instruction set extensions)
- https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/
  (documentation sent to the Linux kernel mailing list)
- https://www.rfc-editor.org/rfc/rfc9669.html#name-sign-extension-load-operati
  (the IETF BPF Instruction Set Architecture standard, which defines
  the new instructions)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n1859
  (implementation of signed division and remainder in the Linux kernel.
  It shows that 32-bit signed DIV and signed MOD zero-extend the result
  into DST)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n2135
  (implementation of signed memory load in the Linux kernel)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f9a1ea821ff25353a0e80d971e7958cd55b47a3
  (commit which added signed memory load instructions to the Linux kernel)
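
To illustrate the zero-extension noted above for 32-bit signed DIV and
MOD, here is a minimal C sketch. The function names sdiv32 and smod32
are hypothetical. The sketch leaves out a zero divisor and the
INT32_MIN / -1 case, which need separate handling.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of 32-bit signed DIV: divide the low 32 bits as signed values,
 * then zero-extend the 32-bit result into the 64-bit register.
 * The unsigned-to-signed casts assume two's complement wrap-around,
 * which mainstream compilers provide. */
uint64_t sdiv32(uint64_t dst, uint64_t src)
{
    int32_t a = (int32_t)(uint32_t)dst;
    int32_t b = (int32_t)(uint32_t)src;

    /* Edge cases are deliberately outside this sketch. */
    assert(b != 0 && !(a == INT32_MIN && b == -1));
    return (uint64_t)(uint32_t)(a / b);
}

/* Sketch of 32-bit signed MOD, with the same zero-extension. */
uint64_t smod32(uint64_t dst, uint64_t src)
{
    int32_t a = (int32_t)(uint32_t)dst;
    int32_t b = (int32_t)(uint32_t)src;

    assert(b != 0 && !(a == INT32_MIN && b == -1));
    return (uint64_t)(uint32_t)(a % b);
}
```

With dst holding -7 and src holding 2, the result is 0x00000000fffffffd
(-3 in the low 32 bits, upper 32 bits cleared), not 0xfffffffffffffffd.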

This can be tested with a recent enough version of clang and LLVM
(clang 19.1.4 on Alpine 3.21 works).
For example, for signed memory load instructions:

    signed int sext_8bit(signed char x) {
        return x;
    }

produces:

    $ clang -O0 -target bpf -mcpu=v4 -c test.c -o test.ebpf
    $ llvm-objdump -rd test.ebpf
    ...
    0000000000000000 <sext_8bit>:
           0:  73 1a ff ff 00 00 00 00  *(u8 *)(r10 - 0x1) = r1
           1:  91 a1 ff ff 00 00 00 00  r1 = *(s8 *)(r10 - 0x1)
           2:  bc 10 00 00 00 00 00 00  w0 = w1
           3:  95 00 00 00 00 00 00 00  exit

(The second instruction, at index 1, is the signed memory load.)
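
As a rough illustration of how the bytes `91 a1 ff ff 00 00 00 00` map
to `r1 = *(s8 *)(r10 - 0x1)`, the following C sketch splits an 8-byte
instruction into its fields and models the signed byte load. The names
insn_fields, split_insn and load_s8 are hypothetical, and the sketch
assumes the little-endian instruction encoding shown in the listing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fields of one 8-byte eBPF instruction (little-endian encoding). */
struct insn_fields {
    uint8_t opcode; /* 0x91 = LDX (0x01) | B (0x10) | MEMSX (0x80) */
    uint8_t dst;    /* low nibble of byte 1 */
    uint8_t src;    /* high nibble of byte 1 */
    int16_t off;    /* signed 16-bit offset */
    int32_t imm;    /* signed 32-bit immediate */
};

struct insn_fields split_insn(const uint8_t b[8])
{
    struct insn_fields f;

    assert(b != NULL);
    f.opcode = b[0];
    f.dst = b[1] & 0x0f;
    f.src = b[1] >> 4;
    f.off = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
    f.imm = (int32_t)((uint32_t)b[4] | (uint32_t)b[5] << 8 |
                      (uint32_t)b[6] << 16 | (uint32_t)b[7] << 24);
    return f;
}

/* Signed byte load: read one byte at base + off and sign-extend it
 * to 64 bits. */
uint64_t load_s8(const uint8_t *base, int16_t off)
{
    int8_t v;

    memcpy(&v, base + off, 1);
    return (uint64_t)(int64_t)v;
}
```

For the bytes above, split_insn yields dst = 1, src = 10 and off = -1,
and a stored byte 0x80 is loaded as 0xffffffffffffff80.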

The MOVS instruction (sign-extending register MOV) uses the offset field
to encode the conversion (whether the source register is treated as a
signed 8-bit, 16-bit or 32-bit integer). There is no consensus on the
mnemonic for these instructions:

- They are all named MOVS in the proposal
  https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/
- LLVM and Linux disassemblers only display pseudo-code (`r0 = (s8)r1`)
- RFC 9669 (https://datatracker.ietf.org/doc/rfc9669/) uses MOVSX for
  all instructions.
- GCC uses MOVS for all instructions:
  https://github.com/gcc-mirror/gcc/blob/releases/gcc-14.1.0/gcc/config/bpf/bpf.md?plain=1#L326-L365

To make the disassembled code clearer, decode such instructions with a
size suffix: MOVSB, MOVSH, MOVSW.
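
The naming scheme can be sketched in C as follows. This is only an
illustration, not the SLEIGH code of this commit: movs_mnemonic and
movs64 are hypothetical names, and the offset values 8, 16 and 32 are
the ones RFC 9669 lists for the sign-extending MOV. Only the 64-bit
form is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map the offset field of a sign-extending MOV to the suffixed
 * mnemonic; returns NULL for any other offset. */
const char *movs_mnemonic(int16_t off)
{
    switch (off) {
    case 8:  return "MOVSB";
    case 16: return "MOVSH";
    case 32: return "MOVSW";
    default: return NULL;
    }
}

/* 64-bit form: sign-extend the low `off` bits of src to 64 bits. */
uint64_t movs64(uint64_t src, int16_t off)
{
    assert(off == 8 || off == 16 || off == 32);
    switch (off) {
    case 8:  return (uint64_t)(int64_t)(int8_t)(uint8_t)src;
    case 16: return (uint64_t)(int64_t)(int16_t)(uint16_t)src;
    default: return (uint64_t)(int64_t)(int32_t)(uint32_t)src;
    }
}
```

For example, movs64(0x80, 8) gives 0xffffffffffffff80, which matches the
pseudo-code `r0 = (s8)r1` printed by the LLVM disassembler.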

Decoding the 32-bit JA, BSWAP16, BSWAP32 and BSWAP64 instructions is
straightforward.
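
For reference, the behaviour of these instructions can be sketched in C.
The names bswap_width and ja32_target are hypothetical. The sketch
assumes that the 16-bit and 32-bit swaps clear the upper bits of the
register, as the Linux interpreter does, and that the 32-bit JA takes
its displacement from the immediate field, counted in instructions
relative to the next instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-swap the low `width` bits of v (width is 16, 32 or 64).
 * Bits above `width` are dropped, so the result is zero-extended. */
uint64_t bswap_width(uint64_t v, int width)
{
    uint64_t out = 0;

    assert(width == 16 || width == 32 || width == 64);
    for (int i = 0; i < width / 8; i++) {
        out = (out << 8) | (v & 0xff); /* lowest byte becomes highest */
        v >>= 8;
    }
    return out;
}

/* Target of a 32-bit JA, as an instruction index: the displacement is
 * the signed 32-bit immediate instead of the 16-bit offset field. */
int64_t ja32_target(int64_t pc, int32_t imm)
{
    return pc + 1 + (int64_t)imm;
}
```

With these definitions, bswap_width(0x1234, 16) is 0x3412, and
ja32_target(0, 70000) is 70001, a distance that does not fit in the
16-bit offset of the original JA.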
2025-07-29 12:45:06 +00:00
6502 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
8048 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
8051 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
8085 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
68000 GP-5804 Set SymbolicPropogator to record register begin/end state in 2025-07-03 17:49:53 +00:00
AARCH64 Merge remote-tracking branch 'origin/GP-5804_emteere_FixDefaultSymbolicPropRecordState' into patch 2025-07-18 06:15:13 -04:00
ARM Merge remote-tracking branch 'origin/GP-5804_emteere_FixDefaultSymbolicPropRecordState' into patch 2025-07-18 06:15:13 -04:00
Atmel Many typo's 2025-04-19 18:06:41 +02:00
BPF GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
CP1600 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
CR16 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
Dalvik GP-0: Certify 2024-11-26 08:54:23 -05:00
DATA GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
eBPF Add eBPF ISA v4 instructions 2025-07-29 12:45:06 +00:00
HCS08 Merge remote-tracking branch 2025-07-29 08:16:11 -04:00
HCS12 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
JVM GP-5742 Cleanup preferred CommentType enum use. Changed SARIF data component comment JSON serialization from int to String. 2025-06-06 17:58:07 -04:00
Loongarch GP-5051: Distinct qemu-system launcher. 2024-12-04 08:43:26 -05:00
M8C GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
M16C GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
MC6800 Merge remote-tracking branch 2025-07-29 08:16:11 -04:00
MCS96 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
MIPS GP-5843 Added MIPS64 function start patterns 2025-07-17 22:42:00 +00:00
PA-RISC GP-0 Corrected build.gradle for PA-RISC to allow pcode test execution 2025-06-09 18:53:22 -04:00
PIC Merge branch 'GP-0_ryanmkurtz_PR-6204_antoniovazquezblanco_pic' 2025-03-06 11:32:36 -05:00
PowerPC Merge remote-tracking branch 'origin/GP-5846_ghidra1_PPC64_ELFRelocations' into patch 2025-07-18 15:17:45 -04:00
RISCV Fix RISC-V minu and max instructions' definitions (Closes #8215) 2025-06-11 11:47:49 -04:00
Sparc GP-5051: Distinct qemu-system launcher. 2024-12-04 08:43:26 -05:00
SuperH GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
SuperH4 Merge remote-tracking branch 2025-05-21 16:14:07 -04:00
TI_MSP430 GP-5189 Add range attributes to VarargsFilter 2024-12-10 16:39:22 +00:00
Toy GP-5877: Fix Patch Instruction action in some Harvard architectures. 2025-07-28 15:48:40 +00:00
tricore Merge remote-tracking branch 'origin/patch' 2025-03-04 13:06:23 -05:00
V850 GP-5078: Improvements to Ghidra Module directory layout 2024-10-31 10:34:26 -04:00
x86 GP-5725: Corrected operands for several AVX512 instructions 2025-06-10 09:21:39 -04:00
Xtensa GP-5051: Distinct qemu-system launcher. 2024-12-04 08:43:26 -05:00
Z80 GP-5659: Fixed z80 sub instruction semantics 2025-05-13 14:24:39 +00:00