mirror of https://github.com/NationalSecurityAgency/ghidra.git synced 2025-10-03 01:39:21 +02:00

Nicolas Iooss 24d19f6e8c Add eBPF ISA v4 instructions

In 2023, the eBPF instruction set was modified to add several instructions related to signed operations (load with sign-extension, signed division, etc.), a 32-bit jump instruction and some byte-swap instructions. This became version 4 of eBPF ISA.

Here are some references about this change:

- https://pchaigno.github.io/bpf/2021/10/20/ebpf-instruction-sets.html
  (a blog post about eBPF instruction set extensions)
- https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/
  (documentation sent to Linux Kernel mailing list)
- https://www.rfc-editor.org/rfc/rfc9669.html#name-sign-extension-load-operati
  (IETF's BPF Instruction Set Architecture standard defined the new instructions)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n1859
  (implementation of signed division and remainder in Linux kernel. This shows that 32-bit signed DIV and signed MOD are zero-extending the result in DST)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n2135
  (implementation of signed memory load in Linux kernel)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f9a1ea821ff25353a0e80d971e7958cd55b47a3
  (commit which added signed memory load instructions in Linux kernel)

This can be tested with a recent enough version of clang and LLVM (this works with clang 19.1.4 on Alpine 3.21). For example, for signed memory load instructions:

    signed int sext_8bit(signed char x) {
        return x;
    }

produces:

    $ clang -O0 -target bpf -mcpu=v4 -c test.c -o test.ebpf
    $ llvm-objdump -rd test.ebpf
    ...
    0000000000000000 <sext_8bit>:
           0: 73 1a ff ff 00 00 00 00 *(u8 *)(r10 - 0x1) = r1
           1: 91 a1 ff ff 00 00 00 00 r1 = *(s8 *)(r10 - 0x1)
           2: bc 10 00 00 00 00 00 00 w0 = w1
           3: 95 00 00 00 00 00 00 00 exit

(The second instruction is a signed memory load)

Instruction MOVS (Sign extend register MOV) uses offset to encode the conversion (whether the source register is to be considered as signed 8-bit, 16-bit or 32-bit integer). The mnemonic for these instructions is quite unclear:

- They are all named MOVS in the proposal https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/
- LLVM and Linux disassemblers only display pseudo-code (`r0 = (s8)r1`)
- RFC 9669 (https://datatracker.ietf.org/doc/rfc9669/) uses MOVSX for all instructions.
- GCC uses MOVS for all instructions: https://github.com/gcc-mirror/gcc/blob/releases/gcc-14.1.0/gcc/config/bpf/bpf.md?plain=1#L326-L365

To make the disassembled code clearer, decode such instructions with a size suffix: MOVSB, MOVSH, MOVSW.

The decoding of instructions 32-bit JA, BSWAP16, BSWAP32 and BSWAP64 is straightforward.

2025-07-29 12:45:06 +00:00
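The MOVS paragraph in the commit message above says the offset field selects the source width. A minimal sketch of that selection in C follows. It is not taken from the Ghidra SLEIGH specification: the function name `movs_mnemonic` is hypothetical, and it assumes the little-endian 8-byte instruction layout shown in the `llvm-objdump` listing (opcode byte, register byte, 16-bit offset, 32-bit immediate) together with the RFC 9669 opcode values `0xbf` (64-bit register MOV) and `0xbc` (32-bit register MOV).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Ghidra): choose a mnemonic for a
 * register-to-register MOV encoded as 8 little-endian bytes.
 * Returns NULL when the bytes are not a register MOV or when the
 * offset field holds a value this sketch does not recognise. */
const char *movs_mnemonic(const uint8_t insn[8])
{
    uint8_t opcode = insn[0];
    /* Bytes 2 and 3 hold the signed 16-bit offset field. */
    int16_t offset = (int16_t)(uint16_t)(insn[2] | (insn[3] << 8));

    if (opcode != 0xbf && opcode != 0xbc)
        return NULL;                 /* not a register MOV */

    switch (offset) {
    case 0:
        return "MOV";                /* plain move, no sign extension */
    case 8:
        return "MOVSB";              /* source treated as signed 8-bit */
    case 16:
        return "MOVSH";              /* source treated as signed 16-bit */
    case 32:
        /* Assumption from RFC 9669: a 32-bit source is only defined
         * for the 64-bit form of the instruction. */
        return opcode == 0xbf ? "MOVSW" : NULL;
    default:
        return NULL;
    }
}
```

With the bytes `bc 10 00 00 00 00 00 00` from the listing (`w0 = w1`) the offset field is 0, so the sketch reports a plain `MOV`. Setting byte 2 to `0x08` on a `0xbf` instruction would give `MOVSB`.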
data/languages	Add eBPF ISA v4 instructions	2025-07-29 12:45:06 +00:00
src/main/java/ghidra/app	Fix eBPF CALL operand decoding	2025-07-07 16:26:31 +02:00
build.gradle	GP-2257 minor refactoring to collapse constructors, added sleigh lint	2023-04-29 21:56:45 +00:00
certification.manifest	Add support for big endian eBPF programs	2025-07-07 16:13:37 +02:00
Module.manifest	eBPF processor support	2023-04-10 00:54:28 +03:00
README.md	GP-5078: Improvements to Ghidra Module directory layout	2024-10-31 10:34:26 -04:00

README.md

eBPF