mirror of
https://github.com/NationalSecurityAgency/ghidra.git
synced 2025-10-03 17:59:46 +02:00
GP-5018: Some updated PyGhidra docs
parent
7fbf64ea70
commit
66a43cd6ed
2 changed files with 223 additions and 87 deletions
|
@ -1 +1,11 @@
|
||||||
# PyGhidra
|
# PyGhidra
|
||||||
|
|
||||||
|
This module provides the following capabilities:
|
||||||
|
* The [PyGhidra Python library](src/main/py/README.md) and its dependencies.
|
||||||
|
* A [Plugin](src/main/java/ghidra/pyghidra/PyGhidraPlugin.java) that provides a CPython interpreter.
|
||||||
|
* A [ScriptProvider](src/main/java/ghidra/pyghidra/PyGhidraScriptProvider.java) capable of running
|
||||||
|
GhidraScripts written in native CPython 3.
|
||||||
|
* An [interactive Python script](support/pyghidra_launcher.py) that Ghidra uses to install
|
||||||
|
and launch PyGhidra. This script handles
|
||||||
|
[virtual environments](https://docs.python.org/3/tutorial/venv.html) and
|
||||||
|
[externally managed environments](https://packaging.python.org/en/latest/specifications/externally-managed-environments/).
|
|
@ -1,22 +1,59 @@
|
||||||
# PyGhidra
|
# PyGhidra
|
||||||
|
|
||||||
PyGhidra is a Python library that provides direct access to the Ghidra API within a native CPython interpreter using [jpype](https://jpype.readthedocs.io/en/latest). As well, PyGhidra contains some conveniences for setting up analysis on a given sample and running a Ghidra script locally. It also contains a Ghidra plugin to allow the use of CPython from the Ghidra user interface.
|
The PyGhidra Python library, originally developed by the
|
||||||
|
[Department of Defense Cyber Crime Center (DC3)](https://www.dc3.mil) under the name "Pyhidra", is a
|
||||||
|
Python library that provides direct access to the Ghidra API within a native CPython 3 interpreter
|
||||||
|
using [JPype](https://jpype.readthedocs.io/en/latest). PyGhidra contains some conveniences for
|
||||||
|
setting up analysis on a given sample and running a Ghidra script locally. It also contains a Ghidra
|
||||||
|
plugin to allow the use of CPython 3 from the Ghidra GUI.
|
||||||
|
|
||||||
PyGhidra was initially developed for use with Dragodis and is designed to be installable without requiring Java or Ghidra. This allows other Python projects
|
## Installation and Setup
|
||||||
to have PyGhidra as a dependency and provide optional Ghidra functionality without requiring all users to install Java and Ghidra. It is recommended that users set the `GHIDRA_INSTALL_DIR` environment variable to simplify locating Ghidra.
|
Ghidra provides an out-of-the-box integration with the PyGhidra Python library which makes
|
||||||
|
installation and usage fairly straightforward. This enables the Ghidra GUI and headless Ghidra to run
|
||||||
|
GhidraScripts written in native CPython 3, as well as interact with the Ghidra GUI through a
|
||||||
|
built-in REPL. To launch Ghidra in PyGhidra-mode, see Ghidra's latest
|
||||||
|
[Installation Guide](https://github.com/NationalSecurityAgency/ghidra/blob/master/GhidraDocs/InstallationGuide.md#pyghidra-mode).
|
||||||
|
|
||||||
|
It is also possible (and encouraged!) to use PyGhidra as a standalone Python library
|
||||||
|
in reverse engineering workflows where Ghidra may be one of many components involved. The following
|
||||||
|
instructions in this document focus on this type of usage.
|
||||||
|
|
||||||
## Usage
|
To install the PyGhidra Python library:
|
||||||
|
1. Download and install
|
||||||
|
[Ghidra 11.3 or later](https://github.com/NationalSecurityAgency/ghidra/releases) to a desired
|
||||||
|
location.
|
||||||
|
2. Set the `GHIDRA_INSTALL_DIR` environment variable to point to the directory where Ghidra is
|
||||||
|
installed.
|
||||||
|
3. Install PyGhidra:
|
||||||
|
* Online: `pip install pyghidra`
|
||||||
|
* Offline: `python3 -m pip install --no-index -f
|
||||||
|
<GhidraInstallDir>/Ghidra/Features/PyGhidra/pypkg/dist pyghidra`
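
The offline install step can be scripted. The following is a minimal sketch, not part of PyGhidra: the helper name `offline_install_command` is hypothetical, and it only assembles the `pip` command from the `GHIDRA_INSTALL_DIR` variable and the `pypkg/dist` path quoted in the steps above. Running the command is left to the caller.

```python
import os
import sys
from pathlib import Path


def offline_install_command(env=None):
    """Build the offline pip command for PyGhidra (hypothetical helper)."""
    env = os.environ if env is None else env
    install_dir = env.get("GHIDRA_INSTALL_DIR")
    if not install_dir:
        raise RuntimeError("GHIDRA_INSTALL_DIR is not set")
    # Path taken from the offline install instructions above.
    dist_dir = Path(install_dir) / "Ghidra" / "Features" / "PyGhidra" / "pypkg" / "dist"
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index", "-f", str(dist_dir), "pyghidra",
    ]
```

The result can be passed to `subprocess.run(offline_install_command(), check=True)` from the Python environment that should receive the package.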
|
||||||
|
|
||||||
|
## API
|
||||||
|
The current version of PyGhidra inherits an API from the original "Pyhidra" project that provides an
|
||||||
|
excellent starting point for interacting with a Ghidra installation. __NOTE:__ These functions are
|
||||||
|
subject to change in the future as more thought and feedback is collected on PyGhidra's role in the
|
||||||
|
greater Ghidra ecosystem:
|
||||||
|
|
||||||
### Raw Connection
|
### pyghidra.start()
|
||||||
|
To get a raw connection to Ghidra, use the `start()` function. This will set up a JPype connection and
|
||||||
|
initialize Ghidra in headless mode, which will allow you to directly import `ghidra` and `java`.
|
||||||
|
|
||||||
To get a raw connection to Ghidra use the `start()` function.
|
__NOTE:__ No projects or programs get set up in this mode.
|
||||||
This will setup a Jpype connection and initialize Ghidra in headless mode,
|
|
||||||
which will allow you to directly import `ghidra` and `java`.
|
|
||||||
|
|
||||||
*NOTE: No projects or programs get setup in this mode.*
|
```python
|
||||||
|
def start(verbose=False, *, install_dir: Path = None) -> "PyGhidraLauncher":
|
||||||
|
"""
|
||||||
|
Starts the JVM and fully initializes Ghidra in Headless mode.
|
||||||
|
|
||||||
|
:param verbose: Enable verbose output during JVM startup (Defaults to False)
|
||||||
|
:param install_dir: The path to the Ghidra installation directory.
|
||||||
|
(Defaults to the GHIDRA_INSTALL_DIR environment variable)
|
||||||
|
:return: The PyGhidraLauncher used to start the JVM
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Example:
|
||||||
```python
|
```python
|
||||||
import pyghidra
|
import pyghidra
|
||||||
pyghidra.start()
|
pyghidra.start()
|
||||||
|
@ -30,78 +67,63 @@ from java.lang import String
|
||||||
# do things
|
# do things
|
||||||
```
|
```
|
||||||
|
|
||||||
### Customizing Java and Ghidra initialization
|
### pyghidra.started()
|
||||||
|
To check whether PyGhidra has been started, use the `started()` function.
|
||||||
JVM configuration for the classpath and vmargs may be done through a `PyGhidraLauncher`.
|
|
||||||
|
|
||||||
```python
|
```python
|
||||||
from pyghidra.launcher import HeadlessPyGhidraLauncher
|
def started() -> bool:
|
||||||
|
"""
|
||||||
launcher = HeadlessPyGhidraLauncher()
|
Whether the PyGhidraLauncher has already started.
|
||||||
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
|
"""
|
||||||
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
|
|
||||||
launcher.start()
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Registering an Entry Point
|
#### Example:
|
||||||
|
|
||||||
The `PyGhidraLauncher` can also be configured through the use of a registered entry point on your own python project.
|
|
||||||
This is useful for installing your own Ghidra plugin which uses PyGhidra and self-compiles.
|
|
||||||
|
|
||||||
First create an [entry_point](https://setuptools.pypa.io/en/latest/userguide/entry_point.html) for `pyghidra.setup`
|
|
||||||
pointing to a single argument function which accepts the launcher instance.
|
|
||||||
|
|
||||||
```python
|
```python
|
||||||
# setup.py
|
|
||||||
from setuptools import setup
|
|
||||||
|
|
||||||
setup(
|
|
||||||
# ...,
|
|
||||||
entry_points={
|
|
||||||
'pyghidra.setup': [
|
|
||||||
'acme_plugin = acme.ghidra_plugin.install:setup',
|
|
||||||
]
|
|
||||||
}
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
|
|
||||||
Then we create the target function.
|
|
||||||
This function will be called every time a user starts a PyGhidra launcher.
|
|
||||||
In the same fashion, another entry point `pyghidra.pre_launch` may be registered and will be called after Ghidra and all
|
|
||||||
plugins have been loaded.
|
|
||||||
|
|
||||||
```python
|
|
||||||
# acme/ghidra_plugin/install.py
|
|
||||||
from pathlib import Path
|
|
||||||
import pyghidra
|
import pyghidra
|
||||||
|
|
||||||
def setup(launcher):
|
if pyghidra.started():
|
||||||
"""
|
...
|
||||||
Run by PyGhidra launcher to install our plugin.
|
|
||||||
"""
|
|
||||||
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
|
|
||||||
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
|
|
||||||
|
|
||||||
# Install our plugin.
|
|
||||||
source_path = Path(__file__).parent / "java" / "plugin" # path to uncompiled .java code
|
|
||||||
details = pyghidra.ExtensionDetails(
|
|
||||||
name="acme_plugin",
|
|
||||||
description="My Cool Plugin",
|
|
||||||
author="acme",
|
|
||||||
plugin_version="1.2",
|
|
||||||
)
|
|
||||||
launcher.install_plugin(source_path, details) # install plugin (if not already)
|
|
||||||
```
|
```
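
One possible use of `started()` is to make initialization safe to call more than once. The sketch below assumes only the `started()` and `start()` functions documented above. The helper name `ensure_started` is hypothetical, and it takes the module as an argument so it can be tried with a stand-in object when Ghidra is not installed.

```python
def ensure_started(api, **start_kwargs):
    """Start Ghidra through `api` unless it is already running.

    `api` is expected to be the imported `pyghidra` module. This is a
    hypothetical convenience wrapper, not part of PyGhidra.
    Returns True if this call performed the start, False otherwise.
    """
    if api.started():
        return False
    api.start(**start_kwargs)
    return True
```

With PyGhidra installed, this would be called as `ensure_started(pyghidra, verbose=True)` after `import pyghidra`.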
|
||||||
|
|
||||||
|
### pyghidra.open_program()
|
||||||
### Analyze a File
|
To have PyGhidra set up a binary file for you, use the `open_program()` function. This will set up a
|
||||||
|
Ghidra project and import the given binary file as a program for you.
|
||||||
To have PyGhidra setup a binary file for you, use the `open_program()` function.
|
|
||||||
This will setup a Ghidra project and import the given binary file as a program for you.
|
|
||||||
|
|
||||||
Again, this will also allow you to import `ghidra` and `java` to perform more advanced processing.
|
Again, this will also allow you to import `ghidra` and `java` to perform more advanced processing.
|
||||||
|
|
||||||
|
```python
|
||||||
|
def open_program(
|
||||||
|
binary_path: Union[str, Path],
|
||||||
|
project_location: Union[str, Path] = None,
|
||||||
|
project_name: str = None,
|
||||||
|
analyze=True,
|
||||||
|
language: str = None,
|
||||||
|
compiler: str = None,
|
||||||
|
loader: Union[str, JClass] = None
|
||||||
|
) -> ContextManager["FlatProgramAPI"]: # type: ignore
|
||||||
|
"""
|
||||||
|
Opens given binary path in Ghidra and returns FlatProgramAPI object.
|
||||||
|
|
||||||
|
:param binary_path: Path to binary file, may be None.
|
||||||
|
:param project_location: Location of Ghidra project to open/create.
|
||||||
|
(Defaults to same directory as binary file)
|
||||||
|
:param project_name: Name of Ghidra project to open/create.
|
||||||
|
(Defaults to name of binary file suffixed with "_ghidra")
|
||||||
|
:param analyze: Whether to run analysis before returning.
|
||||||
|
:param language: The LanguageID to use for the program.
|
||||||
|
(Defaults to Ghidra's detected LanguageID)
|
||||||
|
:param compiler: The CompilerSpecID to use for the program. Requires a provided language.
|
||||||
|
(Defaults to the Language's default compiler)
|
||||||
|
:param loader: The `ghidra.app.util.opinion.Loader` class to use when importing the program.
|
||||||
|
This may be either a Java class or its path. (Defaults to None)
|
||||||
|
:return: A Ghidra FlatProgramAPI object.
|
||||||
|
:raises ValueError: If the provided language, compiler or loader is invalid.
|
||||||
|
:raises TypeError: If the provided loader does not implement `ghidra.app.util.opinion.Loader`.
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Example:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import pyghidra
|
import pyghidra
|
||||||
|
|
||||||
|
@ -113,11 +135,12 @@ with pyghidra.open_program("binary_file.exe") as flat_api:
|
||||||
# We are also free to import ghidra while in this context to do more advanced things.
|
# We are also free to import ghidra while in this context to do more advanced things.
|
||||||
from ghidra.app.decompiler.flatapi import FlatDecompilerAPI
|
from ghidra.app.decompiler.flatapi import FlatDecompilerAPI
|
||||||
decomp_api = FlatDecompilerAPI(flat_api)
|
decomp_api = FlatDecompilerAPI(flat_api)
|
||||||
# ...
|
...
|
||||||
decomp_api.dispose()
|
decomp_api.dispose()
|
||||||
```
|
```
|
||||||
|
|
||||||
By default, PyGhidra will run analysis for you. If you would like to do this yourself, set `analyze` to `False`.
|
By default, PyGhidra will run analysis for you. If you would like to do this yourself, set `analyze`
|
||||||
|
to `False`.
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import pyghidra
|
import pyghidra
|
||||||
|
@ -130,28 +153,65 @@ with pyghidra.open_program("binary_file.exe", analyze=False) as flat_api:
|
||||||
flat_api.analyzeAll(program)
|
flat_api.analyzeAll(program)
|
||||||
```
|
```
|
||||||
|
|
||||||
|
The `open_program()` function can also accept optional arguments to control the project name and
|
||||||
The `open_program()` function can also accept optional arguments to control the project name and location that gets created.
|
location that gets created (helpful for opening up a sample in an already existing project).
|
||||||
(Helpful for opening up a sample in an already existing project.)
|
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import pyghidra
|
import pyghidra
|
||||||
|
|
||||||
with pyghidra.open_program("binary_file.exe", project_name="EXAM_231", project_location=r"C:\exams\231") as flat_api:
|
with pyghidra.open_program("binary_file.exe", project_name="MyProject", project_location=r"C:\projects") as flat_api:
|
||||||
...
|
...
|
||||||
```
|
```
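
When `project_location` and `project_name` are omitted, the `open_program()` docstring says they default to the binary's directory and to the binary's name suffixed with "_ghidra". The sketch below illustrates that rule only. The helper `default_project` is hypothetical, and it assumes "name of binary file" means the file name including its extension, which the docstring does not spell out.

```python
from pathlib import Path


def default_project(binary_path):
    """Return the (location, name) defaults described in the docstring.

    Hypothetical illustration; assumes the file extension is kept in
    the project name.
    """
    binary_path = Path(binary_path)
    return binary_path.parent, binary_path.name + "_ghidra"
```

Passing explicit `project_name` and `project_location` values, as in the example above, avoids relying on these defaults.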
|
||||||
|
|
||||||
|
### pyghidra.run_script()
|
||||||
### Run a Script
|
PyGhidra can also be used to run an existing Ghidra Python script directly in your native CPython
|
||||||
|
interpreter using the `run_script()` function. However, while you can technically run an existing
|
||||||
PyGhidra can also be used to run an existing Ghidra Python script directly in your native python interpreter
|
Ghidra script unmodified, you may run into issues due to differences between Jython 2 and
|
||||||
using the `run_script()` command.
|
CPython 3/JPype. Therefore, some modification to the script may be needed.
|
||||||
However, while you can technically run an existing Ghidra script unmodified, you may
|
|
||||||
run into issues due to differences between Jython 2 and CPython 3.
|
|
||||||
Therefore, some modification to the script may be needed.
|
|
||||||
|
|
||||||
```python
|
```python
|
||||||
|
def run_script(
|
||||||
|
binary_path: Optional[Union[str, Path]],
|
||||||
|
script_path: Union[str, Path],
|
||||||
|
project_location: Union[str, Path] = None,
|
||||||
|
project_name: str = None,
|
||||||
|
script_args: List[str] = None,
|
||||||
|
verbose=False,
|
||||||
|
analyze=True,
|
||||||
|
lang: str = None,
|
||||||
|
compiler: str = None,
|
||||||
|
loader: Union[str, JClass] = None,
|
||||||
|
*,
|
||||||
|
install_dir: Path = None
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Runs a given script on a given binary path.
|
||||||
|
|
||||||
|
:param binary_path: Path to binary file, may be None.
|
||||||
|
:param script_path: Path to script to run.
|
||||||
|
:param project_location: Location of Ghidra project to open/create.
|
||||||
|
(Defaults to same directory as binary file if None)
|
||||||
|
:param project_name: Name of Ghidra project to open/create.
|
||||||
|
(Defaults to name of binary file suffixed with "_ghidra" if None)
|
||||||
|
:param script_args: Command line arguments to pass to script.
|
||||||
|
:param verbose: Enable verbose output during Ghidra initialization.
|
||||||
|
:param analyze: Whether to run analysis, if a binary_path is provided, before running the script.
|
||||||
|
:param lang: The LanguageID to use for the program.
|
||||||
|
(Defaults to Ghidra's detected LanguageID)
|
||||||
|
:param compiler: The CompilerSpecID to use for the program. Requires a provided language.
|
||||||
|
(Defaults to the Language's default compiler)
|
||||||
|
:param loader: The `ghidra.app.util.opinion.Loader` class to use when importing the program.
|
||||||
|
This may be either a Java class or its path. (Defaults to None)
|
||||||
|
:param install_dir: The path to the Ghidra installation directory. This parameter is only
|
||||||
|
used if Ghidra has not been started yet.
|
||||||
|
(Defaults to the GHIDRA_INSTALL_DIR environment variable)
|
||||||
|
:raises ValueError: If the provided language, compiler or loader is invalid.
|
||||||
|
:raises TypeError: If the provided loader does not implement `ghidra.app.util.opinion.Loader`.
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Example:
|
||||||
|
```python
|
||||||
import pyghidra
|
import pyghidra
|
||||||
|
|
||||||
pyghidra.run_script(r"C:\input.exe", r"C:\some_ghidra_script.py")
|
pyghidra.run_script(r"C:\input.exe", r"C:\some_ghidra_script.py")
|
||||||
|
@ -163,11 +223,77 @@ This can also be done on the command line using `pyghidra`.
|
||||||
> pyghidra C:\input.exe C:\some_ghidra_script.py <CLI ARGS PASSED TO SCRIPT>
|
> pyghidra C:\input.exe C:\some_ghidra_script.py <CLI ARGS PASSED TO SCRIPT>
|
||||||
```
|
```
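
The `run_script()` section notes that scripts written for Jython 2 may need changes under CPython 3. The sketch below shows a few generic Python 2 to Python 3 adjustments of that kind. These are general language differences rather than a list taken from the PyGhidra documentation, and the function `summarize` is a made-up example.

```python
# Python 2 (Jython) style, kept as comments because it no longer
# parses or behaves the same way under CPython 3:
#   print "count:", count
#   half = count / 2                      # integer division in Python 2
#   for key, value in table.iteritems(): ...


def summarize(table):
    """Python 3 version of the logic sketched in the comments above."""
    count = len(table)
    half = count // 2  # floor division must be explicit in Python 3
    lines = ["count: %d" % count, "half: %d" % half]
    for key, value in sorted(table.items()):  # items() replaces iteritems()
        lines.append("%s=%s" % (key, value))
    return lines


if __name__ == "__main__":
    print("\n".join(summarize({"b": 2, "a": 1})))  # print is a function
```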
|
||||||
|
|
||||||
### Handling Package Name Conflicts
|
### pyghidra.launcher.PyGhidraLauncher()
|
||||||
|
JVM configuration for the classpath and vmargs may be done through a `PyGhidraLauncher`.
|
||||||
|
|
||||||
There may be some Python modules and Java packages with the same import path. When this occurs the Python module takes precedence.
|
```python
|
||||||
While jpype has its own mechanism for handling this situation, PyGhidra automatically makes the Java package accessible by allowing
|
class PyGhidraLauncher:
|
||||||
it to be imported with an underscore appended to the package name.
|
"""
|
||||||
|
Base pyghidra launcher
|
||||||
|
"""
|
||||||
|
|
||||||
|
def add_classpaths(self, *args):
|
||||||
|
"""
|
||||||
|
Add additional entries to the classpath when starting the JVM
|
||||||
|
"""
|
||||||
|
self.class_path += args
|
||||||
|
|
||||||
|
def add_vmargs(self, *args):
|
||||||
|
"""
|
||||||
|
Add additional vmargs for launching the JVM
|
||||||
|
"""
|
||||||
|
self.vm_args += args
|
||||||
|
|
||||||
|
def add_class_files(self, *args):
|
||||||
|
"""
|
||||||
|
Add additional entries to be added to the classpath after Ghidra has been fully loaded.
|
||||||
|
This ensures that all of Ghidra is available so classes depending on it can be properly loaded.
|
||||||
|
"""
|
||||||
|
self.class_files += args
|
||||||
|
|
||||||
|
def start(self, **jpype_kwargs):
|
||||||
|
"""
|
||||||
|
Starts JPype connection to Ghidra (if not already started).
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
|
||||||
|
The following `PyGhidraLauncher`s are available:
|
||||||
|
|
||||||
|
```python
|
||||||
|
class HeadlessPyGhidraLauncher(PyGhidraLauncher):
|
||||||
|
"""
|
||||||
|
Headless pyghidra launcher
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
```python
|
||||||
|
class DeferredPyGhidraLauncher(PyGhidraLauncher):
|
||||||
|
"""
|
||||||
|
PyGhidraLauncher which allows full Ghidra initialization to be deferred.
|
||||||
|
initialize_ghidra must be called before all Ghidra classes are fully available.
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
```python
|
||||||
|
class GuiPyGhidraLauncher(PyGhidraLauncher):
|
||||||
|
"""
|
||||||
|
GUI pyghidra launcher
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Example:
|
||||||
|
```python
|
||||||
|
from pyghidra.launcher import HeadlessPyGhidraLauncher
|
||||||
|
|
||||||
|
launcher = HeadlessPyGhidraLauncher()
|
||||||
|
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
|
||||||
|
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
|
||||||
|
launcher.start()
|
||||||
|
```
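
Because `add_classpaths()` accepts any number of entries, a directory of jars can be registered in one call. The following is a minimal sketch under that assumption. The helper `add_jars_from` is hypothetical and relies only on the `add_classpaths(*args)` method shown above.

```python
from pathlib import Path


def add_jars_from(launcher, directory):
    """Add every .jar file in `directory` to the launcher's classpath.

    Hypothetical helper, not part of PyGhidra. Returns the list of
    paths that were added, in sorted order.
    """
    jars = sorted(str(path) for path in Path(directory).glob("*.jar"))
    if jars:
        launcher.add_classpaths(*jars)
    return jars
```

It would be called before `launcher.start()`, since the classpath is used when the JVM starts.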
|
||||||
|
|
||||||
|
## Handling Package Name Conflicts
|
||||||
|
There may be some Python modules and Java packages with the same import path. When this occurs the
|
||||||
|
Python module takes precedence. While JPype has its own mechanism for handling this situation,
|
||||||
|
PyGhidra automatically makes the Java package accessible by allowing it to be imported with an
|
||||||
|
underscore appended to the package name:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import pdb # imports Python's pdb
|
import pdb # imports Python's pdb
|
||||||
|
|