mirror of
https://github.com/NationalSecurityAgency/ghidra.git
synced 2025-10-03 17:59:46 +02:00
GP-5018: Some updated PyGhidra docs
parent
7fbf64ea70
commit
66a43cd6ed
2 changed files with 223 additions and 87 deletions
|
@ -1 +1,11 @@
|
|||
# PyGhidra
|
||||
|
||||
This module provides the following capabilities:
|
||||
* The [PyGhidra Python library](src/main/py/README.md) and its dependencies.
|
||||
* A [Plugin](src/main/java/ghidra/pyghidra/PyGhidraPlugin.java) that provides a CPython interpreter.
|
||||
* A [ScriptProvider](src/main/java/ghidra/pyghidra/PyGhidraScriptProvider.java) capable of running
|
||||
GhidraScripts written in native CPython 3.
|
||||
* An [interactive Python script](support/pyghidra_launcher.py) that Ghidra uses to install
|
||||
and launch PyGhidra. This script handles
|
||||
[virtual environments](https://docs.python.org/3/tutorial/venv.html) and
|
||||
[externally managed environments](https://packaging.python.org/en/latest/specifications/externally-managed-environments/).
|
|
@ -1,22 +1,59 @@
|
|||
# PyGhidra
|
||||
|
||||
PyGhidra is a Python library that provides direct access to the Ghidra API within a native CPython interpreter using [jpype](https://jpype.readthedocs.io/en/latest). As well, PyGhidra contains some conveniences for setting up analysis on a given sample and running a Ghidra script locally. It also contains a Ghidra plugin to allow the use of CPython from the Ghidra user interface.
|
||||
The PyGhidra Python library, originally developed by the
|
||||
[Department of Defense Cyber Crime Center (DC3)](https://www.dc3.mil) under the name "Pyhidra", is a
|
||||
Python library that provides direct access to the Ghidra API within a native CPython 3 interpreter
|
||||
using [JPype](https://jpype.readthedocs.io/en/latest). PyGhidra contains some conveniences for
|
||||
setting up analysis on a given sample and running a Ghidra script locally. It also contains a Ghidra
|
||||
plugin to allow the use of CPython 3 from the Ghidra GUI.
|
||||
|
||||
PyGhidra was initially developed for use with Dragodis and is designed to be installable without requiring Java or Ghidra. This allows other Python projects
|
||||
to have PyGhidra as a dependency and provide optional Ghidra functionality without requiring all users to install Java and Ghidra. It is recommended that users set the `GHIDRA_INSTALL_DIR` environment variable to simplify locating Ghidra.
|
||||
## Installation and Setup
|
||||
Ghidra provides an out-of-the-box integration with the PyGhidra Python library which makes
|
||||
installation and usage fairly straightforward. This enables the Ghidra GUI and headless Ghidra to run
|
||||
GhidraScripts written in native CPython 3, as well as interact with the Ghidra GUI through a
|
||||
built-in REPL. To launch Ghidra in PyGhidra-mode, see Ghidra's latest
|
||||
[Installation Guide](https://github.com/NationalSecurityAgency/ghidra/blob/master/GhidraDocs/InstallationGuide.md#pyghidra-mode).
|
||||
|
||||
It is also possible (and encouraged!) to use PyGhidra as a standalone Python library for usage
|
||||
in reverse engineering workflows where Ghidra may be one of many components involved. The following
|
||||
instructions in this document focus on this type of usage.
|
||||
|
||||
## Usage
|
||||
To install the PyGhidra Python library:
|
||||
1. Download and install
|
||||
[Ghidra 11.3 or later](https://github.com/NationalSecurityAgency/ghidra/releases) to a desired
|
||||
location.
|
||||
2. Set the `GHIDRA_INSTALL_DIR` environment variable to point to the directory where Ghidra is
|
||||
installed.
|
||||
3. Install PyGhidra:
|
||||
* Online: `pip install pyghidra`
|
||||
* Offline: `python3 -m pip install --no-index -f
|
||||
<GhidraInstallDir>/Ghidra/Features/PyGhidra/pypkg/dist pyghidra`
|
||||
|
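Step 2 above can be checked from Python before PyGhidra is started. The following is a minimal sketch and is not part of PyGhidra: the helper name `find_ghidra_install_dir` is hypothetical, and it only verifies that `GHIDRA_INSTALL_DIR` names an existing directory, not that the directory holds a valid Ghidra installation.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_ghidra_install_dir(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the directory named by GHIDRA_INSTALL_DIR, or None if unset or missing.

    Hypothetical helper for illustration only; PyGhidra does its own lookup.
    """
    # Allow a custom mapping so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    value = env.get("GHIDRA_INSTALL_DIR")
    if not value:
        return None
    path = Path(value)
    # Only accept the value if it points at an existing directory.
    return path if path.is_dir() else None
```

A caller could use the result to print a helpful message, or pass it as the `install_dir` argument described in the API section, when the environment variable is not set.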
||||
## API
|
||||
The current version of PyGhidra inherits an API from the original "Pyhidra" project that provides an
|
||||
excellent starting point for interacting with a Ghidra installation. __NOTE:__ These functions are
|
||||
subject to change in the future as more thought and feedback is collected on PyGhidra's role in the
|
||||
greater Ghidra ecosystem:
|
||||
|
||||
### Raw Connection
|
||||
### pyghidra.start()
|
||||
To get a raw connection to Ghidra, use the `start()` function. This will set up a JPype connection and
|
||||
initialize Ghidra in headless mode, which will allow you to directly import `ghidra` and `java`.
|
||||
|
||||
To get a raw connection to Ghidra use the `start()` function.
|
||||
This will set up a JPype connection and initialize Ghidra in headless mode,
|
||||
which will allow you to directly import `ghidra` and `java`.
|
||||
__NOTE:__ No projects or programs are set up in this mode.
|
||||
|
||||
*NOTE: No projects or programs are set up in this mode.*
|
||||
```python
|
||||
def start(verbose=False, *, install_dir: Path = None) -> "PyGhidraLauncher":
|
||||
"""
|
||||
Starts the JVM and fully initializes Ghidra in Headless mode.
|
||||
|
||||
:param verbose: Enable verbose output during JVM startup (Defaults to False)
|
||||
:param install_dir: The path to the Ghidra installation directory.
|
||||
(Defaults to the GHIDRA_INSTALL_DIR environment variable)
|
||||
:return: The PyGhidraLauncher used to start the JVM
|
||||
"""
|
||||
```
|
||||
|
||||
#### Example:
|
||||
```python
|
||||
import pyghidra
|
||||
pyghidra.start()
|
||||
|
@ -30,78 +67,63 @@ from java.lang import String
|
|||
# do things
|
||||
```
|
||||
|
||||
### Customizing Java and Ghidra initialization
|
||||
|
||||
JVM configuration for the classpath and vmargs may be done through a `PyGhidraLauncher`.
|
||||
### pyghidra.started()
|
||||
To check whether PyGhidra has been started, use the `started()` function.
|
||||
|
||||
```python
|
||||
from pyghidra.launcher import HeadlessPyGhidraLauncher
|
||||
|
||||
launcher = HeadlessPyGhidraLauncher()
|
||||
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
|
||||
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
|
||||
launcher.start()
|
||||
def started() -> bool:
|
||||
"""
|
||||
Whether the PyGhidraLauncher has already started.
|
||||
"""
|
||||
```
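One way `started()` might be used is to guard a call to `start()` in code that can be imported from several places. The sketch below is an assumption about usage, not an official pattern: the helper name `ensure_started` is hypothetical, and it takes the `pyghidra` module as an argument so that it relies only on the `started()` and `start()` functions documented above.

```python
def ensure_started(pyghidra_module, **start_kwargs) -> bool:
    """Start PyGhidra only if it is not already running.

    Hypothetical helper: `pyghidra_module` is expected to expose the
    documented `started()` and `start()` functions.
    Returns True if this call performed the start, False otherwise.
    """
    if pyghidra_module.started():
        # Ghidra has already been initialized; nothing to do.
        return False
    # Forward keyword arguments such as verbose=... or install_dir=...
    pyghidra_module.start(**start_kwargs)
    return True
```

With PyGhidra installed, this would be called as `import pyghidra` followed by `ensure_started(pyghidra)`.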
|
||||
|
||||
### Registering an Entry Point
|
||||
|
||||
The `PyGhidraLauncher` can also be configured through the use of a registered entry point on your own python project.
|
||||
This is useful for installing your own Ghidra plugin which uses PyGhidra and self-compiles.
|
||||
|
||||
First create an [entry_point](https://setuptools.pypa.io/en/latest/userguide/entry_point.html) for `pyghidra.setup`
|
||||
pointing to a single argument function which accepts the launcher instance.
|
||||
|
||||
#### Example:
|
||||
```python
|
||||
# setup.py
|
||||
from setuptools import setup
|
||||
|
||||
setup(
|
||||
# ...,
|
||||
entry_points={
|
||||
'pyghidra.setup': [
|
||||
'acme_plugin = acme.ghidra_plugin.install:setup',
|
||||
]
|
||||
}
|
||||
)
|
||||
```
|
||||
|
||||
|
||||
Then we create the target function.
|
||||
This function will be called every time a user starts a PyGhidra launcher.
|
||||
In the same fashion, another entry point `pyghidra.pre_launch` may be registered and will be called after Ghidra and all
|
||||
plugins have been loaded.
|
||||
|
||||
```python
|
||||
# acme/ghidra_plugin/install.py
|
||||
from pathlib import Path
|
||||
import pyghidra
|
||||
|
||||
def setup(launcher):
|
||||
"""
|
||||
Run by PyGhidra launcher to install our plugin.
|
||||
"""
|
||||
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
|
||||
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
|
||||
|
||||
# Install our plugin.
|
||||
source_path = Path(__file__).parent / "java" / "plugin" # path to uncompiled .java code
|
||||
details = pyghidra.ExtensionDetails(
|
||||
name="acme_plugin",
|
||||
description="My Cool Plugin",
|
||||
author="acme",
|
||||
plugin_version="1.2",
|
||||
)
|
||||
launcher.install_plugin(source_path, details) # install plugin (if not already)
|
||||
if pyghidra.started():
|
||||
...
|
||||
```
|
||||
|
||||
|
||||
### Analyze a File
|
||||
|
||||
To have PyGhidra set up a binary file for you, use the `open_program()` function.
|
||||
This will set up a Ghidra project and import the given binary file as a program for you.
|
||||
### pyghidra.open_program()
|
||||
To have PyGhidra set up a binary file for you, use the `open_program()` function. This will set up a
|
||||
Ghidra project and import the given binary file as a program for you.
|
||||
|
||||
Again, this will also allow you to import `ghidra` and `java` to perform more advanced processing.
|
||||
|
||||
```python
|
||||
def open_program(
|
||||
binary_path: Union[str, Path],
|
||||
project_location: Union[str, Path] = None,
|
||||
project_name: str = None,
|
||||
analyze=True,
|
||||
language: str = None,
|
||||
compiler: str = None,
|
||||
loader: Union[str, JClass] = None
|
||||
) -> ContextManager["FlatProgramAPI"]: # type: ignore
|
||||
"""
|
||||
Opens given binary path in Ghidra and returns FlatProgramAPI object.
|
||||
|
||||
:param binary_path: Path to binary file, may be None.
|
||||
:param project_location: Location of Ghidra project to open/create.
|
||||
(Defaults to same directory as binary file)
|
||||
:param project_name: Name of Ghidra project to open/create.
|
||||
(Defaults to name of binary file suffixed with "_ghidra")
|
||||
:param analyze: Whether to run analysis before returning.
|
||||
:param language: The LanguageID to use for the program.
|
||||
(Defaults to Ghidra's detected LanguageID)
|
||||
:param compiler: The CompilerSpecID to use for the program. Requires a provided language.
|
||||
(Defaults to the Language's default compiler)
|
||||
:param loader: The `ghidra.app.util.opinion.Loader` class to use when importing the program.
|
||||
This may be either a Java class or its path. (Defaults to None)
|
||||
:return: A Ghidra FlatProgramAPI object.
|
||||
:raises ValueError: If the provided language, compiler or loader is invalid.
|
||||
:raises TypeError: If the provided loader does not implement `ghidra.app.util.opinion.Loader`.
|
||||
"""
|
||||
```
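The docstring above describes the defaults used when `project_location` and `project_name` are omitted. As a rough illustration of those documented defaults, the sketch below computes them with `pathlib`. The helper `documented_project_defaults` is hypothetical and is not PyGhidra's implementation; it also assumes that "name of binary file" means the file name including its extension.

```python
from pathlib import Path
from typing import Tuple, Union


def documented_project_defaults(binary_path: Union[str, Path]) -> Tuple[Path, str]:
    """Return (project_location, project_name) as described in the docstring.

    Hypothetical helper for illustration: the location defaults to the
    directory containing the binary, and the name defaults to the binary's
    file name suffixed with "_ghidra" (extension assumed to be kept).
    """
    binary_path = Path(binary_path)
    return binary_path.parent, binary_path.name + "_ghidra"
```

Knowing where the project will be created can help when cleaning up after batch runs or when choosing explicit values for `project_location` and `project_name`.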
|
||||
|
||||
#### Example:
|
||||
|
||||
```python
|
||||
import pyghidra
|
||||
|
||||
|
@ -113,11 +135,12 @@ with pyghidra.open_program("binary_file.exe") as flat_api:
|
|||
# We are also free to import ghidra while in this context to do more advanced things.
|
||||
from ghidra.app.decompiler.flatapi import FlatDecompilerAPI
|
||||
decomp_api = FlatDecompilerAPI(flat_api)
|
||||
# ...
|
||||
...
|
||||
decomp_api.dispose()
|
||||
```
|
||||
|
||||
By default, PyGhidra will run analysis for you. If you would like to do this yourself, set `analyze` to `False`.
|
||||
By default, PyGhidra will run analysis for you. If you would like to do this yourself, set `analyze`
|
||||
to `False`.
|
||||
|
||||
```python
|
||||
import pyghidra
|
||||
|
@ -130,28 +153,65 @@ with pyghidra.open_program("binary_file.exe", analyze=False) as flat_api:
|
|||
flat_api.analyzeAll(program)
|
||||
```
|
||||
|
||||
|
||||
The `open_program()` function can also accept optional arguments to control the project name and location that gets created.
|
||||
(Helpful for opening up a sample in an already existing project.)
|
||||
The `open_program()` function can also accept optional arguments to control the project name and
|
||||
location that gets created (helpful for opening up a sample in an already existing project).
|
||||
|
||||
```python
|
||||
import pyghidra
|
||||
|
||||
with pyghidra.open_program("binary_file.exe", project_name="EXAM_231", project_location=r"C:\exams\231") as flat_api:
|
||||
with pyghidra.open_program("binary_file.exe", project_name="MyProject", project_location=r"C:\projects") as flat_api:
|
||||
...
|
||||
```
|
||||
|
||||
|
||||
### Run a Script
|
||||
|
||||
PyGhidra can also be used to run an existing Ghidra Python script directly in your native python interpreter
|
||||
using the `run_script()` command.
|
||||
However, while you can technically run an existing Ghidra script unmodified, you may
|
||||
run into issues due to differences between Jython 2 and CPython 3.
|
||||
Therefore, some modification to the script may be needed.
|
||||
### pyghidra.run_script()
|
||||
PyGhidra can also be used to run an existing Ghidra Python script directly in your native CPython
|
||||
interpreter using the `run_script()` function. However, while you can technically run an existing
|
||||
Ghidra script unmodified, you may run into issues due to differences between Jython 2 and
|
||||
CPython 3/JPype. Therefore, some modification to the script may be needed.
|
||||
|
||||
```python
|
||||
def run_script(
|
||||
binary_path: Optional[Union[str, Path]],
|
||||
script_path: Union[str, Path],
|
||||
project_location: Union[str, Path] = None,
|
||||
project_name: str = None,
|
||||
script_args: List[str] = None,
|
||||
verbose=False,
|
||||
analyze=True,
|
||||
lang: str = None,
|
||||
compiler: str = None,
|
||||
loader: Union[str, JClass] = None,
|
||||
*,
|
||||
install_dir: Path = None
|
||||
):
|
||||
"""
|
||||
Runs a given script on a given binary path.
|
||||
|
||||
:param binary_path: Path to binary file, may be None.
|
||||
:param script_path: Path to script to run.
|
||||
:param project_location: Location of Ghidra project to open/create.
|
||||
(Defaults to same directory as binary file if None)
|
||||
:param project_name: Name of Ghidra project to open/create.
|
||||
(Defaults to name of binary file suffixed with "_ghidra" if None)
|
||||
:param script_args: Command line arguments to pass to script.
|
||||
:param verbose: Enable verbose output during Ghidra initialization.
|
||||
:param analyze: Whether to run analysis, if a binary_path is provided, before running the script.
|
||||
:param lang: The LanguageID to use for the program.
|
||||
(Defaults to Ghidra's detected LanguageID)
|
||||
:param compiler: The CompilerSpecID to use for the program. Requires a provided language.
|
||||
(Defaults to the Language's default compiler)
|
||||
:param loader: The `ghidra.app.util.opinion.Loader` class to use when importing the program.
|
||||
This may be either a Java class or its path. (Defaults to None)
|
||||
:param install_dir: The path to the Ghidra installation directory. This parameter is only
|
||||
used if Ghidra has not been started yet.
|
||||
(Defaults to the GHIDRA_INSTALL_DIR environment variable)
|
||||
:raises ValueError: If the provided language, compiler or loader is invalid.
|
||||
:raises TypeError: If the provided loader does not implement `ghidra.app.util.opinion.Loader`.
|
||||
"""
|
||||
```
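The Jython 2 versus CPython 3 differences mentioned above are mostly ordinary Python 2 to Python 3 porting issues. The sketch below shows a few of the common ones in plain Python 3, with no Ghidra dependency. The function name and the sample values are made up for illustration, and the list is not exhaustive (JPype-specific type conversion differences are not covered).

```python
def py3_porting_examples() -> dict:
    """Collect a few Python 3 behaviors that often break scripts written for Jython 2."""
    results = {}

    # Python 2: 7 / 2 == 3. Python 3: "/" is true division; use "//" for floor division.
    results["true_div"] = 7 / 2
    results["floor_div"] = 7 // 2

    # Python 2: str is a byte string. Python 3: bytes must be decoded to get text.
    results["decoded"] = b"main".decode("ascii")

    # Python 2: dict.keys() returns a list. Python 3: it returns a view,
    # so wrap it in list() before indexing.
    results["first_key"] = list({"entry": 0x401000}.keys())[0]

    # Python 2: `print "text"` is a statement. Python 3: print() is a function.
    print("porting examples computed")

    return results
```

Scripts that use integer division for address arithmetic or index into `dict.keys()` are the kind that may need the small changes shown here.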
|
||||
|
||||
#### Example:
|
||||
```python
|
||||
import pyghidra
|
||||
|
||||
pyghidra.run_script(r"C:\input.exe", r"C:\some_ghidra_script.py")
|
||||
|
@ -163,11 +223,77 @@ This can also be done on the command line using `pyghidra`.
|
|||
> pyghidra C:\input.exe C:\some_ghidra_script.py <CLI ARGS PASSED TO SCRIPT>
|
||||
```
|
||||
|
||||
### Handling Package Name Conflicts
|
||||
### pyghidra.launcher.PyGhidraLauncher()
|
||||
JVM configuration for the classpath and vmargs may be done through a `PyGhidraLauncher`.
|
||||
|
||||
There may be some Python modules and Java packages with the same import path. When this occurs the Python module takes precedence.
|
||||
While jpype has its own mechanism for handling this situation, PyGhidra automatically makes the Java package accessible by allowing
|
||||
it to be imported with an underscore appended to the package name.
|
||||
```python
|
||||
class PyGhidraLauncher:
|
||||
"""
|
||||
Base pyghidra launcher
|
||||
"""
|
||||
|
||||
def add_classpaths(self, *args):
|
||||
"""
|
||||
Add additional entries to the classpath when starting the JVM
|
||||
"""
|
||||
self.class_path += args
|
||||
|
||||
def add_vmargs(self, *args):
|
||||
"""
|
||||
Add additional vmargs for launching the JVM
|
||||
"""
|
||||
self.vm_args += args
|
||||
|
||||
def add_class_files(self, *args):
|
||||
"""
|
||||
Add additional entries to be added to the classpath after Ghidra has been fully loaded.
|
||||
This ensures that all of Ghidra is available so classes depending on it can be properly loaded.
|
||||
"""
|
||||
self.class_files += args
|
||||
|
||||
def start(self, **jpype_kwargs):
|
||||
"""
|
||||
Starts JPype connection to Ghidra (if not already started).
|
||||
"""
|
||||
```
|
||||
|
||||
The following `PyGhidraLauncher`s are available:
|
||||
|
||||
```python
|
||||
class HeadlessPyGhidraLauncher(PyGhidraLauncher):
|
||||
"""
|
||||
Headless pyghidra launcher
|
||||
"""
|
||||
```
|
||||
```python
|
||||
class DeferredPyGhidraLauncher(PyGhidraLauncher):
|
||||
"""
|
||||
PyGhidraLauncher which allows full Ghidra initialization to be deferred.
|
||||
initialize_ghidra must be called before all Ghidra classes are fully available.
|
||||
"""
|
||||
```
|
||||
```python
|
||||
class GuiPyGhidraLauncher(PyGhidraLauncher):
|
||||
"""
|
||||
GUI pyghidra launcher
|
||||
"""
|
||||
```
|
||||
|
||||
#### Example:
|
||||
```python
|
||||
from pyghidra.launcher import HeadlessPyGhidraLauncher
|
||||
|
||||
launcher = HeadlessPyGhidraLauncher()
|
||||
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
|
||||
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
|
||||
launcher.start()
|
||||
```
|
||||
|
||||
## Handling Package Name Conflicts
|
||||
There may be some Python modules and Java packages with the same import path. When this occurs the
|
||||
Python module takes precedence. While JPype has its own mechanism for handling this situation,
|
||||
PyGhidra automatically makes the Java package accessible by allowing it to be imported with an
|
||||
underscore appended to the package name:
|
||||
|
||||
```python
|
||||
import pdb # imports Python's pdb
|
||||
|
|