GP-326: never say die

GP-326: recompiling to html
GP-326: recompiling to html
GP-326: last?
GP-326: getting there
GP-326: roll along
GP-326: rolling along
GP-326: test fix
GP-326: miscellaneous post-review fixes
GP-326: complicated stuff
GP-326: more simple stuff
GP-326: navhead fix
GP-326: better docs
GP-326: html for md
GP-326: html for md
GP-326: tutorial edits
GP-326: tutorial edits
GP-326: re-arranging docs
GP-326: from review
GP-326: adding a debugger
GP-326: docs
GP-326: using TestResources - tests pass
GP-326: working tests
GP-326: most cmd/meth tests working
GP-326: cmd tests pass
GP-326: passes thru putmem
GP-326: one test running
GP-326: better startup logic
GP-326: first pass tests
GP-326: misc cleanup
GP-326: cleaner startup
GP-326: cleanup
GP-326: fixes for crash dump
GP-326: util cleanup
GP-326: objects cont.
GP-326: first pass at objects
GP-326: some cleanup
GP-326: regions
GP-326: sections
GP-326: modules
GP-326: alt launchers
GP-326: symbols
GP-326: memory
GP-326: stack frame - regs + locals
GP-326: frames
GP-326: threads
GP-326: better start sequence
GP-326: working launcher
GP-326: util.version
GP-326: arch
d-millar 2025-01-08 13:16:34 -05:00
parent 860c754484
commit d5df1c16bb
28 changed files with 4816 additions and 1 deletions

@ -0,0 +1 @@
# Debugger-agent-drgn

@ -0,0 +1,20 @@
/* ###
* IP: GHIDRA
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
apply from: "$rootProject.projectDir/gradle/distributableGhidraModule.gradle"
apply from: "$rootProject.projectDir/gradle/hasPythonPackage.gradle"
apply plugin: 'eclipse'
eclipse.project.name = 'Debug Debugger-agent-drgn'

@ -0,0 +1,11 @@
##VERSION: 2.0
##MODULE IP: Apache License 2.0
##MODULE IP: Apache License 2.0 with LLVM Exceptions
Module.manifest||GHIDRA||||END|
README.md||GHIDRA||||END|
build.gradle||GHIDRA||||END|
src/main/py/LICENSE||GHIDRA||||END|
src/main/py/MANIFEST.in||GHIDRA||||END|
src/main/py/README.md||GHIDRA||||END|
src/main/py/pyproject.toml||GHIDRA||||END|
src/main/py/src/ghidradrgn/schema.xml||GHIDRA||||END|

@ -0,0 +1,32 @@
#!/usr/bin/env bash
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
#@title drgn-core
#@desc <html><body width="300px">
#@desc <h3>Launch with <tt>drgn-core</tt></h3>
#@desc <p>
#@desc This will attach to an existing core dump using <tt>drgn</tt>.
#@desc For setup instructions, press <b>F1</b>.
#@desc </p>
#@desc </body></html>
#@menu-group drgn
#@icon icon.debugger
#@help TraceRmiLauncherServicePlugin#drgn-core
#@env OPT_TARGET_IMG:file!="" "Core dump" "The target core dump"
export OPT_TARGET_KIND="coredump"
drgn -c "$OPT_TARGET_IMG" ../support/local-drgn.py

@ -0,0 +1,31 @@
#!/usr/bin/env bash
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
#@title drgn-kernel
#@desc <html><body width="300px">
#@desc <h3>Launch with <tt>drgn-kernel</tt></h3>
#@desc <p>
#@desc This will attach to the local machine's kernel using <tt>drgn</tt>.
#@desc For setup instructions, press <b>F1</b>.
#@desc </p>
#@desc </body></html>
#@menu-group drgn
#@icon icon.debugger
#@help TraceRmiLauncherServicePlugin#drgn-kernel
export OPT_TARGET_KIND="kernel"
sudo -E drgn ../support/local-drgn.py

@ -0,0 +1,34 @@
#!/usr/bin/env bash
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
#@title drgn
#@desc <html><body width="300px">
#@desc <h3>Launch with <tt>drgn</tt></h3>
#@desc <p>
#@desc This will attach to a target running on the local machine using <tt>drgn</tt>.
#@desc For setup instructions, press <b>F1</b>.
#@desc </p>
#@desc </body></html>
#@menu-group drgn
#@icon icon.debugger
#@help TraceRmiLauncherServicePlugin#drgn
#@env OPT_TARGET_PID:int=44068 "PID" "The target's process id"
export OPT_TARGET_KIND="user"
# sudo -E drgn -p "$OPT_TARGET_PID" ../support/local-drgn.py
# or 'echo 0 > /proc/sys/kernel/yama/ptrace_scope'
drgn -p "$OPT_TARGET_PID" ../support/local-drgn.py

@ -0,0 +1,57 @@
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
# From drgn:
# EASY-INSTALL-ENTRY-SCRIPT: 'drgn==0.0.24','console_scripts','drgn'
import os
import re
import sys
import drgn.cli
home = os.getenv('GHIDRA_HOME')

if os.path.isdir(f'{home}/ghidra/.git'):
    sys.path.append(
        f'{home}/ghidra/Ghidra/Debug/Debugger-agent-drgn/build/pypkg/src')
    sys.path.append(
        f'{home}/ghidra/Ghidra/Debug/Debugger-rmi-trace/build/pypkg/src')
elif os.path.isdir(f'{home}/.git'):
    sys.path.append(
        f'{home}/Ghidra/Debug/Debugger-agent-drgn/build/pypkg/src')
    sys.path.append(
        f'{home}/Ghidra/Debug/Debugger-rmi-trace/build/pypkg/src')
else:
    sys.path.append(
        f'{home}/Ghidra/Debug/Debugger-agent-drgn/pypkg/src')
    sys.path.append(f'{home}/Ghidra/Debug/Debugger-rmi-trace/pypkg/src')


def main():
    from ghidradrgn import commands as cmd
    cmd.ghidra_trace_connect(address=os.getenv('GHIDRA_TRACE_RMI_ADDR'))
    cmd.ghidra_trace_create(start_trace=True)
    cmd.ghidra_trace_txstart()
    cmd.ghidra_trace_put_all()
    cmd.ghidra_trace_txcommit()
    cmd.ghidra_trace_activate()
    drgn.cli.run_interactive(cmd.prog)


if __name__ == '__main__':
    main()
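
The search-path selection at the top of this launcher script can be read as a small pure function. The sketch below is an illustration only: `pypkg_paths` and its `isdir` parameter are hypothetical names that do not exist in the commit, introduced so the three layouts (nested `ghidra` checkout, top-level checkout, installed distribution) can be exercised without touching the file system.

```python
import os


def pypkg_paths(home, isdir=os.path.isdir):
    """Return the two package source directories the launcher would append.

    Hypothetical helper: mirrors the if/elif/else chain in the script,
    but returns the paths instead of mutating sys.path.
    """
    modules = ('Debugger-agent-drgn', 'Debugger-rmi-trace')
    if isdir(f'{home}/ghidra/.git'):
        # GHIDRA_HOME is a parent directory holding a 'ghidra' git checkout
        base, sub = f'{home}/ghidra/Ghidra/Debug', 'build/pypkg/src'
    elif isdir(f'{home}/.git'):
        # GHIDRA_HOME is itself the git checkout
        base, sub = f'{home}/Ghidra/Debug', 'build/pypkg/src'
    else:
        # Installed distribution: no 'build' directory in the path
        base, sub = f'{home}/Ghidra/Debug', 'pypkg/src'
    return [f'{base}/{module}/{sub}' for module in modules]


if __name__ == '__main__':
    # Pretend no .git directory exists, as in an installed distribution.
    for path in pypkg_paths('/opt/ghidra', isdir=lambda p: False):
        print(path)
```

Passing a fake `isdir` keeps the example deterministic; the script itself calls `os.path.isdir` directly.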

@ -0,0 +1,11 @@
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

@ -0,0 +1 @@
include src/ghidradrgn/schema.xml

@ -0,0 +1,3 @@
# Ghidra Trace RMI for drgn
Package for connecting drgn to Ghidra via Trace RMI.

@ -0,0 +1,25 @@
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "ghidradrgn"
version = "11.3"
authors = [
{ name="Ghidra Development Team" },
]
description = "Ghidra's Plugin for drgn"
readme = "README.md"
requires-python = ">=3.7"
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: Apache Software License",
"Operating System :: OS Independent",
]
dependencies = [
"ghidratrace==11.3",
]
[project.urls]
"Homepage" = "https://github.com/NationalSecurityAgency/ghidra"
"Bug Tracker" = "https://github.com/NationalSecurityAgency/ghidra/issues"

@ -0,0 +1,16 @@
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
from . import util, commands

@ -0,0 +1,209 @@
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
from ghidratrace.client import Address, RegVal
import drgn
from . import util
# NOTE: This map is derived from the ldefs using a script
language_map = {
'AARCH64': ['AARCH64:BE:64:v8A', 'AARCH64:LE:64:AppleSilicon', 'AARCH64:LE:64:v8A'],
'ARM': ['ARM:BE:32:v8', 'ARM:BE:32:v8T', 'ARM:LE:32:v8', 'ARM:LE:32:v8T'],
'PPC64': ['PowerPC:BE:64:4xx', 'PowerPC:LE:64:4xx'],
'S390': [],
'S390X': [],
'I386': ['x86:LE:32:default'],
'X86_64': ['x86:LE:64:default'],
'UNKNOWN': ['DATA:LE:64:default', 'DATA:LE:64:default'],
}
data64_compiler_map = {
None: 'pointer64',
}
default_compiler_map = {
'Language.C': 'default',
}
x86_compiler_map = {
'Language.C': 'gcc',
}
compiler_map = {
'DATA:BE:64:': data64_compiler_map,
'DATA:LE:64:': data64_compiler_map,
'x86:LE:32:': x86_compiler_map,
'x86:LE:64:': x86_compiler_map,
'AARCH64:LE:64:': default_compiler_map,
'ARM:BE:32:': default_compiler_map,
'ARM:LE:32:': default_compiler_map,
'PowerPC:BE:64:': default_compiler_map,
'PowerPC:LE:64:': default_compiler_map,
}
def get_arch():
    platform = drgn.host_platform
    return platform.arch.name


def get_endian():
    parm = util.get_convenience_variable('endian')
    if parm != 'auto':
        return parm
    platform = drgn.host_platform
    # Test the flag against the platform's flags; the enum member by itself
    # is always non-zero.
    if platform.flags & drgn.PlatformFlags.IS_LITTLE_ENDIAN:
        return 'little'
    else:
        return 'big'


def get_size():
    parm = util.get_convenience_variable('size')
    if parm != 'auto':
        return parm
    platform = drgn.host_platform
    # Same as above: mask the platform's flags instead of reading the member.
    if platform.flags & drgn.PlatformFlags.IS_64_BIT:
        return '64'
    else:
        return '32'


def get_osabi():
    return "Language.C"


def compute_ghidra_language():
    # First, check if the parameter is set
    lang = util.get_convenience_variable('ghidra-language')
    if lang != 'auto':
        return lang

    # Get the list of possible languages for the arch. We'll need to sift
    # through them by endian and probably prefer default/simpler variants. The
    # heuristic for "simpler" will be 'default' then shortest variant id.
    arch = get_arch()
    endian = get_endian()
    sz = get_size()
    lebe = ':BE:' if endian == 'big' else ':LE:'
    if arch not in language_map:
        return 'DATA' + lebe + sz + ':default'
    langs = language_map[arch]
    matched_endian = sorted(
        (l for l in langs if lebe in l),
        key=lambda l: 0 if l.endswith(':default') else len(l)
    )
    if len(matched_endian) > 0:
        return matched_endian[0]
    # NOTE: I'm disinclined to fall back to a language match with wrong endian.
    return 'DATA' + lebe + sz + ':default'


def compute_ghidra_compiler(lang):
    # First, check if the parameter is set
    comp = util.get_convenience_variable('ghidra-compiler')
    if comp != 'auto':
        return comp

    # Check if the selected lang has specific compiler recommendations
    matched_lang = sorted(
        (l for l in compiler_map if l in lang),
        # key=lambda l: compiler_map[l]
    )
    if len(matched_lang) == 0:
        print(f"{lang} not found in compiler map - using default compiler")
        return 'default'

    comp_map = compiler_map[matched_lang[0]]
    if comp_map == data64_compiler_map:
        print("Using the DATA64 compiler map")
    osabi = get_osabi()
    if osabi in comp_map:
        return comp_map[osabi]
    # Language ids in language_map spell the processor in lowercase ('x86:').
    if lang.startswith("x86:"):
        print(f"{osabi} not found in compiler map - using gcc")
        return 'gcc'
    if None in comp_map:
        return comp_map[None]
    print(f"{osabi} not found in compiler map - using default compiler")
    return 'default'


def compute_ghidra_lcsp():
    lang = compute_ghidra_language()
    comp = compute_ghidra_compiler(lang)
    return lang, comp


class DefaultMemoryMapper(object):

    def __init__(self, defaultSpace):
        self.defaultSpace = defaultSpace

    def map(self, proc: drgn.Program, offset: int):
        space = self.defaultSpace
        return self.defaultSpace, Address(space, offset)

    def map_back(self, proc: drgn.Program, address: Address) -> int:
        if address.space == self.defaultSpace:
            return address.offset
        raise ValueError(
            f"Address {address} is not in process {proc}")


DEFAULT_MEMORY_MAPPER = DefaultMemoryMapper('ram')

memory_mappers = {}


def compute_memory_mapper(lang):
    if lang not in memory_mappers:
        return DEFAULT_MEMORY_MAPPER
    return memory_mappers[lang]


class DefaultRegisterMapper(object):

    def __init__(self, byte_order):
        if byte_order not in ['big', 'little']:
            raise ValueError("Invalid byte_order: {}".format(byte_order))
        self.byte_order = byte_order
        self.union_winners = {}

    def map_name(self, proc, name):
        return name

    def map_value(self, proc, name, value):
        return RegVal(self.map_name(proc, name), value)

    def map_name_back(self, proc, name):
        return name

    def map_value_back(self, proc, name, value):
        return RegVal(self.map_name_back(proc, name), value)


DEFAULT_BE_REGISTER_MAPPER = DefaultRegisterMapper('big')
DEFAULT_LE_REGISTER_MAPPER = DefaultRegisterMapper('little')


def compute_register_mapper(lang):
    if ':BE:' in lang:
        return DEFAULT_BE_REGISTER_MAPPER
    else:
        return DEFAULT_LE_REGISTER_MAPPER
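
The language-selection heuristic in `compute_ghidra_language` (filter by endianness, prefer a `:default` variant, then the shortest id, else fall back to a `DATA` language) can be tried out in isolation. This is a minimal sketch, not the module's code: `pick_language` and the trimmed `LANGUAGE_MAP` are hypothetical names, and the arch, endian and size values are passed in rather than read from drgn or from convenience variables.

```python
# Hypothetical, trimmed copy of the table above; only used for illustration.
LANGUAGE_MAP = {
    'AARCH64': ['AARCH64:BE:64:v8A', 'AARCH64:LE:64:AppleSilicon',
                'AARCH64:LE:64:v8A'],
    'X86_64': ['x86:LE:64:default'],
    'S390X': [],
}


def pick_language(arch: str, endian: str, size: str) -> str:
    """Pick a Ghidra language id using the same ordering rule as above."""
    lebe = ':BE:' if endian == 'big' else ':LE:'
    fallback = 'DATA' + lebe + size + ':default'
    candidates = sorted(
        (lang for lang in LANGUAGE_MAP.get(arch, []) if lebe in lang),
        # ':default' variants sort first, then shorter ids before longer ones
        key=lambda lang: 0 if lang.endswith(':default') else len(lang),
    )
    return candidates[0] if candidates else fallback


if __name__ == '__main__':
    for args in [('AARCH64', 'little', '64'),
                 ('X86_64', 'little', '64'),
                 ('S390X', 'big', '64')]:
        print(args, '->', pick_language(*args))
```

An architecture with no entry of the requested endianness never borrows a wrong-endian language; it drops to the generic `DATA` language instead, which matches the NOTE in the original function.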

File diff suppressed because it is too large

@ -0,0 +1,249 @@
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
import threading
import time
import drgn
from . import commands, util
ALL_EVENTS = 0xFFFF


class HookState(object):
    __slots__ = ('installed', 'mem_catchpoint')

    def __init__(self):
        self.installed = False
        self.mem_catchpoint = None


class ProcessState(object):
    __slots__ = ('first', 'regions', 'modules', 'threads',
                 'breaks', 'watches', 'visited')

    def __init__(self):
        self.first = True
        # For things we can detect changes to between stops
        self.regions = False
        self.modules = False
        self.threads = False
        self.breaks = False
        self.watches = False
        # For frames and threads that have already been synced since last stop
        self.visited = set()

    def record(self, description=None):
        first = self.first
        self.first = False
        if description is not None:
            commands.STATE.trace.snapshot(description)
        if first:
            commands.put_processes()
            commands.put_environment()
        if self.threads:
            commands.put_threads()
            self.threads = False
        nthrd = util.selected_thread()
        if nthrd is not None:
            if first or nthrd not in self.visited:
                commands.put_frames()
                self.visited.add(nthrd)
            level = util.selected_frame()
            hashable_frame = (nthrd, level)
            if first or hashable_frame not in self.visited:
                commands.putreg()
                try:
                    commands.putmem(commands.get_pc(), 1, True, True)
                except BaseException as e:
                    print(f"Couldn't record page with PC: {e}")
                try:
                    commands.putmem(commands.get_sp(), 1, True, True)
                except BaseException as e:
                    print(f"Couldn't record page with SP: {e}")
                self.visited.add(hashable_frame)
        if first or self.regions or self.modules:
            # Sections, memory syscalls, or stack allocations
            commands.put_regions()
            self.regions = False
        if first or self.modules:
            commands.put_modules()
            self.modules = False

    def record_continued(self):
        commands.put_processes()
        commands.put_threads()

    def record_exited(self, exit_code):
        nproc = util.selected_process()
        ipath = commands.PROCESS_PATTERN.format(procnum=nproc)
        procobj = commands.STATE.trace.proxy_object_path(ipath)
        procobj.set_value('Exit Code', exit_code)
        procobj.set_value('State', 'TERMINATED')


HOOK_STATE = HookState()
PROC_STATE = {}


def on_new_process(event):
    trace = commands.STATE.trace
    if trace is None:
        return
    with commands.STATE.client.batch():
        with trace.open_tx("New Process {}".format(event.process.num)):
            commands.put_processes()  # TODO: Could put just the one....


def on_process_selected():
    nproc = util.selected_process()
    if nproc not in PROC_STATE:
        return
    trace = commands.STATE.trace
    if trace is None:
        return
    with commands.STATE.client.batch():
        with trace.open_tx("Process {} selected".format(nproc)):
            PROC_STATE[nproc].record()
            commands.activate()


def on_new_thread(event):
    nproc = util.selected_process()
    if nproc not in PROC_STATE:
        return
    PROC_STATE[nproc].threads = True


def on_thread_selected():
    nproc = util.selected_process()
    if nproc not in PROC_STATE:
        return
    trace = commands.STATE.trace
    if trace is None:
        return
    nthrd = util.selected_thread()
    with commands.STATE.client.batch():
        with trace.open_tx("Thread {}.{} selected".format(nproc, nthrd)):
            PROC_STATE[nproc].record()
            commands.put_threads()
            commands.activate()


def on_frame_selected():
    nproc = util.selected_process()
    if nproc not in PROC_STATE:
        return
    trace = commands.STATE.trace
    if trace is None:
        return
    nthrd = util.selected_thread()
    level = util.selected_frame()
    with commands.STATE.client.batch():
        with trace.open_tx("Frame {}.{}.{} selected".format(nproc, nthrd, level)):
            PROC_STATE[nproc].record()
            commands.put_threads()
            commands.put_frames()
            commands.activate()


def on_memory_changed(event):
    nproc = util.get_process()
    if nproc not in PROC_STATE:
        return
    trace = commands.STATE.trace
    if trace is None:
        return
    with commands.STATE.client.batch():
        with trace.open_tx("Memory *0x{:08x} changed".format(event.address)):
            commands.put_bytes(event.address, event.address + event.length,
                               pages=False, is_mi=False, result=None)


def on_register_changed(event):
    nproc = util.get_process()
    if nproc not in PROC_STATE:
        return
    trace = commands.STATE.trace
    if trace is None:
        return
    with commands.STATE.client.batch():
        with trace.open_tx("Register {} changed".format(event.regnum)):
            commands.putreg()


def on_cont(event):
    nproc = util.selected_process()
    if nproc not in PROC_STATE:
        return
    trace = commands.STATE.trace
    if trace is None:
        return
    state = PROC_STATE[nproc]
    with commands.STATE.client.batch():
        with trace.open_tx("Continued"):
            state.record_continued()


def on_stop(event):
    nproc = util.selected_process()
    if nproc not in PROC_STATE:
        PROC_STATE[nproc] = ProcessState()
    trace = commands.STATE.trace
    if trace is None:
        print("no trace")
        return
    state = PROC_STATE[nproc]
    state.visited.clear()
    with commands.STATE.client.batch():
        with trace.open_tx("Stopped"):
            state.record("Stopped")
            commands.put_threads()
            commands.put_frames()
            commands.activate()


def modules_changed():
    nproc = util.selected_process()
    if nproc not in PROC_STATE:
        return
    PROC_STATE[nproc].modules = True


def install_hooks():
    if HOOK_STATE.installed:
        return
    HOOK_STATE.installed = True

    # NOTE: EventThread is neither defined nor imported in this module as
    # shown, so this call raises NameError if install_hooks() is reached.
    event_thread = EventThread()
    event_thread.start()


def remove_hooks():
    if not HOOK_STATE.installed:
        return
    HOOK_STATE.installed = False


def enable_current_process():
    nproc = util.selected_process()
    PROC_STATE[nproc] = ProcessState()


def disable_current_process():
    nproc = util.selected_process()
    if nproc in PROC_STATE:
        del PROC_STATE[nproc]
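
The bookkeeping in `ProcessState.record` (sync everything on the first call, then sync a thread or frame at most once until the next stop clears `visited`) is easier to follow without the trace calls. The sketch below is a simplified model under that reading: `StopCache` is a hypothetical class, and it returns the names of the steps that would run instead of calling into `commands`.

```python
class StopCache:
    """Hypothetical stand-in for the 'first' / 'visited' logic above."""

    def __init__(self):
        self.first = True
        # Holds thread ids and (thread, frame) tuples already synced
        self.visited = set()

    def record(self, thread, frame):
        """Return the sync steps that would run for this selection."""
        first, self.first = self.first, False
        steps = []
        if first:
            steps.append('processes')
        if first or thread not in self.visited:
            steps.append('frames')
            self.visited.add(thread)
        key = (thread, frame)
        if first or key not in self.visited:
            steps.append('registers')
            self.visited.add(key)
        return steps

    def on_stop(self):
        # A new stop invalidates everything synced since the previous one
        self.visited.clear()


if __name__ == '__main__':
    cache = StopCache()
    print(cache.record(thread=1, frame=0))
    print(cache.record(thread=1, frame=0))
    print(cache.record(thread=1, frame=1))
```

Keeping thread ids and `(thread, frame)` tuples in the same set works because the two kinds of keys can never compare equal.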

@ -0,0 +1,388 @@
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
from concurrent.futures import Future, ThreadPoolExecutor
from contextlib import redirect_stdout
from io import StringIO
import re
import sys
import time
import drgn
import drgn.cli
from ghidratrace import sch
from ghidratrace.client import MethodRegistry, ParamDesc, Address, AddressRange
from . import util, commands, hooks
REGISTRY = MethodRegistry(ThreadPoolExecutor(
max_workers=1, thread_name_prefix='MethodRegistry'))


def extre(base, ext):
    return re.compile(base.pattern + ext)


# Raw strings avoid invalid-escape warnings; literal dots are escaped.
PROCESSES_PATTERN = re.compile('Processes')
PROCESS_PATTERN = extre(PROCESSES_PATTERN, r'\[(?P<procnum>\d*)\]')
ENV_PATTERN = extre(PROCESS_PATTERN, r'\.Environment')
THREADS_PATTERN = extre(PROCESS_PATTERN, r'\.Threads')
THREAD_PATTERN = extre(THREADS_PATTERN, r'\[(?P<tnum>\d*)\]')
STACK_PATTERN = extre(THREAD_PATTERN, r'\.Stack')
FRAME_PATTERN = extre(STACK_PATTERN, r'\[(?P<level>\d*)\]')
REGS_PATTERN = extre(FRAME_PATTERN, r'\.Registers')
LOCALS_PATTERN = extre(FRAME_PATTERN, r'\.Locals')
MEMORY_PATTERN = extre(PROCESS_PATTERN, r'\.Memory')
MODULES_PATTERN = extre(PROCESS_PATTERN, r'\.Modules')
MODULE_PATTERN = extre(MODULES_PATTERN, r'\[(?P<modbase>.*)\]')


def find_availpid_by_pattern(pattern, object, err_msg):
    mat = pattern.fullmatch(object.path)
    if mat is None:
        raise TypeError(f"{object} is not {err_msg}")
    pid = int(mat['pid'])
    return pid


def find_availpid_by_obj(object):
    # NOTE: AVAILABLE_PATTERN is not defined in this module as shown.
    return find_availpid_by_pattern(AVAILABLE_PATTERN, object, "an Available")


def find_proc_by_num(id):
    if id != util.selected_process():
        util.select_process(id)
    return util.selected_process()


def find_proc_by_pattern(object, pattern, err_msg):
    mat = pattern.fullmatch(object.path)
    if mat is None:
        raise TypeError(f"{object} is not {err_msg}")
    procnum = int(mat['procnum'])
    return find_proc_by_num(procnum)


def find_proc_by_obj(object):
    return find_proc_by_pattern(object, PROCESS_PATTERN, "a Process")


def find_proc_by_env_obj(object):
    return find_proc_by_pattern(object, ENV_PATTERN, "an Environment")


def find_proc_by_threads_obj(object):
    return find_proc_by_pattern(object, THREADS_PATTERN, "a ThreadContainer")


def find_proc_by_mem_obj(object):
    return find_proc_by_pattern(object, MEMORY_PATTERN, "a Memory")


def find_proc_by_modules_obj(object):
    return find_proc_by_pattern(object, MODULES_PATTERN, "a ModuleContainer")


def find_thread_by_num(id):
    if id != util.selected_thread():
        util.select_thread(id)
    return util.selected_thread()


def find_thread_by_pattern(pattern, object, err_msg):
    mat = pattern.fullmatch(object.path)
    if mat is None:
        raise TypeError(f"{object} is not {err_msg}")
    pnum = int(mat['procnum'])
    tnum = int(mat['tnum'])
    find_proc_by_num(pnum)
    return find_thread_by_num(tnum)


def find_thread_by_obj(object):
    return find_thread_by_pattern(THREAD_PATTERN, object, "a Thread")


def find_thread_by_stack_obj(object):
    return find_thread_by_pattern(STACK_PATTERN, object, "a Stack")


def find_thread_by_regs_obj(object):
    return find_thread_by_pattern(REGS_PATTERN, object, "a RegisterValueContainer")


def find_frame_by_level(level):
    tnum = util.selected_thread()
    thread = commands.prog.thread(tnum)
    try:
        frames = thread.stack_trace()
    except Exception as e:
        print(e)
        return

    for i, f in enumerate(frames):
        if i == level:
            if i != util.selected_frame():
                util.select_frame(i)
            return i, f


def find_frame_by_pattern(pattern, object, err_msg):
    mat = pattern.fullmatch(object.path)
    if mat is None:
        raise TypeError(f"{object} is not {err_msg}")
    pnum = int(mat['procnum'])
    tnum = int(mat['tnum'])
    level = int(mat['level'])
    find_proc_by_num(pnum)
    find_thread_by_num(tnum)
    return find_frame_by_level(level)


def find_frame_by_obj(object):
    return find_frame_by_pattern(FRAME_PATTERN, object, "a StackFrame")


def find_frame_by_regs_obj(object):
    return find_frame_by_pattern(REGS_PATTERN, object, "a RegisterValueContainer")


def find_frame_by_locals_obj(object):
    return find_frame_by_pattern(LOCALS_PATTERN, object, "a LocalsContainer")


def find_module_by_base(modbase):
    for m in commands.prog.modules():
        if modbase == str(hex(m.address_range[0])):
            return m


def find_module_by_pattern(pattern, object, err_msg):
    mat = pattern.fullmatch(object.path)
    if mat is None:
        raise TypeError(f"{object} is not {err_msg}")
    pnum = int(mat['procnum'])
    modbase = mat['modbase']
    find_proc_by_num(pnum)
    return find_module_by_base(modbase)


def find_module_by_obj(object):
    return find_module_by_pattern(MODULE_PATTERN, object, "a Module")


shared_globals = dict()


@REGISTRY.method
def execute(cmd: str, to_string: bool=False):
    """Execute a Python3 command or script."""
    if to_string:
        data = StringIO()
        with redirect_stdout(data):
            exec(cmd, shared_globals)
        return data.getvalue()
    else:
        exec(cmd, shared_globals)


@REGISTRY.method(action='refresh', display='Refresh Processes')
def refresh_processes(node: sch.Schema('ProcessContainer')):
    """Refresh the list of processes."""
    with commands.open_tracked_tx('Refresh Processes'):
        commands.ghidra_trace_put_processes()


@REGISTRY.method(action='refresh', display='Refresh Environment')
def refresh_environment(node: sch.Schema('Environment')):
    """Refresh the environment descriptors (arch, os, endian)."""
    with commands.open_tracked_tx('Refresh Environment'):
        commands.ghidra_trace_put_environment()


@REGISTRY.method(action='refresh', display='Refresh Threads')
def refresh_threads(node: sch.Schema('ThreadContainer')):
    """Refresh the list of threads in the process."""
    with commands.open_tracked_tx('Refresh Threads'):
        commands.ghidra_trace_put_threads()


# @REGISTRY.method(action='refresh', display='Refresh Symbols')
# def refresh_symbols(node: sch.Schema('SymbolContainer')):
# """Refresh the list of symbols in the process."""
# with commands.open_tracked_tx('Refresh Symbols'):
# commands.ghidra_trace_put_symbols()


@REGISTRY.method(action='show_symbol', display='Retrieve Symbols')
def retrieve_symbols(
        session: sch.Schema('SymbolContainer'),
        pattern: ParamDesc(str, display='Pattern')):
    """
    Load the symbol set matching the pattern.
    """
    with commands.open_tracked_tx('Retrieve Symbols'):
        commands.put_symbols(pattern)


@REGISTRY.method(action='refresh', display='Refresh Stack')
def refresh_stack(node: sch.Schema('Stack')):
    """Refresh the backtrace for the thread."""
    tnum = find_thread_by_stack_obj(node)
    with commands.open_tracked_tx('Refresh Stack'):
        commands.ghidra_trace_put_frames()


@REGISTRY.method(action='refresh', display='Refresh Registers')
def refresh_registers(node: sch.Schema('RegisterValueContainer')):
    """Refresh the register values for the selected frame"""
    level = find_frame_by_regs_obj(node)
    with commands.open_tracked_tx('Refresh Registers'):
        commands.ghidra_trace_putreg()


@REGISTRY.method(action='refresh', display='Refresh Locals')
def refresh_locals(node: sch.Schema('LocalsContainer')):
    """Refresh the local values for the selected frame"""
    level = find_frame_by_locals_obj(node)
    with commands.open_tracked_tx('Refresh Locals'):
        commands.ghidra_trace_put_locals()


@REGISTRY.method(action='refresh', display='Refresh Memory')
def refresh_mappings(node: sch.Schema('Memory')):
    """Refresh the list of memory regions for the process."""
    with commands.open_tracked_tx('Refresh Memory Regions'):
        commands.ghidra_trace_put_regions()


@REGISTRY.method(action='refresh', display='Refresh Modules')
def refresh_modules(node: sch.Schema('ModuleContainer')):
    """
    Refresh the modules list for the process.
    """
    with commands.open_tracked_tx('Refresh Modules'):
        commands.ghidra_trace_put_modules()


@REGISTRY.method(action='activate')
def activate_process(process: sch.Schema('Process')):
    """Switch to the process."""
    find_proc_by_obj(process)


@REGISTRY.method(action='activate')
def activate_thread(thread: sch.Schema('Thread')):
    """Switch to the thread."""
    find_thread_by_obj(thread)


@REGISTRY.method(action='activate')
def activate_frame(frame: sch.Schema('StackFrame')):
    """Select the frame."""
    i, f = find_frame_by_obj(frame)
    util.select_frame(i)
    with commands.open_tracked_tx('Refresh Stack'):
        commands.ghidra_trace_put_frames()
    with commands.open_tracked_tx('Refresh Registers'):
        commands.ghidra_trace_putreg()


@REGISTRY.method
def read_mem(process: sch.Schema('Process'), range: AddressRange):
    """Read memory."""
    # print("READ_MEM: process={}, range={}".format(process, range))
    nproc = find_proc_by_obj(process)
    offset_start = process.trace.memory_mapper.map_back(
        nproc, Address(range.space, range.min))
    with commands.open_tracked_tx('Read Memory'):
        result = commands.put_bytes(
            offset_start, offset_start + range.length() - 1, pages=True, display_result=False)
        if result['count'] == 0:
            commands.putmem_state(
                offset_start, offset_start+range.length() - 1, 'error')


@REGISTRY.method(action='attach', display='Attach by pid')
def attach_pid(
        processes: sch.Schema('ProcessContainer'),
        pid: ParamDesc(str, display='PID')):
    """Attach to the process with the given PID."""
    prog = drgn.Program()
    prog.set_pid(int(pid))
    util.selected_pid = int(pid)
    util.selected_tid = prog.main_thread().tid
    default_symbols = {"default": True, "main": True}
    try:
        prog.load_debug_info(None, **default_symbols)
    except drgn.MissingDebugInfoError as e:
        print(e)
    #commands.ghidra_trace_start(pid)
    commands.PROGRAMS[pid] = prog
    commands.prog = prog
    with commands.open_tracked_tx('Refresh Processes'):
        commands.ghidra_trace_put_processes()


@REGISTRY.method(action='attach', display='Attach core dump')
def attach_core(
        processes: sch.Schema('ProcessContainer'),
        core: ParamDesc(str, display='Core dump')):
    """Attach to the given core dump."""
    prog = drgn.Program()
    prog.set_core_dump(core)
    default_symbols = {"default": True, "main": True}
    try:
        prog.load_debug_info(None, **default_symbols)
    except drgn.MissingDebugInfoError as e:
        print(e)

    util.selected_pid += 1
    commands.PROGRAMS[util.selected_pid] = prog
    commands.prog = prog
    with commands.open_tracked_tx('Refresh Processes'):
        commands.ghidra_trace_put_processes()


@REGISTRY.method(action='step_into')
def step_into(thread: sch.Schema('Thread'), n: ParamDesc(int, display='N')=1):
    """Step one instruction exactly."""
    find_thread_by_obj(thread)
    time.sleep(1)
    hooks.on_stop(None)


# @REGISTRY.method
# def kill(process: sch.Schema('Process')):
# """Kill execution of the process."""
# commands.ghidra_trace_kill()
# @REGISTRY.method(action='resume')
# def go(process: sch.Schema('Process')):
# """Continue execution of the process."""
# util.dbg.run_async(lambda: dbg().go())
# @REGISTRY.method
# def interrupt(process: sch.Schema('Process')):
# """Interrupt the execution of the debugged program."""
# # SetInterrupt is reentrant, so bypass the thread checks
# util.dbg._protected_base._control.SetInterrupt(
# DbgEng.DEBUG_INTERRUPT_ACTIVE)
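
The `find_*_by_pattern` helpers in this file all rely on the same idea: each object path pattern is built by appending to its parent's pattern, and `fullmatch` both validates the path and extracts the numeric keys. The standalone sketch below shows that idea with only the standard library. `parse_frame_path` is a hypothetical helper, and the example path is an assumption about what a frame path looks like, inferred from the patterns rather than taken from a real trace.

```python
import re


def extre(base, ext):
    """Extend a compiled pattern with a suffix, as the module does."""
    return re.compile(base.pattern + ext)


PROCESS = re.compile(r'Processes\[(?P<procnum>\d*)\]')
THREAD = extre(PROCESS, r'\.Threads\[(?P<tnum>\d*)\]')
FRAME = extre(THREAD, r'\.Stack\[(?P<level>\d*)\]')


def parse_frame_path(path):
    """Return (procnum, tnum, level) for a frame path, else raise TypeError."""
    mat = FRAME.fullmatch(path)
    if mat is None:
        raise TypeError(f"{path} is not a StackFrame path")
    return int(mat['procnum']), int(mat['tnum']), int(mat['level'])


if __name__ == '__main__':
    print(parse_frame_path('Processes[1].Threads[42].Stack[0]'))
```

Because `fullmatch` is used, a thread path does not accidentally match the frame pattern, which is why the helpers can raise `TypeError` when a method is invoked on the wrong kind of object.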

@ -0,0 +1,183 @@
<context>
<schema name="DrgnRoot" canonical="yes" elementResync="NEVER" attributeResync="NEVER">
<attribute name="Processes" schema="ProcessContainer" required="yes" fixed="yes" />
<attribute name="State" schema="ANY" />
<attribute-alias from="_state" to="State" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY"/>
</schema>
<schema name="Selectable" elementResync="NEVER" attributeResync="NEVER">
<element schema="OBJECT" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="ProcessContainer" canonical="yes" elementResync="NEVER" attributeResync="NEVER">
<element schema="Process" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="Process" elementResync="NEVER" attributeResync="NEVER">
<interface name="Activatable" />
<interface name="Process" />
<interface name="Aggregate" />
<interface name="ExecutionStateful" />
<element schema="VOID" />
<attribute name="Threads" schema="ThreadContainer" required="yes" fixed="yes" />
<attribute name="Symbols" schema="SymbolContainer" required="yes" fixed="yes" />
<attribute name="Exit Code" schema="LONG" />
<attribute-alias from="_exit_code" to="Exit Code" />
<attribute name="Environment" schema="Environment" required="yes" fixed="yes" />
<attribute name="Memory" schema="Memory" required="yes" fixed="yes" />
<attribute name="Modules" schema="ModuleContainer" required="yes" fixed="yes" />
<attribute name="Handle" schema="STRING" fixed="yes" />
<attribute name="Id" schema="STRING" fixed="yes" />
<attribute name="PID" schema="LONG" hidden="yes" />
<attribute-alias from="_pid" to="PID" />
<attribute name="State" schema="EXECUTION_STATE" required="yes" hidden="yes" />
<attribute-alias from="_state" to="State" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_short_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="Environment" elementResync="NEVER" attributeResync="NEVER">
<interface name="Environment" />
<element schema="VOID" />
<attribute name="OS" schema="STRING" />
<attribute name="Arch" schema="STRING" />
<attribute name="Endian" schema="STRING" />
<attribute name="Debugger" schema="STRING" />
<attribute-alias from="_os" to="OS" />
<attribute-alias from="_arch" to="Arch" />
<attribute-alias from="_endian" to="Endian" />
<attribute-alias from="_debugger" to="Debugger" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="ModuleContainer" canonical="yes" elementResync="ONCE" attributeResync="NEVER">
<element schema="Module" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="Memory" canonical="yes" elementResync="NEVER" attributeResync="NEVER">
<interface name="Memory" />
<element schema="MemoryRegion" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="ThreadContainer" canonical="yes" elementResync="NEVER" attributeResync="NEVER">
<element schema="Thread" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="Thread" elementResync="NEVER" attributeResync="NEVER">
<interface name="Activatable" />
<interface name="Thread" />
<interface name="ExecutionStateful" />
<interface name="Aggregate" />
<element schema="VOID" />
<attribute name="Stack" schema="Stack" required="yes" fixed="yes" />
<attribute name="Environment" schema="ANY" fixed="yes" />
<attribute name="Id" schema="STRING" fixed="yes" />
<attribute name="TID" schema="LONG" />
<attribute-alias from="_tid" to="TID" />
<attribute name="State" schema="EXECUTION_STATE" required="yes" hidden="yes" />
<attribute-alias from="_state" to="State" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_short_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="Module" elementResync="NEVER" attributeResync="NEVER">
<interface name="Module" />
<element schema="VOID" />
<attribute name="Sections" schema="SectionContainer" required="yes" fixed="yes" />
<attribute name="Symbols" schema="SymbolContainer" required="yes" fixed="yes" />
<attribute name="Range" schema="RANGE" />
<attribute name="Name" schema="STRING" />
<attribute-alias from="_module_name" to="Name" />
<attribute-alias from="_range" to="Range" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute name="ToDisplayString" schema="BOOL" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="MemoryRegion" elementResync="NEVER" attributeResync="NEVER">
<interface name="MemoryRegion" />
<element schema="VOID" />
<attribute name="Object File" schema="STRING" fixed="yes" />
<attribute name="_readable" schema="BOOL" required="yes" hidden="yes" />
<attribute name="_writable" schema="BOOL" required="yes" hidden="yes" />
<attribute name="_executable" schema="BOOL" required="yes" hidden="yes" />
<attribute name="Range" schema="RANGE" required="yes" />
<attribute-alias from="_range" to="Range" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="SectionContainer" canonical="yes" elementResync="NEVER" attributeResync="NEVER">
<element schema="Section" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="Stack" canonical="yes" elementResync="NEVER" attributeResync="NEVER">
<interface name="Stack" />
<element schema="StackFrame" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="SymbolContainer" canonical="yes" elementResync="ONCE" attributeResync="NEVER">
<element schema="Symbol" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="Symbol" elementResync="NEVER" attributeResync="NEVER">
<element schema="VOID" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="StackFrame" elementResync="NEVER" attributeResync="NEVER">
<interface name="Activatable" />
<interface name="StackFrame" />
<interface name="Aggregate" />
<element schema="VOID" />
<attribute name="Function" schema="STRING" hidden="yes" />
<attribute-alias from="_function" to="Function" />
<attribute name="PC" schema="ADDRESS" required="yes" />
<attribute-alias from="_pc" to="PC" />
<attribute name="SP" schema="ADDRESS" />
<attribute name="Registers" schema="RegisterValueContainer" required="yes" fixed="yes" />
<attribute name="Locals" schema="LocalsContainer" required="yes" fixed="yes" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
<schema name="Section" elementResync="NEVER" attributeResync="NEVER">
<interface name="Section" />
<element schema="VOID" />
<attribute name="Range" schema="RANGE" />
<attribute-alias from="_range" to="Range" />
<attribute name="Offset" schema="STRING" fixed="yes" />
<attribute name="_display" schema="STRING" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="RegisterValueContainer" attributeResync="ONCE">
<interface name="RegisterContainer" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="VOID" />
</schema>
<schema name="LocalsContainer" attributeResync="ONCE">
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="Local" />
</schema>
<schema name="Local" attributeResync="ONCE">
<attribute name="Address" schema="ADDRESS" />
<attribute name="Kind" schema="ANY" hidden="yes" />
<attribute name="_order" schema="INT" hidden="yes" />
<attribute schema="ANY" />
</schema>
</context>
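
The schema above marks some attributes as `required="yes"`. As a rough illustration of how such a file can be inspected, the sketch below parses a shortened excerpt of the `Thread` schema with Python's standard `xml.etree.ElementTree` and lists the required attribute names per schema. The helper name `required_attributes` is an assumption made for this example, and this is not how Ghidra itself loads the schema.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the Thread schema shown above.
SCHEMA_EXCERPT = """
<context>
    <schema name="Thread" elementResync="NEVER" attributeResync="NEVER">
        <interface name="Thread" />
        <attribute name="Stack" schema="Stack" required="yes" fixed="yes" />
        <attribute name="TID" schema="LONG" />
        <attribute-alias from="_tid" to="TID" />
        <attribute schema="ANY" />
    </schema>
</context>
"""


def required_attributes(xml_text):
    """Map each schema name to the names of its required attributes."""
    root = ET.fromstring(xml_text)
    result = {}
    for schema in root.findall("schema"):
        result[schema.get("name")] = [
            attr.get("name")
            for attr in schema.findall("attribute")
            if attr.get("required") == "yes"
        ]
    return result


required = required_attributes(SCHEMA_EXCERPT)
```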


@ -0,0 +1,115 @@
## ###
# IP: GHIDRA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##
from collections import namedtuple
import os
import re
import sys

import drgn
import drgn.cli


DrgnVersion = namedtuple('DrgnVersion', ['display', 'full'])

selected_pid = 0
selected_tid = 0
selected_level = 0


def _compute_drgn_ver():
    blurb = drgn.cli.version_header()
    top = blurb.split('\n')[0]
    full = top.split()[1]  # "drgn x.y.z"
    return DrgnVersion(top, full)


DRGN_VERSION = _compute_drgn_ver()


def full_mem(self):
    return Region(0, 1 << 64, 0, None, 'full memory')


def get_debugger():
    return drgn


def get_target():
    return commands.prog


def get_process(name):
    return get_target()[name]


def selected_process():
    return selected_pid


def selected_thread():
    return selected_tid


def selected_frame():
    return selected_level


def select_process(id: int):
    global selected_pid
    selected_pid = id
    return selected_pid


def select_thread(id: int):
    global selected_tid
    selected_tid = id
    return selected_tid


def select_frame(id: int):
    global selected_level
    selected_level = id
    return selected_level


conv_map = {}


def get_convenience_variable(id):
    #val = get_target().GetEnvironment().Get(id)
    if id not in conv_map:
        return "auto"
    val = conv_map[id]
    if val is None:
        return "auto"
    return val


def set_convenience_variable(id, value):
    #env = get_target().GetEnvironment()
    # return env.Set(id, value, True)
    conv_map[id] = value


def escape_ansi(line):
    ansi_escape = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]')
    return ansi_escape.sub('', line)


def debracket(init):
    val = init
    val = val.replace("[", "(")
    val = val.replace("]", ")")
    return val
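
The two string helpers at the end of this file are easiest to understand from a small example. The sketch below restates `escape_ansi` and `debracket` in standalone form, without the `drgn` imports, and applies them to sample inputs. The sample strings are invented for illustration.

```python
import re

# Same pattern as in escape_ansi above: a CSI introducer, then parameter
# bytes, intermediate bytes, and one final byte.
ANSI_ESCAPE = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]')


def escape_ansi(line):
    """Remove ANSI CSI escape sequences (for example, color codes)."""
    return ANSI_ESCAPE.sub('', line)


def debracket(init):
    """Replace square brackets with parentheses."""
    return init.replace("[", "(").replace("]", ")")


# Invented sample inputs.
plain = escape_ansi("\x1b[1;31mdrgn\x1b[0m ready")
key = debracket("task[3]")
```

Here `plain` is the text with the color codes removed and `key` has its brackets replaced, which is presumably useful where square brackets would otherwise be read as index syntax in an object path.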


@ -1043,6 +1043,74 @@ java -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=localhost:543
execution.</LI>
</UL>
<H2>Drgn Launchers</H2>
<P>The following launchers use Meta's <B>drgn</B> engine to explore various targets:</P>
<H3><A name="drgn"></A>drgn</H3>
<P>This launcher attaches to a running process via the Linux "/proc/pid" interface.</P>
<H4><A name="drgn_setup"></A>Setup</H4>
<P>You must have Meta's <B>drgn</B> installed on the local system. The default behavior
assumes you do NOT need root access to attach to a running process, i.e. it assumes you
have run the command:</P>
<UL style="list-style-type: none">
<LI>
<PRE>
echo 0 > /proc/sys/kernel/yama/ptrace_scope
</PRE>
</LI>
</UL>
<P>using root privileges at some point. Alternatively, you can prepend "sudo -E"
to the drgn invocation line in "local-drgn.sh". Note: <B>drgn</B> does not currently
support stack unwinding or register access when attached to a running user-mode process.
</P>
<H4>Options</H4>
<UL>
<LI><B>PID</B>: The running process's id</LI>
</UL>
<H3><A name="drgn-core"></A>drgn-core</H3>
<P>This launcher loads a Linux core dump.</P>
<H4><A name="drgn_core_setup"></A>Setup</H4>
<P>You must have Meta's <B>drgn</B> installed on the local system. No other setup is required.
Note: Core dumps may or may not include memory, so the Dynamic Listing may or may not be populated.
</P>
<H4>Options</H4>
<UL>
<LI><B>Core dump</B>: The core-dump file</LI>
</UL>
<H3><A name="drgn-kernel"></A>drgn-kernel</H3>
<P>This launcher attaches to a Linux kernel via the "/proc/kcore" interface.</P>
<H4><A name="drgn_kernel_setup"></A>Setup</H4>
<P>You must have Meta's <B>drgn</B> installed on the local system. No other setup is required.
Note: This launcher requires root access; you will be prompted for a password in the Terminal.
</P>
<H4>Options</H4>
<UL>
<LI><B>None</B></LI>
</UL>
<H2>Development and Diagnostic Launchers</H2>
<P>We currently provide one launcher for Trace RMI API exploration and development:</P>
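
The setup notes for the drgn launcher above depend on the value in `/proc/sys/kernel/yama/ptrace_scope`. The following is a minimal sketch, not part of the launcher, of checking that value before attaching. The function name `ptrace_scope_allows_attach` is hypothetical, and treating a missing file as "no Yama restriction" is an assumption.

```python
from pathlib import Path

# Path quoted in the help text above.
PTRACE_SCOPE = "/proc/sys/kernel/yama/ptrace_scope"


def ptrace_scope_allows_attach(path=PTRACE_SCOPE):
    """Return True when the Yama setting is 0 (classic ptrace permissions).

    Assumption: a missing file means Yama is not enabled, so no extra
    restriction applies.
    """
    try:
        value = Path(path).read_text().strip()
    except FileNotFoundError:
        return True
    return int(value) == 0


if __name__ == "__main__":
    if ptrace_scope_allows_attach():
        print("non-root attach should be permitted")
    else:
        print("ptrace_scope is restrictive; consider the sudo -E route")
```

The path is a parameter so the check can be exercised against a temporary file instead of the live system setting.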


@ -0,0 +1,379 @@
/* ###
* IP: GHIDRA
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package agent.drgn.rmi;
import static org.junit.Assert.*;
import static org.junit.Assume.*;
import java.io.FileWriter;
import java.io.IOException;
import java.net.*;
import java.nio.file.*;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.*;
import org.apache.commons.lang3.exception.ExceptionUtils;
import org.junit.Before;
import generic.jar.ResourceFile;
import ghidra.app.plugin.core.debug.gui.AbstractGhidraHeadedDebuggerTest;
import ghidra.app.plugin.core.debug.service.tracermi.TraceRmiPlugin;
import ghidra.app.plugin.core.debug.utils.ManagedDomainObject;
import ghidra.app.services.TraceRmiService;
import ghidra.debug.api.tracermi.*;
import ghidra.framework.*;
import ghidra.framework.main.ApplicationLevelOnlyPlugin;
import ghidra.framework.model.DomainFile;
import ghidra.framework.plugintool.Plugin;
import ghidra.framework.plugintool.PluginsConfiguration;
import ghidra.framework.plugintool.util.*;
import ghidra.pty.testutil.DummyProc;
import ghidra.util.Msg;
import junit.framework.AssertionFailedError;
public abstract class AbstractDrgnTraceRmiTest extends AbstractGhidraHeadedDebuggerTest {
protected static String CORE = "core.12137";
protected static String MDO = "/New Traces/" + CORE;
public static String PREAMBLE = """
import os
import drgn
import drgn.cli
os.environ['OPT_TARGET_KIND'] = 'coredump'
os.environ['OPT_TARGET_IMG'] = '$CORE'
from ghidradrgn.commands import *
""";
// Connecting should be the first thing the script does, so use a tight timeout.
protected static final int CONNECT_TIMEOUT_MS = 3000;
// Note: this value is passed with TimeUnit.SECONDS, so it is 30000 seconds, not milliseconds.
protected static final int TIMEOUT_SECONDS = 30000;
protected static final int QUIT_TIMEOUT_MS = 1000;
protected static boolean didSetupPython = false;
protected TraceRmiService traceRmi;
private Path pythonPath;
private Path outFile;
private Path errFile;
@Before
public void assertOS() {
assumeTrue(OperatingSystem.CURRENT_OPERATING_SYSTEM == OperatingSystem.LINUX);
}
//@BeforeClass
public static void setupPython() throws Throwable {
if (didSetupPython) {
// Only do this once when running the full suite.
return;
}
String gradle = DummyProc.which("gradle");
new ProcessBuilder(gradle, "Debugger-agent-drgn:assemblePyPackage")
.directory(TestApplicationUtils.getInstallationDirectory())
.inheritIO()
.start()
.waitFor();
didSetupPython = true;
}
protected void setPythonPath(ProcessBuilder pb) throws IOException {
String sep =
OperatingSystem.CURRENT_OPERATING_SYSTEM == OperatingSystem.WINDOWS ? ";" : ":";
String rmiPyPkg = Application.getModuleSubDirectory("Debugger-rmi-trace",
"build/pypkg/src").getAbsolutePath();
String drgnPyPkg = Application.getModuleSubDirectory("Debugger-agent-drgn",
"build/pypkg/src").getAbsolutePath();
String add = rmiPyPkg + sep + drgnPyPkg;
pb.environment().compute("PYTHONPATH", (k, v) -> v == null ? add : (v + sep + add));
}
@Before
public void setupTraceRmi() throws Throwable {
traceRmi = addPlugin(tool, TraceRmiPlugin.class);
try {
pythonPath = Paths.get(DummyProc.which("drgn"));
}
catch (RuntimeException e) {
Msg.error(this, e);
}
outFile = Files.createTempFile("drgnout", null);
errFile = Files.createTempFile("drgnerr", null);
}
protected void addAllDebuggerPlugins() throws PluginException {
PluginsConfiguration plugConf = new PluginsConfiguration() {
@Override
protected boolean accepts(Class<? extends Plugin> pluginClass) {
return !ApplicationLevelOnlyPlugin.class.isAssignableFrom(pluginClass);
}
};
for (PluginDescription pd : plugConf
.getPluginDescriptions(PluginPackage.getPluginPackage("Debugger"))) {
addPlugin(tool, pd.getPluginClass());
}
}
protected static String addrToStringForPython(InetAddress address) {
if (address.isAnyLocalAddress()) {
return "127.0.0.1"; // Can't connect to 0.0.0.0 as such. Choose localhost.
}
return address.getHostAddress();
}
protected static String sockToStringForPython(SocketAddress address) {
if (address instanceof InetSocketAddress tcp) {
return addrToStringForPython(tcp.getAddress()) + ":" + tcp.getPort();
}
throw new AssertionError("Unhandled address type " + address);
}
protected record PythonResult(boolean timedOut, int exitCode, String stdout, String stderr) {
protected String handle() {
if (stderr.contains("RuntimeError") || stderr.contains(" Error") || (0 != exitCode && 1 != exitCode && 143 != exitCode)) {
throw new PythonError(exitCode, stdout, stderr);
}
System.out.println("--stdout--");
System.out.println(stdout);
System.out.println("--stderr--");
System.out.println(stderr);
return stdout;
}
}
protected record ExecInDrgn(Process python, CompletableFuture<PythonResult> future) {
}
@SuppressWarnings("resource") // Do not close stdin
protected ExecInDrgn execInDrgn(String script) throws IOException {
ResourceFile rf = Application.getModuleDataFile("TestResources", CORE);
script = script.replace("$CORE", rf.getAbsolutePath());
Path fp = Files.createTempFile("test", ".py");
FileWriter fw = new FileWriter(fp.toFile());
fw.write(script);
fw.close();
ProcessBuilder pb = new ProcessBuilder(pythonPath.toString(), "-c",
rf.getAbsolutePath(), fp.toFile().getAbsolutePath());
setPythonPath(pb);
// If commands come from file, Python will quit after EOF.
Msg.info(this, "outFile: " + outFile);
Msg.info(this, "errFile: " + errFile);
//pb.inheritIO();
pb.redirectInput(ProcessBuilder.Redirect.PIPE);
pb.redirectOutput(outFile.toFile());
pb.redirectError(errFile.toFile());
Process pyproc = pb.start();
return new ExecInDrgn(pyproc, CompletableFuture.supplyAsync(() -> {
try {
if (!pyproc.waitFor(TIMEOUT_SECONDS, TimeUnit.SECONDS)) {
Msg.error(this, "Timed out waiting for Python");
pyproc.destroyForcibly();
pyproc.waitFor(TIMEOUT_SECONDS, TimeUnit.SECONDS);
return new PythonResult(true, -1, Files.readString(outFile),
Files.readString(errFile));
}
Msg.info(this, "Python exited with code " + pyproc.exitValue());
return new PythonResult(false, pyproc.exitValue(), Files.readString(outFile),
Files.readString(errFile));
}
catch (Exception e) {
return ExceptionUtils.rethrow(e);
}
finally {
pyproc.destroyForcibly();
}
}));
}
public static class PythonError extends RuntimeException {
public final int exitCode;
public final String stdout;
public final String stderr;
public PythonError(int exitCode, String stdout, String stderr) {
super("""
exitCode=%d:
----stdout----
%s
----stderr----
%s
""".formatted(exitCode, stdout, stderr));
this.exitCode = exitCode;
this.stdout = stdout;
this.stderr = stderr;
}
}
protected String runThrowError(String script) throws Exception {
CompletableFuture<PythonResult> result = execInDrgn(script).future;
return result.get(TIMEOUT_SECONDS, TimeUnit.SECONDS).handle();
}
protected record PythonAndConnection(ExecInDrgn exec, TraceRmiConnection connection)
implements AutoCloseable {
protected RemoteMethod getMethod(String name) {
return Objects.requireNonNull(connection.getMethods().get(name));
}
public void execute(String cmd) {
RemoteMethod execute = getMethod("execute");
execute.invoke(Map.of("cmd", cmd));
}
public RemoteAsyncResult executeAsync(String cmd) {
RemoteMethod execute = getMethod("execute");
return execute.invokeAsync(Map.of("cmd", cmd));
}
public String executeCapture(String cmd) {
RemoteMethod execute = getMethod("execute");
return (String) execute.invoke(Map.of("cmd", cmd, "to_string", true));
}
@Override
public void close() throws Exception {
Msg.info(this, "Cleaning up python");
exec.python().destroy();
try {
PythonResult r = exec.future.get(TIMEOUT_SECONDS, TimeUnit.SECONDS);
r.handle();
waitForPass(() -> assertTrue(connection.isClosed()));
}
finally {
exec.python.destroyForcibly();
}
}
}
protected PythonAndConnection startAndConnectDrgn(Function<String, String> scriptSupplier)
throws Exception {
TraceRmiAcceptor acceptor = traceRmi.acceptOne(null);
ExecInDrgn exec =
execInDrgn(scriptSupplier.apply(sockToStringForPython(acceptor.getAddress())));
acceptor.setTimeout(CONNECT_TIMEOUT_MS);
try {
TraceRmiConnection connection = acceptor.accept();
return new PythonAndConnection(exec, connection);
}
catch (SocketTimeoutException e) {
exec.python.destroyForcibly();
exec.future.get(TIMEOUT_SECONDS, TimeUnit.SECONDS).handle();
throw e;
}
}
protected PythonAndConnection startAndConnectDrgn() throws Exception {
return startAndConnectDrgn(addr -> """
%s
ghidra_trace_connect('%s')
drgn.cli.run_interactive(prog)
""".formatted(PREAMBLE, addr));
}
@SuppressWarnings("resource")
protected String runThrowError(Function<String, String> scriptSupplier)
throws Exception {
PythonAndConnection conn = startAndConnectDrgn(scriptSupplier);
PythonResult r = conn.exec.future.get(TIMEOUT_SECONDS, TimeUnit.SECONDS);
String stdout = r.handle();
//waitForPass(() -> assertTrue(conn.connection.isClosed()));
return stdout;
}
protected String extractOutSection(String out, String head) {
String[] split = out.split("\n");
String xout = "";
for (String s : split) {
if (!s.startsWith("(python)") && !s.equals("")) {
xout += s + "\n";
}
}
return xout.split(head)[1].split("---")[0].replace("(python)", "").trim();
}
protected ManagedDomainObject openDomainObject(String path) throws Exception {
DomainFile df = env.getProject().getProjectData().getFile(path);
assertNotNull(df);
return new ManagedDomainObject(df, false, false, monitor);
}
protected ManagedDomainObject waitDomainObject(String path) throws Exception {
DomainFile df;
long start = System.currentTimeMillis();
while (true) {
df = env.getProject().getProjectData().getFile(path);
if (df != null) {
return new ManagedDomainObject(df, false, false, monitor);
}
Thread.sleep(1000);
if (System.currentTimeMillis() - start > 30000) {
throw new TimeoutException("30 seconds expired waiting for domain file");
}
}
}
protected long getMaxSnap() {
Long maxSnap = tb.trace.getTimeManager().getMaxSnap();
return maxSnap == null ? 0 : maxSnap;
}
protected void waitTxDone() {
waitFor(() -> tb.trace.getCurrentTransactionInfo() == null);
}
public static void waitForPass(Runnable runnable) {
AtomicReference<AssertionError> lastError = new AtomicReference<>();
waitForCondition(() -> {
try {
runnable.run();
return true;
}
catch (AssertionError e) {
lastError.set(e);
return false;
}
}, () -> lastError.get().getMessage());
}
public static void waitForCondition(BooleanSupplier condition,
Supplier<String> failureMessageSupplier) throws AssertionFailedError {
int totalTime = 0;
while (totalTime <= DEFAULT_WAIT_TIMEOUT * 10) {
if (condition.getAsBoolean()) {
return; // success
}
totalTime += sleep(DEFAULT_WAIT_DELAY * 10);
}
String failureMessage = "Timed-out waiting for condition";
if (failureMessageSupplier != null) {
failureMessage = failureMessageSupplier.get();
}
throw new AssertionFailedError(failureMessage);
}
}
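
`setPythonPath` in the class above appends two package directories to `PYTHONPATH`, joined by a platform-specific separator (`:` on Linux, `;` on Windows). For comparison, the sketch below performs the same merge in Python, where `os.pathsep` supplies the separator. The helper name `extend_pythonpath` and the directory names are hypothetical.

```python
import os


def extend_pythonpath(env, *paths):
    """Append paths to env['PYTHONPATH'] using the platform's separator."""
    added = os.pathsep.join(paths)
    current = env.get("PYTHONPATH")
    env["PYTHONPATH"] = added if current is None else current + os.pathsep + added
    return env["PYTHONPATH"]


# Hypothetical directories, standing in for the two build/pypkg/src locations.
env = {"PYTHONPATH": "/opt/existing"}
result = extend_pythonpath(env, "/tmp/rmi/src", "/tmp/drgn/src")
```

Taking the separator from the platform rather than hard-coding it avoids building a single unusable path entry when the wrong character is chosen.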


@ -0,0 +1,909 @@
/* ###
* IP: GHIDRA
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package agent.drgn.rmi;
import static org.hamcrest.Matchers.*;
import static org.junit.Assert.*;
import java.util.*;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import org.junit.Test;
import db.Transaction;
import generic.Unique;
import ghidra.app.plugin.core.debug.utils.ManagedDomainObject;
import ghidra.debug.api.tracermi.TraceRmiAcceptor;
import ghidra.debug.api.tracermi.TraceRmiConnection;
import ghidra.framework.Application;
import ghidra.framework.model.DomainFile;
import ghidra.program.model.address.Address;
import ghidra.program.model.address.AddressSpace;
import ghidra.program.model.data.Float10DataType;
import ghidra.program.model.lang.RegisterValue;
import ghidra.program.model.listing.CodeUnit;
import ghidra.trace.database.ToyDBTraceBuilder;
import ghidra.trace.model.Lifespan;
import ghidra.trace.model.Trace;
import ghidra.trace.model.listing.TraceCodeSpace;
import ghidra.trace.model.memory.TraceMemoryRegion;
import ghidra.trace.model.memory.TraceMemorySpace;
import ghidra.trace.model.modules.TraceModule;
import ghidra.trace.model.target.TraceObject;
import ghidra.trace.model.target.TraceObjectValue;
import ghidra.trace.model.target.path.KeyPath;
import ghidra.trace.model.target.path.PathFilter;
import ghidra.trace.model.thread.TraceThread;
import ghidra.trace.model.time.TraceSnapshot;
import ghidra.util.Msg;
public class DrgnCommandsTest extends AbstractDrgnTraceRmiTest {
//@Test
public void testManual() throws Exception {
TraceRmiAcceptor acceptor = traceRmi.acceptOne(null);
Msg.info(this,
"Use: ghidra_trace_connect('" + sockToStringForPython(acceptor.getAddress()) + "')");
TraceRmiConnection connection = acceptor.accept();
Msg.info(this, "Connected: " + sockToStringForPython(connection.getRemoteAddress()));
connection.waitClosed();
Msg.info(this, "Closed");
}
@Test
public void testConnectErrorNoArg() throws Exception {
try {
runThrowError("""
from ghidradrgn.commands import *
ghidra_trace_connect()
quit()
""");
fail();
}
catch (PythonError e) {
assertThat(e.stderr, containsString("'ghidra_trace_connect'"));
assertThat(e.stderr, containsString("'address'"));
}
}
@Test
public void testConnect() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
quit()
""".formatted(PREAMBLE, addr));
}
@Test
public void testDisconnect() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_disconnect()
quit()
""".formatted(PREAMBLE, addr));
}
@Test
public void testStartTraceDefaults() throws Exception {
// Default name and lcsp
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
assertEquals("x86:LE:64:default",
tb.trace.getBaseLanguage().getLanguageID().getIdAsString());
assertEquals("gcc",
tb.trace.getBaseCompilerSpec().getCompilerSpecID().getIdAsString());
}
}
@Test
public void testStartTraceDefaultNoFile() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_start()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject("/New Traces/drgn/noname")) {
assertThat(mdo.get(), instanceOf(Trace.class));
}
}
@Test
public void testStartTraceCustomize() throws Exception {
runThrowError(
addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create(start_trace=False)
util.set_convenience_variable('ghidra-language','Toy:BE:64:default')
util.set_convenience_variable('ghidra-compiler','default')
ghidra_trace_start('myToy')
quit()
"""
.formatted(PREAMBLE, addr));
DomainFile df = env.getProject().getProjectData().getFile("/New Traces/myToy");
assertNotNull(df);
try (ManagedDomainObject mdo = new ManagedDomainObject(df, false, false, monitor)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
assertEquals("Toy:BE:64:default",
tb.trace.getBaseLanguage().getLanguageID().getIdAsString());
assertEquals("default",
tb.trace.getBaseCompilerSpec().getCompilerSpecID().getIdAsString());
}
}
@Test
public void testStopTrace() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_stop()
quit()
""".formatted(PREAMBLE, addr));
DomainFile df =
env.getProject().getProjectData().getFile(MDO);
assertNotNull(df);
}
@Test
public void testInfo() throws Exception {
AtomicReference<String> refAddr = new AtomicReference<>();
String out = runThrowError(addr -> {
refAddr.set(addr);
return """
%s
print('---Import---')
ghidra_trace_info()
print('---BeforeConnect---')
ghidra_trace_connect('%s')
print('---Connect---')
ghidra_trace_info()
print('---Create---')
ghidra_trace_create()
print('---Start---')
ghidra_trace_info()
ghidra_trace_stop()
print('---Stop---')
ghidra_trace_info()
ghidra_trace_disconnect()
print('---Disconnect---')
ghidra_trace_info()
quit()
""".formatted(PREAMBLE, addr);
});
assertEquals("""
Not connected to Ghidra""",
extractOutSection(out, "---Import---"));
assertEquals("""
Connected to %s %s at %s
No trace""".formatted(
Application.getName(), Application.getApplicationVersion(), refAddr.get()),
extractOutSection(out, "---Connect---").replaceAll("\r", ""));
assertEquals("""
Connected to %s %s at %s
Trace active""".formatted(
Application.getName(), Application.getApplicationVersion(), refAddr.get()),
extractOutSection(out, "---Start---").replaceAll("\r", ""));
assertEquals("""
Connected to %s %s at %s
No trace""".formatted(
Application.getName(), Application.getApplicationVersion(), refAddr.get()),
extractOutSection(out, "---Stop---").replaceAll("\r", ""));
assertEquals("""
Not connected to Ghidra""",
extractOutSection(out, "---Disconnect---"));
}
@Test
public void testLcsp() throws Exception {
String out = runThrowError(addr ->
"""
%s
ghidra_trace_connect('%s')
print('---Import---')
ghidra_trace_info_lcsp()
print('---Create---')
ghidra_trace_create()
print('---File---')
ghidra_trace_info_lcsp()
util.set_convenience_variable('ghidra-language','DATA:BE:64:default')
print('---Language---')
ghidra_trace_info_lcsp()
util.set_convenience_variable('ghidra-compiler','posStack')
print('---Compiler---')
ghidra_trace_info_lcsp()
quit()
""".formatted(PREAMBLE, addr));
assertEquals("""
Selected Ghidra language: x86:LE:64:default
Selected Ghidra compiler: gcc""",
extractOutSection(out, "---File---").replaceAll("\r", ""));
assertEquals("""
Using the DATA64 compiler map
Selected Ghidra language: DATA:BE:64:default
Selected Ghidra compiler: pointer64""",
extractOutSection(out, "---Language---").replaceAll("\r", ""));
assertEquals("""
Selected Ghidra language: DATA:BE:64:default
Selected Ghidra compiler: posStack""",
extractOutSection(out, "---Compiler---").replaceAll("\r", ""));
}
@Test
public void testSnapshot() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create snapshot')
ghidra_trace_new_snap('Scripted snapshot')
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceSnapshot snapshot = Unique.assertOne(tb.trace.getTimeManager().getAllSnapshots());
assertEquals(0, snapshot.getKey());
assertEquals("Scripted snapshot", snapshot.getDescription());
}
}
@Test
public void testPutreg() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create snapshot')
ghidra_trace_new_snap('Scripted snapshot')
ghidra_trace_putreg()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
long snap = Unique.assertOne(tb.trace.getTimeManager().getAllSnapshots()).getKey();
List<TraceObjectValue> regVals = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(0),
PathFilter.parse("Processes[].Threads[].Stack[].Registers"))
.map(p -> p.getLastEntry())
.toList();
TraceObjectValue tobj = regVals.get(0);
AddressSpace t1f0 = tb.trace.getBaseAddressFactory()
.getAddressSpace(tobj.getCanonicalPath().toString());
TraceMemorySpace regs = tb.trace.getMemoryManager().getMemorySpace(t1f0, false);
RegisterValue rip = regs.getValue(snap, tb.reg("rip"));
assertEquals("3a40cdf7ff7f0000", rip.getUnsignedValue().toString(16));
try (Transaction tx = tb.trace.openTransaction("Float80 unit")) {
TraceCodeSpace code = tb.trace.getCodeManager().getCodeSpace(t1f0, true);
code.definedData()
.create(Lifespan.nowOn(0), tb.reg("st0"), Float10DataType.dataType);
}
}
}
@Test
public void testDelreg() throws Exception {
String count = IntStream.iterate(0, i -> i < 32, i -> i + 1)
.mapToObj(Integer::toString)
.collect(Collectors.joining(",", "{", "}"));
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create snapshot')
ghidra_trace_new_snap('Scripted snapshot')
ghidra_trace_putreg()
ghidra_trace_delreg()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr, count));
// The spaces will be left over, but the values should be zeroed
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
long snap = Unique.assertOne(tb.trace.getTimeManager().getAllSnapshots()).getKey();
List<TraceObjectValue> regVals = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(0),
PathFilter.parse("Processes[].Threads[].Stack[].Registers"))
.map(p -> p.getLastEntry())
.toList();
TraceObjectValue tobj = regVals.get(0);
AddressSpace t1f0 = tb.trace.getBaseAddressFactory()
.getAddressSpace(tobj.getCanonicalPath().toString());
TraceMemorySpace regs = tb.trace.getMemoryManager().getMemorySpace(t1f0, false);
RegisterValue rax = regs.getValue(snap, tb.reg("rax"));
assertEquals("0", rax.getUnsignedValue().toString(16));
}
}
@Test
public void testCreateObj() throws Exception {
String out = runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_start()
ghidra_trace_txstart('Create Object')
print('---Id---')
ghidra_trace_create_obj('Test.Objects[1]')
print('---')
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject("/New Traces/drgn/noname")) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject object = tb.trace.getObjectManager()
.getObjectByCanonicalPath(KeyPath.parse("Test.Objects[1]"));
assertNotNull(object);
String created = extractOutSection(out, "---Id---");
long id = Long.parseLong(created.split("id=")[1].split(",")[0]);
assertEquals(object.getKey(), id);
}
}
@Test
public void testInsertObj() throws Exception {
String out = runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_start()
ghidra_trace_txstart('Create Object')
ghidra_trace_create_obj('Test.Objects[1]')
print('---Lifespan---')
ghidra_trace_insert_obj('Test.Objects[1]')
print('---')
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject("/New Traces/drgn/noname")) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject object = tb.trace.getObjectManager()
.getObjectByCanonicalPath(KeyPath.parse("Test.Objects[1]"));
assertNotNull(object);
Lifespan life = Unique.assertOne(object.getLife().spans());
assertEquals(Lifespan.nowOn(0), life);
assertEquals("Inserted object: lifespan=[0,+inf)",
extractOutSection(out, "---Lifespan---"));
}
}
@Test
public void testRemoveObj() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create Object')
ghidra_trace_create_obj('Test.Objects[1]')
ghidra_trace_insert_obj('Test.Objects[1]')
ghidra_trace_set_snap(1)
ghidra_trace_remove_obj('Test.Objects[1]')
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject object = tb.trace.getObjectManager()
.getObjectByCanonicalPath(KeyPath.parse("Test.Objects[1]"));
assertNotNull(object);
Lifespan life = Unique.assertOne(object.getLife().spans());
assertEquals(Lifespan.at(0), life);
}
}
@SuppressWarnings("unchecked")
protected <T> T runTestSetValue(String extra, String drgnExpr, String gtype)
throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create Object')
ghidra_trace_create_obj('Test.Objects[1]')
ghidra_trace_insert_obj('Test.Objects[1]')
%s
ghidra_trace_set_value('Test.Objects[1]', 'test', %s, '%s')
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr, extra, drgnExpr, gtype));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject object = tb.trace.getObjectManager()
.getObjectByCanonicalPath(KeyPath.parse("Test.Objects[1]"));
assertNotNull(object);
TraceObjectValue value = object.getValue(0, "test");
return value == null ? null : (T) value.getValue();
}
}
@Test
public void testSetValueNull() throws Exception {
assertNull(runTestSetValue("", "None", "VOID"));
}
@Test
public void testSetValueBool() throws Exception {
assertEquals(Boolean.TRUE, runTestSetValue("", "True", "BOOL"));
}
@Test
public void testSetValueByte() throws Exception {
assertEquals(Byte.valueOf((byte) 1), runTestSetValue("", "'1'", "BYTE"));
}
@Test
public void testSetValueChar() throws Exception {
assertEquals(Character.valueOf('A'), runTestSetValue("", "'A'", "CHAR"));
}
@Test
public void testSetValueShort() throws Exception {
assertEquals(Short.valueOf((short) 1), runTestSetValue("", "'1'", "SHORT"));
}
@Test
public void testSetValueInt() throws Exception {
assertEquals(Integer.valueOf(1), runTestSetValue("", "'1'", "INT"));
}
@Test
public void testSetValueLong() throws Exception {
assertEquals(Long.valueOf(1), runTestSetValue("", "'1'", "LONG"));
}
@Test
public void testSetValueString() throws Exception {
assertEquals("HelloWorld!", runTestSetValue("", "\'HelloWorld!\'", "STRING"));
}
@Test //- how do we input long strings in python
public void testSetValueStringWide() throws Exception {
assertEquals("HelloWorld!", runTestSetValue("", "u\'HelloWorld!\'", "STRING"));
}
@Test
public void testSetValueBoolArr() throws Exception {
assertArrayEquals(new boolean[] { true, false },
runTestSetValue("", "[True,False]", "BOOL_ARR"));
}
@Test
public void testSetValueByteArrUsingString() throws Exception {
assertArrayEquals(new byte[] { 'H', 1, 'W' },
runTestSetValue("", "'H\\1W'", "BYTE_ARR"));
}
@Test
public void testSetValueByteArrUsingArray() throws Exception {
assertArrayEquals(new byte[] { 'H', 0, 'W' },
runTestSetValue("", "['H',0,'W']", "BYTE_ARR"));
}
@Test
public void testSetValueCharArrUsingString() throws Exception {
assertArrayEquals(new char[] { 'H', 1, 'W' },
runTestSetValue("", "'H\\1W'", "CHAR_ARR"));
}
@Test
public void testSetValueCharArrUsingArray() throws Exception {
assertArrayEquals(new char[] { 'H', 0, 'W' },
runTestSetValue("", "['H',0,'W']", "CHAR_ARR"));
}
@Test
public void testSetValueShortArrUsingString() throws Exception {
assertArrayEquals(new short[] { 'H', 1, 'W' },
runTestSetValue("", "'H\\1W'", "SHORT_ARR"));
}
@Test
public void testSetValueShortArrUsingArray() throws Exception {
assertArrayEquals(new short[] { 'H', 0, 'W' },
runTestSetValue("", "['H',0,'W']", "SHORT_ARR"));
}
@Test
public void testSetValueIntArrayUsingMixedArray() throws Exception {
// Because explicit array type is chosen, we get null terminator
assertArrayEquals(new int[] { 'H', 0, 'W' },
runTestSetValue("", "['H',0,'W']", "INT_ARR"));
}
@Test
public void testSetValueIntArrUsingArray() throws Exception {
assertArrayEquals(new int[] { 1, 2, 3, 4 },
runTestSetValue("", "[1,2,3,4]", "INT_ARR"));
}
@Test
public void testSetValueLongArr() throws Exception {
assertArrayEquals(new long[] { 1, 2, 3, 4 },
runTestSetValue("", "[1,2,3,4]", "LONG_ARR"));
}
@Test
public void testSetValueStringArr() throws Exception {
assertArrayEquals(new String[] { "1", "A", "dead", "beef" },
runTestSetValue("", "['1','A','dead','beef']", "STRING_ARR"));
}
@Test
public void testSetValueAddress() throws Exception {
Address address = runTestSetValue("", "0xdeadbeef", "ADDRESS");
// Don't have the address factory to create expected address
assertEquals(0xdeadbeefL, address.getOffset());
assertEquals("ram", address.getAddressSpace().getName());
}
@Test
public void testSetValueObject() throws Exception {
TraceObject object = runTestSetValue("", "'Test.Objects[1]'", "OBJECT");
assertEquals("Test.Objects[1]", object.getCanonicalPath().toString());
}
@Test
public void testRetainValues() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create Object')
ghidra_trace_create_obj('Test.Objects[1]')
ghidra_trace_insert_obj('Test.Objects[1]')
ghidra_trace_set_value('Test.Objects[1]', '[1]', '"A"', 'STRING')
ghidra_trace_set_value('Test.Objects[1]', '[2]', '"B"', 'STRING')
ghidra_trace_set_value('Test.Objects[1]', '[3]', '"C"', 'STRING')
ghidra_trace_set_snap(10)
ghidra_trace_retain_values('Test.Objects[1]', '[1] [3]')
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject object = tb.trace.getObjectManager()
.getObjectByCanonicalPath(KeyPath.parse("Test.Objects[1]"));
assertNotNull(object);
assertEquals(Map.ofEntries(
Map.entry("[1]", Lifespan.nowOn(0)),
Map.entry("[2]", Lifespan.span(0, 9)),
Map.entry("[3]", Lifespan.nowOn(0))),
object.getValues(Lifespan.ALL)
.stream()
.collect(Collectors.toMap(v -> v.getEntryKey(), v -> v.getLifespan())));
}
}
@Test
public void testGetObj() throws Exception {
String out = runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_start()
ghidra_trace_txstart('Create Object')
print('---Id---')
ghidra_trace_create_obj('Test.Objects[1]')
print('---')
ghidra_trace_txcommit()
print('---GetObject---')
ghidra_trace_get_obj('Test.Objects[1]')
print('---')
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject("/New Traces/drgn/noname")) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject object = tb.trace.getObjectManager()
.getObjectByCanonicalPath(KeyPath.parse("Test.Objects[1]"));
assertNotNull(object);
assertEquals("1\tTest.Objects[1]", extractOutSection(out, "---GetObject---"));
}
}
@Test
public void testGetValues() throws Exception {
String out = runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create Object')
ghidra_trace_create_obj('Test.Objects[1]')
ghidra_trace_insert_obj('Test.Objects[1]')
ghidra_trace_set_value('Test.Objects[1]', 'vnull', None, 'VOID')
ghidra_trace_set_value('Test.Objects[1]', 'vbool', True, 'BOOL')
ghidra_trace_set_value('Test.Objects[1]', 'vbyte', '1', 'BYTE')
ghidra_trace_set_value('Test.Objects[1]', 'vchar', 'A', 'CHAR')
ghidra_trace_set_value('Test.Objects[1]', 'vshort', '2', 'SHORT')
ghidra_trace_set_value('Test.Objects[1]', 'vint', '3', 'INT')
ghidra_trace_set_value('Test.Objects[1]', 'vlong', '4', 'LONG')
ghidra_trace_set_value('Test.Objects[1]', 'vstring', 'Hello', 'STRING')
vboolarr = [True, False]
ghidra_trace_set_value('Test.Objects[1]', 'vboolarr', vboolarr, 'BOOL_ARR')
vbytearr = [1, 2, 3]
ghidra_trace_set_value('Test.Objects[1]', 'vbytearr', vbytearr, 'BYTE_ARR')
vchararr = 'Hello'
ghidra_trace_set_value('Test.Objects[1]', 'vchararr', vchararr, 'CHAR_ARR')
vshortarr = [1, 2, 3]
ghidra_trace_set_value('Test.Objects[1]', 'vshortarr', vshortarr, 'SHORT_ARR')
vintarr = [1, 2, 3]
ghidra_trace_set_value('Test.Objects[1]', 'vintarr', vintarr, 'INT_ARR')
vlongarr = [1, 2, 3]
ghidra_trace_set_value('Test.Objects[1]', 'vlongarr', vlongarr, 'LONG_ARR')
ghidra_trace_set_value('Test.Objects[1]', 'vaddr', 0xdeadbeef, 'ADDRESS')
ghidra_trace_set_value('Test.Objects[1]', 'vobj', 'Test.Objects[1]', 'OBJECT')
ghidra_trace_txcommit()
print('---GetValues---')
ghidra_trace_get_values('Test.Objects[1].')
print('---')
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
assertEquals("""
Parent Key Span Value Type
Test.Objects[1] vaddr [0,+inf) ram:deadbeef ADDRESS
Test.Objects[1] vbool [0,+inf) True BOOL
Test.Objects[1] vboolarr [0,+inf) [True, False] BOOL_ARR
Test.Objects[1] vbyte [0,+inf) 1 BYTE
Test.Objects[1] vbytearr [0,+inf) b'\\x01\\x02\\x03' BYTE_ARR
Test.Objects[1] vchar [0,+inf) 'A' CHAR
Test.Objects[1] vchararr [0,+inf) 'Hello' CHAR_ARR
Test.Objects[1] vint [0,+inf) 3 INT
Test.Objects[1] vintarr [0,+inf) [1, 2, 3] INT_ARR
Test.Objects[1] vlong [0,+inf) 4 LONG
Test.Objects[1] vlongarr [0,+inf) [1, 2, 3] LONG_ARR
Test.Objects[1] vobj [0,+inf) Test.Objects[1] OBJECT
Test.Objects[1] vshort [0,+inf) 2 SHORT
Test.Objects[1] vshortarr [0,+inf) [1, 2, 3] SHORT_ARR
Test.Objects[1] vstring [0,+inf) 'Hello' STRING""",
extractOutSection(out, "---GetValues---").replaceAll("\r", ""));
}
}
@Test
public void testGetValuesRng() throws Exception {
String out = runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Create Object')
ghidra_trace_create_obj('Test.Objects[1]')
ghidra_trace_insert_obj('Test.Objects[1]')
ghidra_trace_set_value('Test.Objects[1]', 'vaddr', 0xdeadbeef, 'ADDRESS')
ghidra_trace_txcommit()
print('---GetValues---')
ghidra_trace_get_values_rng(0xdeadbeef, 10)
print('---')
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
assertEquals("""
Parent Key Span Value Type
Test.Objects[1] vaddr [0,+inf) ram:deadbeef ADDRESS""",
extractOutSection(out, "---GetValues---").replaceAll("\r", ""));
}
}
@Test
public void testActivateObject() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
#set language c++
ghidra_trace_txstart('Create Object')
ghidra_trace_create_obj('Test.Objects[1]')
ghidra_trace_insert_obj('Test.Objects[1]')
ghidra_trace_txcommit()
ghidra_trace_activate('Test.Objects[1]')
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
assertSame(mdo.get(), traceManager.getCurrentTrace());
assertEquals("Test.Objects[1]",
traceManager.getCurrentObject().getCanonicalPath().toString());
}
}
@Test
public void testDisassemble() throws Exception {
String out = runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Tx')
pc = get_pc()
ghidra_trace_putmem(pc, 16)
print('---Disassemble---')
ghidra_trace_disassemble(pc)
print('---')
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
// Not concerned about specifics, so long as disassembly occurs
long total = 0;
for (CodeUnit cu : tb.trace.getCodeManager().definedUnits().get(0, true)) {
total += cu.getLength();
}
String extract = extractOutSection(out, "---Disassemble---");
String[] split = extract.split("\r\n");
// NB: core.12137 has no memory
//assertEquals("Disassembled %d bytes".formatted(total),
// split[0]);
assertEquals(0, total);
assertEquals("", split[0]);
}
}
@Test
public void testPutProcesses() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_start()
ghidra_trace_txstart('Tx')
ghidra_trace_put_processes()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject("/New Traces/drgn/noname")) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
// Would be nice to control / validate the specifics
Collection<TraceObject> processes = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(0), PathFilter.parse("Processes[]"))
.map(p -> p.getDestination(null))
.toList();
assertEquals(0, processes.size());
}
}
@Test
public void testPutEnvironment() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Tx')
ghidra_trace_put_environment()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
// Assumes drgn on Linux amd64
TraceObject envobj =
Objects.requireNonNull(tb.objAny("Processes[].Environment", Lifespan.at(0)));
assertEquals("drgn", envobj.getValue(0, "_debugger").getValue());
assertEquals("X86_64", envobj.getValue(0, "_arch").getValue());
assertEquals("Language.C", envobj.getValue(0, "_os").getValue());
assertEquals("little", envobj.getValue(0, "_endian").getValue());
}
}
@Test
public void testPutRegions() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Tx')
ghidra_trace_put_regions()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
// Would be nice to control / validate the specifics
Collection<? extends TraceMemoryRegion> all =
tb.trace.getMemoryManager().getAllRegions();
assertThat(all.size(), greaterThan(2));
}
}
@Test
public void testPutModules() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Tx')
ghidra_trace_put_modules()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
// Would be nice to control / validate the specifics
Collection<? extends TraceModule> all = tb.trace.getModuleManager().getAllModules();
TraceModule modBash =
Unique.assertOne(all.stream().filter(m -> m.getName().contains("helloWorld")));
assertNotEquals(tb.addr(0), Objects.requireNonNull(modBash.getBase()));
}
}
@Test
public void testPutThreads() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Tx')
ghidra_trace_put_threads()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
// Would be nice to control / validate the specifics
Collection<? extends TraceThread> threads = tb.trace.getThreadManager().getAllThreads();
assertEquals(1, threads.size());
}
}
@Test
public void testPutFrames() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
ghidra_trace_create()
ghidra_trace_txstart('Tx')
ghidra_trace_put_frames()
ghidra_trace_txcommit()
quit()
""".formatted(PREAMBLE, addr));
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
// Would be nice to control / validate the specifics
List<TraceObject> stack = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(0),
PathFilter.parse("Processes[0].Threads[].Stack[]"))
.map(p -> p.getDestination(null))
.toList();
assertEquals(7, stack.size());
}
}
@Test
public void testMinimal() throws Exception {
runThrowError(addr -> """
%s
ghidra_trace_connect('%s')
print('FINISHED')
quit()
""".formatted(PREAMBLE, addr));
}
}


@ -0,0 +1,286 @@
/* ###
* IP: GHIDRA
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package agent.drgn.rmi;
import static org.hamcrest.Matchers.*;
import static org.junit.Assert.*;
import java.util.*;
import org.junit.Test;
import generic.Unique;
import generic.jar.ResourceFile;
import ghidra.app.plugin.core.debug.utils.ManagedDomainObject;
import ghidra.debug.api.tracermi.RemoteMethod;
import ghidra.framework.Application;
import ghidra.program.model.address.AddressSpace;
import ghidra.program.model.lang.RegisterValue;
import ghidra.trace.database.ToyDBTraceBuilder;
import ghidra.trace.model.Lifespan;
import ghidra.trace.model.Trace;
import ghidra.trace.model.memory.TraceMemoryRegion;
import ghidra.trace.model.memory.TraceMemorySpace;
import ghidra.trace.model.modules.TraceModule;
import ghidra.trace.model.target.TraceObject;
import ghidra.trace.model.target.path.PathFilter;
import ghidra.trace.model.target.path.PathPattern;
public class DrgnMethodsTest extends AbstractDrgnTraceRmiTest {
@Test
public void testExecuteCapture() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
RemoteMethod execute = conn.getMethod("execute");
assertEquals(false, execute.parameters().get("to_string").getDefaultValue());
assertEquals("11\n",
execute.invoke(Map.of(
"cmd", "print(3+4*2)",
"to_string", true)));
}
}
@Test
public void testExecute() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
start(conn, null);
}
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
// Just confirm it's present
}
}
@Test
public void testRefreshProcesses() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
start(conn, null);
txCreate(conn, "Processes");
RemoteMethod attachCore = conn.getMethod("attach_core");
RemoteMethod refreshProcesses = conn.getMethod("refresh_processes");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject processes = Objects.requireNonNull(tb.objAny0("Processes"));
refreshProcesses.invoke(Map.of("node", processes));
List<TraceObject> list = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(getMaxSnap()), PathFilter.parse("Processes[]"))
.map(p -> p.getDestination(null))
.toList();
assertEquals(1, list.size());
ResourceFile rf = Application.getModuleDataFile("TestResources", CORE);
attachCore.invoke(Map.of("processes", processes, "core", rf.getAbsolutePath()));
refreshProcesses.invoke(Map.of("node", processes));
list = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(getMaxSnap()), PathFilter.parse("Processes[]"))
.map(p -> p.getDestination(null))
.toList();
assertEquals(2, list.size());
}
}
}
@Test
public void testRefreshEnvironment() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
String path = "Processes[].Environment";
start(conn, null);
txPut(conn, "all");
RemoteMethod refreshEnvironment = conn.getMethod("refresh_environment");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject envobj = Objects.requireNonNull(tb.objAny0(path));
refreshEnvironment.invoke(Map.of("node", envobj));
assertEquals("drgn", envobj.getValue(0, "_debugger").getValue());
assertEquals("X86_64", envobj.getValue(0, "_arch").getValue());
assertEquals("Language.C", envobj.getValue(0, "_os").getValue());
assertEquals("little", envobj.getValue(0, "_endian").getValue());
}
}
}
@Test
public void testRefreshThreads() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
String path = "Processes[].Threads";
start(conn, null);
txCreate(conn, path);
RemoteMethod refreshThreads = conn.getMethod("refresh_threads");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject threads = Objects.requireNonNull(tb.objAny0(path));
refreshThreads.invoke(Map.of("node", threads));
int listSize = tb.trace.getThreadManager().getAllThreads().size();
assertEquals(1, listSize);
}
}
}
@Test
public void testRefreshStack() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
String path = "Processes[].Threads[].Stack";
start(conn, null);
txPut(conn, "processes");
RemoteMethod refreshStack = conn.getMethod("refresh_stack");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
txPut(conn, "frames");
TraceObject stack = Objects.requireNonNull(tb.objAny0(path));
refreshStack.invoke(Map.of("node", stack));
// Would be nice to control / validate the specifics
List<TraceObject> list = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(0),
PathFilter.parse("Processes[].Threads[].Stack[]"))
.map(p -> p.getDestination(null))
.toList();
assertEquals(7, list.size());
}
}
}
@Test
public void testRefreshRegisters() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
String path = "Processes[].Threads[].Stack[].Registers";
start(conn, null);
conn.execute("ghidra_trace_txstart('Tx')");
conn.execute("ghidra_trace_putreg()");
conn.execute("ghidra_trace_delreg()");
conn.execute("ghidra_trace_txcommit()");
RemoteMethod refreshRegisters = conn.getMethod("refresh_registers");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject registers = Objects.requireNonNull(tb.objAny(path, Lifespan.at(0)));
refreshRegisters.invoke(Map.of("node", registers));
long snap = 0;
AddressSpace t1f0 = tb.trace.getBaseAddressFactory()
.getAddressSpace(registers.getCanonicalPath().toString());
TraceMemorySpace regs = tb.trace.getMemoryManager().getMemorySpace(t1f0, false);
RegisterValue rip = regs.getValue(snap, tb.reg("rip"));
assertEquals("3a40cdf7ff7f0000", rip.getUnsignedValue().toString(16));
}
}
}
@Test
public void testRefreshMappings() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
String path = "Processes[].Memory";
start(conn, null);
txCreate(conn, path);
RemoteMethod refreshMappings = conn.getMethod("refresh_mappings");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject memory = Objects.requireNonNull(tb.objAny0(path));
refreshMappings.invoke(Map.of("node", memory));
// Would be nice to control / validate the specifics
Collection<? extends TraceMemoryRegion> all =
tb.trace.getMemoryManager().getAllRegions();
assertThat(all.size(), greaterThan(2));
}
}
}
@Test
public void testRefreshModules() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
String path = "Processes[].Modules";
start(conn, null);
txCreate(conn, path);
RemoteMethod refreshModules = conn.getMethod("refresh_modules");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
TraceObject modules = Objects.requireNonNull(tb.objAny0(path));
refreshModules.invoke(Map.of("node", modules));
// Would be nice to control / validate the specifics
Collection<? extends TraceModule> all = tb.trace.getModuleManager().getAllModules();
TraceModule mod =
Unique.assertOne(all.stream().filter(m -> m.getName().contains("helloWorld")));
assertNotEquals(tb.addr(0), Objects.requireNonNull(mod.getBase()));
}
}
}
@Test
public void testActivateThread() throws Exception {
try (PythonAndConnection conn = startAndConnectDrgn()) {
start(conn, null);
txPut(conn, "processes");
RemoteMethod activateThread = conn.getMethod("activate_thread");
try (ManagedDomainObject mdo = openDomainObject(MDO)) {
tb = new ToyDBTraceBuilder((Trace) mdo.get());
txPut(conn, "threads");
PathPattern pattern =
PathFilter.parse("Processes[].Threads[]").getSingletonPattern();
List<TraceObject> list = tb.trace.getObjectManager()
.getValuePaths(Lifespan.at(0), pattern)
.map(p -> p.getDestination(null))
.toList();
assertEquals(1, list.size());
for (TraceObject t : list) {
activateThread.invoke(Map.of("thread", t));
String out = conn.executeCapture("print(util.selected_thread())").strip();
List<String> indices = pattern.matchKeys(t.getCanonicalPath(), true);
assertEquals("%s".formatted(indices.get(1)), out);
}
}
}
}
private void start(PythonAndConnection conn, String obj) {
conn.execute("from ghidradrgn.commands import *");
conn.execute("ghidra_trace_create()");
}
private void txPut(PythonAndConnection conn, String obj) {
conn.execute("ghidra_trace_txstart('Tx')");
conn.execute("ghidra_trace_put_" + obj + "()");
conn.execute("ghidra_trace_txcommit()");
}
private void txCreate(PythonAndConnection conn, String path) {
conn.execute("ghidra_trace_txstart('Fake')");
conn.execute("ghidra_trace_create_obj('%s')".formatted(path));
conn.execute("ghidra_trace_txcommit()");
}
}


@ -0,0 +1,148 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>Ghidra Debugger</title>
<style type="text/css">
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
</style>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<header id="nav"><a
class="beginner" href="A1-GettingStarted.html">Getting Started</a><a
class="beginner" href="A2-UITour.html">UI Tour</a><a
class="beginner" href="A3-Breakpoints.html">Breakpoints</a><a
class="beginner" href="A4-MachineState.html">Machine State</a><a
class="beginner" href="A5-Navigation.html">Navigation</a><a
class="beginner" href="A6-MemoryMap.html">Memory Map</a><a
class="advanced" href="B1-RemoteTargets.html">Remote Targets</a><a
class="advanced" href="B2-Emulation.html">Emulation</a><a
class="advanced" href="B3-Scripting.html">Scripting</a><a
class="advanced" href="B4-Modeling.html">Modeling</a><a
class="advanced" href="B5-AddingDebuggers.html">Adding Debuggers</a>
</header>
<header id="title-block-header">
<h1 class="title">Ghidra Debugger</h1>
</header>
<nav id="TOC">
<ul>
<li><a href="#adding-a-debugger">Adding a debugger</a><ul>
<li><a href="#debugger-documentation">Debugger documentation</a></li>
<li><a href="#anatomy-of-a-ghidra-debugger-agent">Anatomy of a Ghidra debugger agent</a></li>
<li><a href="#drgn-as-an-example">drgn as an Example</a><ul>
<li><a href="#the-first-launcher-local-drgn.sh">The first launcher — <code>local-drgn.sh</code></a></li>
<li><a href="#the-schema">The schema</a></li>
<li><a href="#the-build-logic">The build logic</a></li>
<li><a href="#the-python-files">The Python files</a></li>
<li><a href="#revisiting-the-schema">Revisiting the schema</a></li>
<li><a href="#unit-tests">Unit tests</a></li>
<li><a href="#documentation">Documentation</a></li>
<li><a href="#extended-features">Extended features</a></li>
</ul></li>
</ul></li>
</ul>
</nav>
<section id="adding-a-debugger" class="level1">
<h1>Adding a debugger</h1>
<p>This module walks you through an example of how to add a debugger agent to Ghidra. It has no exercises and is certainly not the only way to implement an agent, but it should offer some useful pointers and highlight some pitfalls that you might encounter. The example traces the implementation of an actual agent: the agent for <em>Meta</em>'s <strong>drgn</strong> debugger, which provides a scriptable, albeit read-only, interface to the running Linux kernel, as well as user-mode and core-dump targets.</p>
<section id="debugger-documentation" class="level2">
<h2>Debugger documentation</h2>
<ul>
<li>Recommended reading: <strong>drgn</strong> (<a href="https://github.com/osandov/drgn" class="uri">https://github.com/osandov/drgn</a>)</li>
<li>Also: <strong>drgn (docs)</strong> (<a href="https://drgn.readthedocs.io/en/latest" class="uri">https://drgn.readthedocs.io/en/latest</a>)</li>
</ul>
</section>
<section id="anatomy-of-a-ghidra-debugger-agent" class="level2">
<h2>Anatomy of a Ghidra debugger agent</h2>
<p>To support debugging on various platforms, the Ghidra debugger has <em>agents</em>, i.e. clients capable of receiving information from a native debugger and passing it to the Ghidra GUI. They include the <strong>dbgeng</strong> agent that supports Windows debuggers, the <strong>gdb</strong> agent for gdb on a variety of platforms, the <strong>lldb</strong> agent for macOS and Linux, and the <strong>jpda</strong> agent for Java. All but the last are written in Python 3, and all communicate with the GUI via a protobuf-based protocol described in <a href="../../../Ghidra/Debug/Debugger-rmi-trace/src/main/proto/trace-rmi.proto">Debugger-rmi-trace</a>.</p>
<p>At the highest level, each agent has four elements (ok, a somewhat arbitrary division, but…):</p>
<ul>
<li><a href="../../../Ghidra/Debug/Debugger-agent-drgn/data/debugger-launchers"><code>debugger-launchers</code></a> &ndash; A set of launchers, often a mixture of <code>.bat</code>, <code>.sh</code>, and sometimes <code>.py</code> scripts</li>
<li><a href="../../../Ghidra/Debug/Debugger-agent-drgn/src/main/py/src/ghidradrgn/schema.xml"><code>schema.xml</code></a> &ndash; An object-model schema. (While expressed in XML, this is not an “XML schema”.)</li>
<li><a href="../../../Ghidra/Debug/Debugger-agent-drgn/src/main/py/src/ghidradrgn"><code>src/ghidradrgn</code></a> &ndash; Python files for architecture, commands, hooks, methods, and common utility functions</li>
<li><a href="../../../Ghidra/Debug/Debugger-agent-drgn/build.gradle"><code>build.gradle</code></a> &ndash; Build logic</li>
</ul>
<p>Large portions of each are identical or similar across agents, so, as a general strategy, copying an existing agent and renaming all agent-specific variables, methods, etc. is not the worst plan of action. Typically, this leads to large chunks of detritus that need to be edited out late in the development process.</p>
</section>
<section id="drgn-as-an-example" class="level2">
<h2>drgn as an Example</h2>
<section id="the-first-launcher-local-drgn.sh" class="level3">
<h3>The first launcher — <code>local-drgn.sh</code></h3>
<p>The initial objective is to create a shell script that sets up the environment variables for the parameters we’ll need and invokes the target. For this project, I originally started duplicating the <strong>lldb</strong> agent and then switched to the <strong>dbgeng</strong> agent. Why? The hardest part of writing an agent is getting the initial launch pattern correct. <strong>drgn</strong> is itself written in Python. While gdb and lldb support Python as scripting languages, their cores are not Python-based. For these debuggers, the launcher runs the native debugger and instructs it to load our plugin, which is the agent. The dbgeng agent inverts this pattern, i.e. the agent is a Python application that uses the <strong>Pybag</strong> package to access the native <em>kd</em> interface over COM. <strong>drgn</strong> follows this pattern.</p>
<p>That said, a quick look at the launchers in the <strong>dbgeng</strong> project (under <a href="../../../Ghidra/Debug/Debugger-agent-dbgeng/data/debugger-launchers"><code>debugger-launchers</code></a>) shows <code>.bat</code> files, each of which calls a <code>.py</code> file in <a href="../../../Ghidra/Debug/Debugger-agent-dbgeng/data/support"><code>data/support</code></a>. As <strong>drgn</strong> is a Linux-only debugger, we need to convert the <code>.bat</code> examples to <code>.sh</code>. Luckily, the conversion is pretty simple: most line annotations use <code>#</code> in place of <code>::</code> and environment variables are referenced using <code>$VAR</code> in place of <code>%VAR%</code>.</p>
<p>The syntax of the <code>.sh</code> is typical of any <em>*nix</em> shell. In addition to the shell script, a launcher includes a metadata header to populate its menu and options dialog. Annotations include:</p>
<ul>
<li>A <code>#!</code> line for the shell invocation</li>
<li>The Ghidra license</li>
<li>A <code>#@title</code> line for the launcher name</li>
<li>A <code>#@desc</code>-annotated HTML description, as displayed in the launch dialog</li>
<li><code>#@menu-group</code> for organizing launchers</li>
<li><code>#@icon</code> for an icon</li>
<li><code>#@help</code> for the help file and anchor</li>
<li>Some number of <code>#@arg</code> variables, usually only one to name the executable image</li>
<li><code>#@args</code> specifies the remainder of the arguments, passed to a user-mode target if applicable</li>
<li>Some number of <code>#@env</code> variables referenced by the Python code</li>
</ul>
<p>While the <strong>drgn</strong> launcher does not use <code>#@arg</code> or <code>#@args</code>, there are plentiful examples in the <a href="../../../Ghidra/Debug/Debugger-agent-gdb/data/debugger-launchers"><strong>gdb</strong> project</a>. The <code>#@env</code> lines are composed of the variable name (usually in caps), its type, default value, a label for the dialog if the user needs to be queried, and a description. The syntax looks like:</p>
<ul>
<li><code>#@env</code> <em>Name</em> <code>:</code> <em>Type</em> [ <code>!</code> ] <code>=</code> <em>DefaultValue</em> <em>Label</em> <em>Description</em></li>
</ul>
<p>where <code>!</code>, if present, indicates the option is required.</p>
<p>For <strong>drgn</strong>, invoking the <code>drgn</code> command directly saves us a lot of the work involved in getting the environment correct. We pass it our Python launcher <code>local-drgn.py</code> rather than allowing it to call <code>run_interactive</code>, which does not return. Our script creates an instance of <code>prog</code> based on the parameters, completes the Ghidra-specific initialization, and calls <code>run_interactive(prog)</code> itself.</p>
<p>The Python script needs to do the setup work for Ghidra and for <strong>drgn</strong>. A good start is to try to implement a script that calls the methods for <code>connect</code>, <code>create</code>, and <code>start</code>, with <code>create</code> doing as little as possible initially. This should allow you to work the kinks out of <code>arch.py</code> and <code>util.py</code>.</p>
<p>For this particular target, there are some interesting wrinkles surrounding the use of <code>sudo</code> (required for most targets) which complicate where wheels are installed (i.e. it is pretty easy to accidentally mix user-local and system <code>site-packages</code>). Additionally, the <code>-E</code> parameter is required to ensure that the environment variables we defined get passed to the root environment. In the cases where we use <code>sudo</code>, the first message printed in the interactive shell will be the request for the user’s password.</p>
</section>
<section id="the-schema" class="level3">
<h3>The schema</h3>
<p>The schema, specified in <code>schema.xml</code>, provides a basic structure for Ghidra’s <strong>Model</strong> View and allows Ghidra to identify and locate various interfaces that are used to populate the GUI. For example, the <em>Memory</em> interface identifies the container for items with the interface <em>MemoryRegion</em>, which provide information used to fill the <strong>Memory</strong> View. Among the important interfaces are <em>Process</em>, <em>Thread</em>, <em>Frame</em>, <em>Register</em>, <em>MemoryRegion</em>, <em>Module</em>, and <em>Section</em>. These interfaces are “built into” Ghidra so that it can identify which objects provide specific information and commands.</p>
<p>For the purposes of getting started, it’s easiest to clone the <strong>dbgeng</strong> schema and modify it as needed. Again, this will require substantial cleanup later on, but, as schema errors are frequently subtle and hard to identify, revisiting is probably the better approach. <code>MANIFEST.in</code> should be modified to reflect the schema’s path.</p>
</section>
<section id="the-build-logic" class="level3">
<h3>The build logic</h3>
<p>Similarly, <code>build.gradle</code> can essentially be cloned from <strong>dbgeng</strong>, with the appropriate change to <code>eclipse.project.name</code>. For the most part, you need only apply the <code>distributableGhidraModule.gradle</code> and <code>hasPythonPackage.gradle</code> scripts. If further customization is needed, consult other examples in the Ghidra project and Gradle’s documentation.</p>
<p>Not perhaps directly a build logic item, but <code>pyproject.toml</code> should be modified to reflect the agent’s version number (by convention, Ghidra’s version number).</p>
</section>
<section id="the-python-files" class="level3">
<h3>The Python files</h3>
<p>At this point, we can start actually implementing the <strong>drgn</strong> agent. <code>arch.py</code> is usually a good starting point, as much of the initial logic depends on it. For <code>arch.py</code>, the hard bit is knowing what maps to what. The <code>language_map</code> converts the debugger’s self-reported architecture to Ghidra’s language set. Ghidra’s languages are mapped to a set of language-to-compiler maps, which are then used to map the debugger’s self-reported language to Ghidra’s compiler. Certain combinations are not allowed because Ghidra has no concept of that language-compiler combination. For example, x86 languages never map to <code>default</code>. Hence, the need for an <code>x86_compiler_map</code>, which defaults to something else (in this case, <code>gcc</code>).</p>
<p>After <code>arch.py</code>, a first pass at <code>util.py</code> is probably warranted. In particular, the version info is used early in the startup process. A lot of this code is not relevant to our current project, but at a minimum we want to implement (or fake out) methods such as <code>selected_process</code>, <code>selected_thread</code>, and <code>selected_frame</code>. In this example, there probably won’t be more than one session or one process. Ultimately, we’ll have to decide whether we even want <em>Session</em> in the schema. For now, we’re defaulting session and process to 0, and thread to 1, as 0 is invalid for debugging the kernel. (Later, it becomes obvious that the attached pid and <code>prog.main_thread().tid</code> make sense for user-mode debugging, and <code>prog.crashed_thread().tid</code> makes sense for crash dump debugging.)</p>
<p>With <code>arch.py</code> and <code>util.py</code> good to a first approximation, we would normally start implementing <code>put</code> methods in <code>commands.py</code> for various objects in the <strong>Model</strong> View, starting at the root of the tree and descending through the children. Again, <em>Session</em> and <em>Process</em> are rather poorly-defined, so we skip them (leaving one each) and tackle <em>Threads</em>. Typically, for each iterator in the debugger API, two commands get implemented: one internal method that does the actual work, e.g. <code>put_threads()</code>, and one invokable method that wraps this method in a (potentially batched) transaction, e.g. <code>ghidra_trace_put_threads()</code>. The internal methods are meant to be called by other Python code, with the caller assumed to be responsible for setting up the transaction. The <code>ghidra_trace</code>-prefixed methods are meant to be part of the custom CLI command set which the user can invoke and therefore should set up the transaction. The internal method typically creates the path to the container using patterns for the container, individual keys, and the combination, e.g. <code>THREADS_PATTERN</code>, <code>THREAD_KEY_PATTERN</code>, and <code>THREAD_PATTERN</code>. Patterns are built up from other patterns, going back to the root. A trace object corresponding to the debugger object is created from the path and inserted into the trace database.</p>
<p>Once this code has been tested, attributes of the object can be added to the base object using <code>set_value</code>. Attributes that are not primitives can be added using the pattern create-populate-insert, i.e. we call <code>create_object</code> with extensions to the path, populate the object’s children, and call <code>insert</code> with the created object. In many cases (particularly when populating an object’s children is expensive), you may want to defer the populate step, effectively creating a placeholder that can be populated on-demand. The downside of this approach, of course, is that <em>refresh</em> methods must be added to populate those nodes.</p>
<p>As an aside, it’s probably worth noting the function of <code>create_object</code> and <code>insert</code>. Objects in the trace are maintained in a directory tree, with links (and backlinks) allowed, whose visible manifestation is the <strong>Model</strong> View. As such, operations on the tree follow the normal procedure for operations on a graph. <code>create_object</code> creates a node but not any edges, not even the implied (“canonical”) edge from parent to child. <code>insert</code> creates the canonical edge. Until that edge exists, the object is not considered to be “alive”, so the lifespan of the edge effectively encodes the object’s life. Following the create-populate-insert pattern minimizes the number of events that need to be processed.</p>
<p>Having completed a single command, we can proceed in one of two directions — we can continue implementing commands for other objects in the tree, or we can implement matching <em>refresh</em> methods in <code>methods.py</code> for the completed object. <code>methods.py</code> also requires patterns which are used to match a path to a trace object, usually via <code>find_x_by_pattern</code> methods. The <code>refresh</code> methods may or may not rely on the <code>find_by</code> methods depending on whether the matching command needs parameters. For example, we may want to assume the <code>selected_thread</code> matches the current object in the view, in which case it can be used to locate that node, or we may want to force the method to match on the node if the trace object can be easily matched to the debugger object, or we may want to use the node to set <code>selected_thread</code>.</p>
<p>The concept of focus in the debugger is fairly complicated and a frequent source of confusion. In general, we use <em>selected</em> to represent the GUI’s current focus, typically the node in the <strong>Model</strong> or associated views which the user has selected. In some sense, it represents the process, thread, or frame the user is interested in. It also may differ from the <em>highlighted</em> node, chosen by a single-click (versus a double-click which sets the <em>selection</em>). By contrast, the native debugger has its own idea of focus, which we usually describe as <em>current</em>. (This concept is itself complicated by distinctions between the <em>event</em> object, e.g. which thread the debugger broke on, and the <em>current</em> object, e.g. which thread is being inspected.) <em>Current</em> values are pushed “up” to Ghidra’s GUI from the native debugger; <em>selected</em> values are pushed “down” to the native debugger from Ghidra. To the extent possible, it makes sense to synchronize these values. In other words, in most cases, a new <em>selection</em> should force a change in the set of <em>current</em> objects, and an event signaling a change in the <em>current</em> object should alter the GUI’s set of <em>selected</em> objects. (Of course, care needs to be taken not to make this a round-trip cycle.)</p>
<p><code>refresh</code> methods (and others) are often annotated in several ways. The <code>@REGISTRY.method</code> annotation makes the method available to the GUI. It specifies the <code>action</code> to be taken and the <code>display</code> that appears in the GUI pop-up menu. <em>Actions</em> may be purely descriptive or may correspond to built-in actions taken by the GUI, e.g. <code>refresh</code> and many of the control methods, such as <code>step_into</code>. Parameters for the methods may be annotated with <code>sch.Schema</code> (conventionally on the first parameter) to indicate the nodes to which the method applies, and with <code>ParamDesc</code> to describe the parameter’s type and label for pop-up dialogs. After retrieving necessary parameters, <code>refresh</code> methods invoke methods from <code>commands.py</code> wrapped in a transaction.</p>
<p>For <strong>drgn</strong>, we implemented <code>put</code>/<code>refresh</code> methods for threads, frames, registers (<code>putreg</code>), and local variables, then modules and sections, memory and regions, the environment, and finally processes. We also implemented <code>putmem</code> using <strong>drgn</strong>’s <code>read</code> API. <em>Symbols</em> was another possibility, but, for the moment, populating symbols seemed too expensive. Instead, <code>retrieve_symbols</code> was added to allow per-pattern symbols to be added. Unfortunately, the <strong>drgn</strong> API doesn’t support wildcards, so eventually some other strategy will be necessary.</p>
<p>The remaining set of Python functions, <code>hooks.py</code>, comprises callbacks for various events sent by the native debugger. The current <strong>drgn</strong> code has no event system. A set of skeletal methods has been left in place as (a) we can use the single-step button as a stand-in for “update state”, and (b) some discussion exists in the <strong>drgn</strong> user forums regarding eventually implementing more control functionality. For anyone implementing <code>hooks.py</code>, the challenging logic resides in the event loop, particularly if there is a need to move back-and-forth between the debugger and a <em>repl</em>. Also, distinctions need to be made between control commands, which wait for events, and commands which rely on a callback but complete immediately. As a rule of thumb, we <em>push</em> to Ghidra, i.e. Ghidra issues requests asynchronously and the agent must update the trace database.</p>
</section>
<section id="revisiting-the-schema" class="level3">
<h3>Revisiting the schema</h3>
<p>At this point, revisiting and editing the schema may be called for. For example, for <strong>drgn</strong>, it’s not obvious that there can ever be more than one session, so it may be cleaner to embed <em>Processes</em> at the root. This, in turn, requires editing the <code>commands.py</code> and <code>methods.py</code> patterns. Similarly, as breakpoints are not supported, the breakpoint-related entries may safely be deleted.</p>
<p>In general, the schema can be structured however you like, but there are several details worth mentioning. Interfaces generally need to be respected for various functions in the GUI to work. Process, thread, frame, module, section, and memory elements can be named arbitrarily, but their interfaces must be named correctly. Additionally, the logic for finding objects in the tree is quite complicated. If elements need be traversed as part of the default search process, their containers must be tagged <code>canonical</code>. If attributes need to be traversed, their parents should have the interface <code>Aggregate</code>.</p>
<p>Each entry may have <code>elements</code> of the same type ordered by keys, and <code>attributes</code> of arbitrary type. The <code>element</code> entry describes the schema for all elements; the schema for attributes may be given explicitly using named <code>attribute</code> entries or defaulted using the unnamed <code>attribute</code> entry, typically <code>&lt;attribute schema="VOID"&gt;</code> or <code>&lt;attribute schema="ANY"&gt;</code>. The schema for any element in the <strong>Model</strong> View is visible using the hover, which helps substantially when trying to identify schema traversal errors.</p>
<p>Schema entries may be marked <code>hidden=yes</code> with the obvious result. Additionally, certain attribute names and schema have special properties. For example, <code>_display</code> defines the visible ID for an entry in the <strong>Model</strong> tree, and <code>ADDRESS</code> and <code>RANGE</code> mark attributes which are navigable.</p>
</section>
<section id="unit-tests" class="level3">
<h3>Unit tests</h3>
<p>The hardest part of writing unit tests is almost always getting the first test to run, and the easiest unit tests, as with the Python files, are those for <code>commands.py</code>. For <strong>drgn</strong>, as before, we’re using <strong>dbgeng</strong> as the pattern, but several elements had to be changed. Because the launchers execute a script, we need to amend the <code>runThrowError</code> logic (and, more specifically, the <code>execInPython</code> logic) in <a href="../../../Ghidra/Test/DebuggerIntegrationTest/src/test.slow/java/agent/drgn/rmi/AbstractDrgnTraceRmiTest.java"><code>AbstractDrgnTraceRmiTest</code></a> with a <code>ProcessBuilder</code> call that takes a script, rather than writing the script to stdin. While there, we can also trim out the unnecessary helper logic around items like breakpoints, watchpoints, etc. from all of the test classes.</p>
<p>JUnits for <code>methods.py</code> follow a similar pattern, but, again, getting the first one to run is often the most difficult. For <strong>drgn</strong>, we’ve had to override the timeouts in <code>waitForPass</code> and <code>waitForCondition</code>. After starting with hardcoded paths for the test target, we also had to add logic to re-write the <code>PREAMBLE</code> on-the-fly in <code>execInDrgn</code>. Obviously, with no real <code>hooks.py</code> logic, there’s no need for <code>DrgnHooksTest</code>.</p>
<p>Of note, we’ve used the gdb <code>gcore</code> command to create a core dump for the tests. Both user- and kernel-mode require privileges to run the debugger, and, for testing, that’s not ideal.</p>
</section>
<section id="documentation" class="level3">
<h3>Documentation</h3>
<p>The principal piece of documentation for all new debuggers is a description of the launchers. Right now, the <a href="../../../Ghidra/Debug/Debugger-rmi-trace/src/main/help/help/topics/TraceRmiConnectionManagerPlugin/TraceRmiLauncherServicePlugin.html"><code>TraceRmiLauncherServicePlugin.html</code></a> file in <code>Debug/Debugger-rmi-trace</code> contains all of this information. Detail to note: the <code>#@help</code> locations in the launchers themselves ought to match the HTML tags in the file, as should the launcher names.</p>
</section>
<section id="extended-features" class="level3">
<h3>Extended features</h3>
<p>Once everything else is done, it may be worth considering additional functionality specific to the debugger. This can be made available in either <code>commands.py</code> or <code>methods.py</code>. For <strong>drgn</strong>, we’ve added <code>attach</code> methods that allow the user to attach to additional programs.</p>
</section>
</section>
</section>
</body>
</html>

View file

@ -0,0 +1,224 @@
# Adding a debugger
This module walks you through an example of how to add a debugger agent to Ghidra.
It has no exercises and is certainly not the only way to implement an agent, but it should offer some useful pointers and highlight some pitfalls that you might encounter.
The example traces the implementation of an actual agent &mdash; the agent for *Meta*'s **drgn** debugger, which provides a scriptable, albeit read-only, interface to the running Linux kernel, as well as user-mode and core-dump targets.
## Debugger documentation
- Recommended reading: **drgn** (<https://github.com/osandov/drgn>)
- Also: **drgn (docs)** (<https://drgn.readthedocs.io/en/latest>)
## Anatomy of a Ghidra debugger agent
To support debugging on various platforms, the Ghidra debugger has *agents*, i.e. clients capable of receiving information from a native debugger and passing it to the Ghidra GUI.
They include the **dbgeng** agent that supports Windows debuggers, the **gdb** agent for gdb on a variety of platforms, the **lldb** agent for macOS and Linux, and the **jpda** agent for Java.
All but the last are written in Python 3, and all communicate with the GUI via a protobuf-based protocol described in [Debugger-rmi-trace](../../../Ghidra/Debug/Debugger-rmi-trace/src/main/proto/trace-rmi.proto).
At the highest level, each agent has four elements (ok, a somewhat arbitrary division, but...):
* [`debugger-launchers`](../../../Ghidra/Debug/Debugger-agent-drgn/data/debugger-launchers) &ndash; A set of launchers, often a mixture of `.bat`, `.sh`, and sometimes `.py` scripts
* [`schema.xml`](../../../Ghidra/Debug/Debugger-agent-drgn/src/main/py/src/ghidradrgn/schema.xml) &ndash; An object-model schema. (While expressed in XML, this is not an "XML schema".)
* [`src/ghidradrgn`](../../../Ghidra/Debug/Debugger-agent-drgn/src/main/py/src/ghidradrgn) &ndash; Python files for architecture, commands, hooks, methods, and common utility functions
* [`build.gradle`](../../../Ghidra/Debug/Debugger-agent-drgn/build.gradle) &ndash; Build logic
Large portions of each are identical or similar across agents, so, as a general strategy, copying an existing agent and renaming all agent-specific variables, methods, etc. is not the worst plan of action. Typically, this leads to large chunks of detritus that need to be edited out late in the development process.
## drgn as an Example
### The first launcher &mdash; `local-drgn.sh`
The initial objective is to create a shell script that sets up the environment variables for the parameters we'll need and invokes the target.
For this project, I originally started duplicating the **lldb** agent and then switched to the **dbgeng** agent.
Why? The hardest part of writing an agent is getting the initial launch pattern correct.
**drgn** is itself written in Python.
While gdb and lldb support Python as scripting languages, their cores are not Python-based.
For these debuggers, the launcher runs the native debugger and instructs it to load our plugin, which is the agent.
The dbgeng agent inverts this pattern, i.e. the agent is a Python application that uses the **Pybag** package to access the native *kd* interface over COM.
**drgn** follows this pattern.
That said, a quick look at the launchers in the **dbgeng** project (under [`debugger-launchers`](../../../Ghidra/Debug/Debugger-agent-dbgeng/data/debugger-launchers)) shows `.bat` files, each of which calls a `.py` file in [`data/support`](../../../Ghidra/Debug/Debugger-agent-dbgeng/data/support).
As **drgn** is a Linux-only debugger, we need to convert the `.bat` examples to `.sh`.
Luckily, the conversion is pretty simple: most line annotations use `#` in place of `::` and environment variables are referenced using `$VAR` in place of `%VAR%`.
The syntax of the `.sh` is typical of any *\*nix* shell.
In addition to the shell script, a launcher includes a metadata header to populate its menu and options dialog.
Annotations include:
* A `#!` line for the shell invocation
* The Ghidra license
* A `#@title` line for the launcher name
* A `#@desc`-annotated HTML description, as displayed in the launch dialog
* `#@menu-group` for organizing launchers
* `#@icon` for an icon
* `#@help` for the help file and anchor
* Some number of `#@arg` variables, usually only one to name the executable image
* `#@args` specifies the remainder of the arguments, passed to a user-mode target if applicable
* Some number of `#@env` variables referenced by the Python code
While the **drgn** launcher does not use `#@arg` or `#@args`, there are plentiful examples in the [**gdb** project](../../../Ghidra/Debug/Debugger-agent-gdb/data/debugger-launchers).
The `#@env` lines are composed of the variable name (usually in caps), its type, default value, a label for the dialog if the user needs to be queried, and a description.
The syntax looks like:
* `#@env` *Name* `:` *Type* [ `!` ] `=` *DefaultValue* *Label* *Description*
where `!`, if present, indicates the option is required.
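To make the `#@env` syntax concrete, here is a minimal Python sketch of how such a line could be split into its fields. It is not Ghidra's actual parser: the `EnvOption` type, the `parse_env_line` helper, and the option names in the usage lines are hypothetical, and it assumes the default value, label, and description are each written as shell-style quoted strings.

```python
import re
import shlex
from typing import NamedTuple


class EnvOption(NamedTuple):
    """Fields of one #@env line (hypothetical container, for illustration)."""
    name: str
    type: str
    required: bool
    default: str
    label: str
    description: str


# Name ':' Type ['!'] '=' followed by the rest of the line
ENV_RE = re.compile(r'^#@env\s+(\w+)\s*:\s*([\w.]+)\s*(!)?\s*=\s*(.*)$')


def parse_env_line(line: str) -> EnvOption:
    """Split an #@env line into name, type, required flag, default, label, description."""
    m = ENV_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an #@env line: {line!r}")
    name, type_, bang, rest = m.groups()
    # Assumption: default, label, and description are quoted shell-style.
    parts = shlex.split(rest)
    if len(parts) != 3:
        raise ValueError(f"expected default, label, and description in: {line!r}")
    default, label, description = parts
    return EnvOption(name, type_, bang is not None, default, label, description)


# Usage with made-up option names:
kind = parse_env_line('#@env OPT_EXAMPLE_KIND:str="kernel" "Target kind" "Kernel, core dump, or user-mode"')
image = parse_env_line('#@env OPT_EXAMPLE_IMG:file!="" "Image" "The target image"')
print(kind.name, kind.default, image.required)
```

The `!` marker ends up as a boolean, which is all a launch dialog would need in order to decide whether to insist on a value.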
For **drgn**, invoking the `drgn` command directly saves us a lot of the work involved in getting the environment correct.
We pass it our Python launcher `local-drgn.py` rather than allowing it to call `run_interactive`, which does not return.
Our script creates an instance of `prog` based on the parameters, completes the Ghidra-specific initialization, and calls `run_interactive(prog)` itself.
The Python script needs to do the setup work for Ghidra and for **drgn**.
A good start is to try to implement a script that calls the methods for `connect`, `create`, and `start`, with `create` doing as little as possible initially.
This should allow you to work the kinks out of `arch.py` and `util.py`.
For this particular target, there are some interesting wrinkles surrounding the use of `sudo` (required for most targets) which complicate where wheels are installed (i.e. it is pretty easy to accidentally mix user-local and system `site-packages`).
Additionally, the `-E` parameter is required to ensure that the environment variables we defined get passed to the root environment.
In the cases where we use `sudo`, the first message printed in the interactive shell will be the request for the user's password.
### The schema
The schema, specified in `schema.xml`, provides a basic structure for Ghidra's **Model** View and allows Ghidra to identify and locate various interfaces that are used to populate the GUI.
For example, the *Memory* interface identifies the container for items with the interface *MemoryRegion*, which provide information used to fill the **Memory** View.
Among the important interfaces are *Process*, *Thread*, *Frame*, *Register*, *MemoryRegion*, *Module*, and *Section*.
These interfaces are "built into" Ghidra so that it can identify which objects provide specific information and commands.
For the purposes of getting started, it's easiest to clone the **dbgeng** schema and modify it as needed.
Again, this will require substantial cleanup later on, but, as schema errors are frequently subtle and hard to identify, revisiting is probably the better approach.
`MANIFEST.in` should be modified to reflect the schema's path.
### The build logic
Similarly, `build.gradle` can essentially be cloned from **dbgeng**, with the appropriate change to `eclipse.project.name`.
For the most part, you need only apply the `distributableGhidraModule.gradle` and `hasPythonPackage.gradle` scripts.
If further customization is needed, consult other examples in the Ghidra project and Gradle's documentation.
Not perhaps directly a build logic item, but `pyproject.toml` should be modified to reflect the agent's version number (by convention, Ghidra's version number).
### The Python files
At this point, we can start actually implementing the **drgn** agent.
`arch.py` is usually a good starting point, as much of the initial logic depends on it.
For `arch.py`, the hard bit is knowing what maps to what.
The `language_map` converts the debugger's self-reported architecture to Ghidra's language set.
Ghidra's languages are mapped to a set of language-to-compiler maps, which are then used to map the debugger's self-reported language to Ghidra's compiler.
Certain combinations are not allowed because Ghidra has no concept of that language-compiler combination.
For example, x86 languages never map to `default`.
Hence, the need for an `x86_compiler_map`, which defaults to something else (in this case, `gcc`).
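The two-stage mapping can be sketched as follows. This is an illustration of the idea, not the contents of the real `arch.py`: the map entries, the `DATA:LE:64:default` fallback, and the two helper function names are assumptions chosen for the example.

```python
# Debugger-reported architecture -> Ghidra language ID (illustrative entries only)
language_map = {
    'x86_64': 'x86:LE:64:default',
    'i386': 'x86:LE:32:default',
    'aarch64': 'AARCH64:LE:64:v8A',
}

# Debugger-reported OS/ABI -> Ghidra compiler spec ID
default_compiler_map = {
    'linux': 'default',
}

# x86 languages have no 'default' compiler spec, so they get their own map
x86_compiler_map = {
    'linux': 'gcc',
    'windows': 'windows',
}


def compute_ghidra_language(arch: str) -> str:
    """Map the debugger's architecture name to a Ghidra language ID."""
    # Assumption: fall back to a generic data language when the arch is unknown.
    return language_map.get(arch.lower(), 'DATA:LE:64:default')


def compute_ghidra_compiler(language: str, osabi: str) -> str:
    """Pick a compiler spec that is valid for the given language."""
    if language.startswith('x86:'):
        # Never return 'default' for x86; fall back to gcc instead.
        return x86_compiler_map.get(osabi, 'gcc')
    return default_compiler_map.get(osabi, 'default')


lang = compute_ghidra_language('x86_64')
print(lang, compute_ghidra_compiler(lang, 'unknown-os'))
```

Keeping the x86 fallback inside the compiler lookup means an unrecognized OS/ABI string still produces a language-compiler pair that Ghidra accepts.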
After `arch.py`, a first pass at `util.py` is probably warranted.
In particular, the version info is used early in the startup process.
A lot of this code is not relevant to our current project, but at a minimum we want to implement (or fake out) methods such as `selected_process`, `selected_thread`, and `selected_frame`.
In this example, there probably won't be more than one session or one process.
Ultimately, we'll have to decide whether we even want *Session* in the schema.
For now, we're defaulting session and process to 0, and thread to 1, as 0 is invalid for debugging the kernel.
(Later, it becomes obvious that the attached pid and `prog.main_thread().tid` make sense for user-mode debugging, and `prog.crashed_thread().tid` makes sense for crash dump debugging.)
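The choice of a default thread described above might be sketched like this. The `default_thread_id` helper and its `target_kind` strings are hypothetical; only `prog.main_thread().tid` and `prog.crashed_thread().tid` come from the text. A stub stands in for the **drgn** program object so the sketch runs without **drgn** installed.

```python
from types import SimpleNamespace

KERNEL_DEFAULT_TID = 1  # 0 is invalid when debugging the kernel


def default_thread_id(prog, target_kind: str) -> int:
    """Pick the thread the agent should treat as selected at startup.

    target_kind is a hypothetical discriminator: 'kernel', 'coredump', or 'user'.
    """
    if target_kind == 'kernel':
        return KERNEL_DEFAULT_TID
    if target_kind == 'coredump':
        return prog.crashed_thread().tid
    return prog.main_thread().tid


# Stub with the two calls named in the text; the tids are made up.
stub_prog = SimpleNamespace(
    main_thread=lambda: SimpleNamespace(tid=4242),
    crashed_thread=lambda: SimpleNamespace(tid=4243),
)
for kind in ('kernel', 'coredump', 'user'):
    print(kind, default_thread_id(stub_prog, kind))
```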
With `arch.py` and `util.py` good to a first approximation, we would normally start implementing `put` methods in `commands.py` for various objects in the **Model** View, starting at the root of the tree and descending through the children.
Again, *Session* and *Process* are rather poorly defined, so we skip them (leaving one each) and tackle *Threads*.
Typically, for each iterator in the debugger API, two commands get implemented: one internal method that does the actual work, e.g. `put_threads()`, and one invokable method that wraps this method in a (potentially batched) transaction, e.g. `ghidra_trace_put_threads()`.
The internal methods are meant to be called by other Python code, with the caller assumed to be responsible for setting up the transaction.
The `ghidra_trace`-prefixed methods are meant to be part of the custom CLI command set which the user can invoke and therefore should set up the transaction.
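
The division of labor between the two kinds of method can be sketched with a toy trace. `FakeTrace` and its `open_tx` and `put` methods are hypothetical stand-ins written for this example, not the Trace RMI client API, and the path format is illustrative.

```python
from contextlib import contextmanager


class FakeTrace:
    """Hypothetical stand-in for the trace; it only records what happens."""

    def __init__(self):
        self.log = []
        self.tx_depth = 0

    @contextmanager
    def open_tx(self, description):
        self.tx_depth += 1
        self.log.append(f'begin {description}')
        try:
            yield
        finally:
            self.log.append(f'end {description}')
            self.tx_depth -= 1

    def put(self, path):
        if self.tx_depth == 0:
            raise RuntimeError('writes require an open transaction')
        self.log.append(f'put {path}')


TRACE = FakeTrace()


def put_threads(tids):
    """Internal method: does the work; the caller owns the transaction."""
    for tid in tids:
        TRACE.put(f'Processes[0].Threads[{tid}]')


def ghidra_trace_put_threads(tids):
    """Invokable command: sets up the transaction, then delegates."""
    with TRACE.open_tx('ghidra_trace_put_threads'):
        put_threads(tids)
```

Because `put_threads` never opens a transaction itself, other internal code can call it several times inside one enclosing transaction without nesting.
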
The internal method typically creates the path to the container using patterns for the container, individual keys, and the combination, e.g. `THREADS_PATTERN`, `THREAD_KEY_PATTERN`, and `THREAD_PATTERN`.
Patterns are built up from other patterns, going back to the root.
A trace object corresponding to the debugger object is created from the path and inserted into the trace database.
Once this code has been tested, attributes of the object can be added to the base object using `set_value`.
Attributes that are not primitives can be added using the pattern create-populate-insert, i.e. we call `create_object` with extensions to the path, populate the object's children, and call `insert` with the created object.
In many cases (particularly when populating an object's children is expensive), you may want to defer the populate step, effectively creating a placeholder that can be populated on-demand.
The downside of this approach, of course, is that *refresh* methods must be added to populate those nodes.
As an aside, it's probably worth noting the function of `create_object` and `insert`.
Objects in the trace are maintained in a directory tree, with links (and backlinks) allowed, whose visible manifestation is the **Model** View.
As such, operations on the tree follow the normal procedure for operations on a graph.
`create_object` creates a node but not any edges, not even the implied ("canonical") edge from parent to child.
`insert` creates the canonical edge.
Until that edge exists, the object is not considered to be "alive", so the lifespan of the edge effectively encodes the object's life.
Following the create-populate-insert pattern minimizes the number of events that need to be processed.
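
As a rough illustration of how patterns compose and how create-populate-insert behaves, consider the toy model below. `ToyTree` and `ToyObject` are hypothetical classes that only mimic the `create_object`, `set_value`, and `insert` calls named above, and the pattern strings are illustrative rather than copied from any agent.

```python
# Patterns are built up from other patterns, going back to the root.
PROCESSES_PATTERN = 'Processes'
PROCESS_KEY_PATTERN = '[{procnum}]'
PROCESS_PATTERN = PROCESSES_PATTERN + PROCESS_KEY_PATTERN
THREADS_PATTERN = PROCESS_PATTERN + '.Threads'
THREAD_KEY_PATTERN = '[{tnum}]'
THREAD_PATTERN = THREADS_PATTERN + THREAD_KEY_PATTERN


class ToyObject:
    """Hypothetical node: holds attribute values for one path."""

    def __init__(self, tree, path):
        self.tree = tree
        self.path = path
        self.values = {}

    def set_value(self, key, value):
        self.values[key] = value

    def insert(self):
        # Creating the canonical edge is what makes the object "alive".
        self.tree.alive.add(self.path)


class ToyTree:
    """Hypothetical tree: nodes exist separately from their canonical edges."""

    def __init__(self):
        self.nodes = {}
        self.alive = set()

    def create_object(self, path):
        # Creates (or finds) the node only; no edge is created here.
        if path not in self.nodes:
            self.nodes[path] = ToyObject(self, path)
        return self.nodes[path]


def put_thread(tree, procnum, tnum, name):
    path = THREAD_PATTERN.format(procnum=procnum, tnum=tnum)
    tobj = tree.create_object(path)   # create
    tobj.set_value('Name', name)      # populate
    tobj.insert()                     # insert
    return tobj
```

In this model, an object that has been created but not inserted sits in `nodes` without appearing in `alive`, which mirrors the idea that the canonical edge encodes the object's life.
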
Having completed a single command, we can proceed in one of two directions &mdash; we can continue implementing commands for other objects in the tree, or we can implement matching *refresh* methods in `methods.py` for the completed object.
`methods.py` also requires patterns which are used to match a path to a trace object, usually via `find_x_by_pattern` methods.
The `refresh` methods may or may not rely on the `find_by` methods depending on whether the matching command needs parameters.
For example, we may want to assume the `selected_thread` matches the current object in the view, in which case it can be used to locate that node, or we may want to force the method to match on the node if the trace object can be easily matched to the debugger object, or we may want to use the node to set `selected_thread`.
The concept of focus in the debugger is fairly complicated and a frequent source of confusion.
In general, we use *selected* to represent the GUI's current focus, typically the node in the **Model** or associated views which the user has selected.
In some sense, it represents the process, thread, or frame the user is interested in.
It also may differ from the *highlighted* node, chosen by a single-click (versus a double-click which sets the *selection*).
By contrast, the native debugger has its own idea of focus, which we usually describe as *current*.
(This concept is itself complicated by distinctions between the *event* object, e.g. which thread the debugger broke on, and the *current* object, e.g. which thread is being inspected.)
*Current* values are pushed "up" to Ghidra's GUI from the native debugger; *selected* values are pushed "down" to the native debugger from Ghidra.
To the extent possible, it makes sense to synchronize these values.
In other words, in most cases, a new *selection* should force a change in the set of *current* objects, and an event signaling a change in the *current* object should alter the GUI's set of *selected* objects.
(Of course, care needs to be taken not to make this a round-trip cycle.)
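
One way to avoid the round trip is a simple re-entrancy guard, sketched below. The `FocusSync` class and its method names are hypothetical; a real agent would hook these into the GUI's activation method and the debugger's events.

```python
class FocusSync:
    """Hypothetical sketch: keeps *selected* and *current* in step."""

    def __init__(self):
        self.current = None     # the native debugger's focus
        self.selected = None    # the GUI's focus
        self.events = []        # record of pushes, for illustration
        self._syncing = False   # guard against a round-trip cycle

    def on_gui_selection(self, thread):
        """A selection is pushed "down" from Ghidra."""
        self.selected = thread
        if self._syncing:
            return  # this call is only the echo of our own push
        self._syncing = True
        try:
            self._set_debugger_current(thread)
        finally:
            self._syncing = False

    def on_debugger_event(self, thread):
        """A change of current object is pushed "up" from the debugger."""
        self.current = thread
        if self._syncing:
            return  # this call is only the echo of our own push
        self._syncing = True
        try:
            self._set_gui_selection(thread)
        finally:
            self._syncing = False

    def _set_debugger_current(self, thread):
        self.events.append(('debugger', thread))
        # Simulate the debugger reporting the change back as an event.
        self.on_debugger_event(thread)

    def _set_gui_selection(self, thread):
        self.events.append(('gui', thread))
        # Simulate the GUI reporting the change back as a selection.
        self.on_gui_selection(thread)
```

Each change is pushed across exactly once: the echo still updates the local value, but the guard stops it from being pushed back again.
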
`refresh` methods (and others) are often annotated in several ways.
The `@REGISTRY.method` annotation makes the method available to the GUI.
It specifies the `action` to be taken and the `display` that appears in the GUI pop-up menu.
*Actions* may be purely descriptive or may correspond to built-in actions taken by the GUI, e.g. `refresh` and many of the control methods, such as `step_into`.
Parameters for the methods may be annotated with `sch.Schema` (conventionally on the first parameter) to indicate the nodes to which the method applies, and with `ParamDesc` to describe the parameter's type and label for pop-up dialogs.
After retrieving necessary parameters, `refresh` methods invoke methods from `commands.py` wrapped in a transaction.
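
The shape of an annotated `refresh` method is sketched below. To keep the example self-contained, `MethodRegistry`, `Schema`, and the `calls` list are local stand-ins defined here; they imitate the annotations described above but are not the Trace RMI classes, and the schema name `ThreadContainer` is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Local stand-in for the schema annotation on the first parameter."""
    name: str


class MethodRegistry:
    """Local stand-in that records what a registry would advertise."""

    def __init__(self):
        self.methods = {}

    def method(self, action=None, display=None):
        def decorator(func):
            first = next(iter(func.__annotations__.values()), None)
            self.methods[func.__name__] = {
                'action': action or func.__name__,
                'display': display or func.__name__,
                'applies_to': getattr(first, 'name', None),
                'func': func,
            }
            return func
        return decorator


REGISTRY = MethodRegistry()
calls = []  # stands in for commands invoked inside a transaction


@REGISTRY.method(action='refresh', display='Refresh Threads')
def refresh_threads(node: Schema('ThreadContainer')):
    """Refresh the list of threads under the given container."""
    # A real method would open a transaction and call into commands.py.
    calls.append('put_threads')
```

The registry entry is what lets the GUI decide where to offer the method: here, on any node whose schema is `ThreadContainer`, under the menu label "Refresh Threads".
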
For **drgn**, we implemented `put`/`refresh` methods for threads, frames, registers (`putreg`), and local variables, then modules and sections, memory and regions, the environment, and finally processes.
We also implemented `putmem` using **drgn**'s `read` API.
*Symbols* was another possibility, but, for the moment, populating symbols seemed too expensive.
Instead, `retrieve_symbols` was added to allow per-pattern symbols to be added.
Unfortunately, the **drgn** API doesn't support wildcards, so eventually some other strategy will be necessary.
The remaining set of Python functions, `hooks.py`, comprises callbacks for various events sent by the native debugger.
The current **drgn** code has no event system.
A set of skeletal methods has been left in place as (a) we can use the single-step button as a stand-in for "update state", and (b) some discussion exists in the **drgn** user forums regarding eventually implementing more control functionality.
For anyone implementing `hooks.py`, the challenging logic resides in the event loop, particularly if there is a need to move back-and-forth between the debugger and a *repl*.
Also, distinctions need to be made between control commands, which wait for events, and commands which rely on a callback but complete immediately.
As a rule of thumb, we *push* to Ghidra, i.e. Ghidra issues requests asynchronously, and the agent must update the trace database.
### Revisiting the schema
At this point, revisiting and editing the schema may be called for.
For example, for **drgn**, it's not obvious that there can ever be more than one session, so it may be cleaner to embed *Processes* at the root.
This, in turn, requires editing the `commands.py` and `methods.py` patterns.
Similarly, as breakpoints are not supported, the breakpoint-related entries may safely be deleted.
In general, the schema can be structured however you like, but there are several details worth mentioning.
Interfaces generally need to be respected for various functions in the GUI to work.
Process, thread, frame, module, section, and memory elements can be named arbitrarily, but their interfaces must be named correctly.
Additionally, the logic for finding objects in the tree is quite complicated.
If elements need to be traversed as part of the default search process, their containers must be tagged `canonical`.
If attributes need to be traversed, their parents should have the interface `Aggregate`.
Each entry may have `elements` of the same type ordered by keys, and `attributes` of arbitrary type.
The `element` entry describes the schema for all elements; the schema for attributes may be given explicitly using named `attribute` entries or defaulted using the unnamed `attribute` entry, typically `<attribute schema="VOID">` or `<attribute schema="ANY">`.
The schema for any element in the **Model** View is visible using the hover, which helps substantially when trying to identify schema traversal errors.
Schema entries may be marked `hidden=yes` with the obvious result.
Additionally, certain attribute names and schemas have special properties.
For example, `_display` defines the visible ID for an entry in the **Model** tree, and `ADDRESS` and `RANGE` mark attributes which are navigable.
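
To make these rules concrete, the sketch below embeds a heavily trimmed schema fragment in Python and summarizes it. The fragment is hypothetical: the entry names are illustrative, and a real schema carries additional entries and settings not shown here. The `summarize` helper is likewise an assumption, written only to show how the pieces relate.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed schema fragment for illustration only.
SCHEMA_XML = """
<context>
  <schema name="Root">
    <interface name="Aggregate" />
    <attribute name="Processes" schema="ProcessContainer" />
    <attribute name="_display" schema="STRING" hidden="yes" />
    <attribute schema="VOID" />
  </schema>
  <schema name="ProcessContainer" canonical="yes">
    <element schema="Process" />
    <attribute schema="VOID" />
  </schema>
  <schema name="Process">
    <interface name="Process" />
    <attribute schema="ANY" />
  </schema>
</context>
"""


def summarize(xml_text):
    """Collect, per schema entry, the details discussed above."""
    summary = {}
    for schema in ET.fromstring(xml_text).findall('schema'):
        attrs = schema.findall('attribute')
        element = schema.find('element')
        summary[schema.get('name')] = {
            'canonical': schema.get('canonical') == 'yes',
            'interfaces': [i.get('name') for i in schema.findall('interface')],
            # The single element entry covers all elements.
            'element': element.get('schema') if element is not None else None,
            # Named attribute entries give explicit schemas...
            'named': {a.get('name'): a.get('schema')
                      for a in attrs if a.get('name')},
            # ...and the unnamed entry supplies the default.
            'default': next((a.get('schema') for a in attrs
                             if not a.get('name')), None),
        }
    return summary
```

In this fragment, the root carries `Aggregate` so that its attributes are traversed, and the container is tagged `canonical` so that its elements are traversed.
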
### Unit tests
The hardest part of writing unit tests is almost always getting the first test to run, and the easiest unit tests, as with the Python files, are those for `commands.py`.
For **drgn**, as before, we're using **dbgeng** as the pattern, but several elements had to be changed.
Because the launchers execute a script, we need to amend the `runThrowError` logic (and, more specifically, the `execInPython` logic) in [`AbstractDrgnTraceRmiTest`](../../../Ghidra/Test/DebuggerIntegrationTest/src/test.slow/java/agent/drgn/rmi/AbstractDrgnTraceRmiTest.java) with a `ProcessBuilder` call that takes a script, rather than writing the script to stdin.
While there, we can also trim out the unnecessary helper logic around items like breakpoints, watchpoints, etc. from all of the test classes.
JUnits for `methods.py` follow a similar pattern, but, again, getting the first one to run is often the most difficult.
For **drgn**, we've had to override the timeouts in `waitForPass` and `waitForCondition`.
After starting with hardcoded paths for the test target, we also had to add logic to re-write the `PREAMBLE` on-the-fly in `execInDrgn`.
Obviously, with no real `hooks.py` logic, there's no need for `DrgnHooksTest`.
Of note, we've used the gdb `gcore` command to create a core dump for the tests.
Both user- and kernel-mode require privileges to run the debugger, and, for testing, that's not ideal.
### Documentation
The principal piece of documentation for all new debuggers is a description of the launchers.
Right now, the [`TraceRmiLauncherServicePlugin.html`](../../../Ghidra/Debug/Debugger-rmi-trace/src/main/help/help/topics/TraceRmiConnectionManagerPlugin/TraceRmiLauncherServicePlugin.html) file in `Debug/Debugger-rmi-trace` contains all of this information.
One detail to note: the `#@help` locations in the launchers themselves ought to match the HTML tags in the file, as should the launcher names.
### Extended features
Once everything else is done, it may be worth considering additional functionality specific to the debugger. This can be made available in either `commands.py` or `methods.py`.
For **drgn**, we've added `attach` methods that allow the user to attach to additional programs.

View file

@ -17,6 +17,7 @@ all: \
B2-Emulation.html \ B2-Emulation.html \
B3-Scripting.html \ B3-Scripting.html \
B4-Modeling.html \ B4-Modeling.html \
B5-AddingDebuggers.html \
README.html README.html
clean: clean:

View file

@ -8,5 +8,6 @@
class="advanced" href="B1-RemoteTargets.html">Remote Targets</a><a class="advanced" href="B1-RemoteTargets.html">Remote Targets</a><a
class="advanced" href="B2-Emulation.html">Emulation</a><a class="advanced" href="B2-Emulation.html">Emulation</a><a
class="advanced" href="B3-Scripting.html">Scripting</a><a class="advanced" href="B3-Scripting.html">Scripting</a><a
class="advanced" href="B4-Modeling.html">Modeling</a> class="advanced" href="B4-Modeling.html">Modeling</a><a
class="advanced" href="B5-AddingDebuggers.html">Adding Debuggers</a>
</header> </header>

View file

@ -74,6 +74,8 @@ GhidraClass/Debugger/B3-Scripting.html||GHIDRA||||END|
GhidraClass/Debugger/B3-Scripting.md||GHIDRA||||END| GhidraClass/Debugger/B3-Scripting.md||GHIDRA||||END|
GhidraClass/Debugger/B4-Modeling.html||GHIDRA||||END| GhidraClass/Debugger/B4-Modeling.html||GHIDRA||||END|
GhidraClass/Debugger/B4-Modeling.md||GHIDRA||||END| GhidraClass/Debugger/B4-Modeling.md||GHIDRA||||END|
GhidraClass/Debugger/B5-AddingDebuggers.html||GHIDRA||||END|
GhidraClass/Debugger/B5-AddingDebuggers.md||GHIDRA||||END|
GhidraClass/Debugger/Makefile||GHIDRA||||END| GhidraClass/Debugger/Makefile||GHIDRA||||END|
GhidraClass/Debugger/README.html||GHIDRA||||END| GhidraClass/Debugger/README.html||GHIDRA||||END|
GhidraClass/Debugger/README.md||GHIDRA||||END| GhidraClass/Debugger/README.md||GHIDRA||||END|