diff --git a/Ghidra/Features/Decompiler/src/decompile/cpp/architecture.cc b/Ghidra/Features/Decompiler/src/decompile/cpp/architecture.cc
index 8238afc9d4..cb96bbbbce 100644
--- a/Ghidra/Features/Decompiler/src/decompile/cpp/architecture.cc
+++ b/Ghidra/Features/Decompiler/src/decompile/cpp/architecture.cc
@@ -26,8 +26,8 @@
vector
+
+ Union data-types are fully supported. The Decompiler does not automatically infer unions
+ when analyzing a function; it propagates them into the function from explicitly
+ annotated sources, like input parameters or global variables.
+
+ A union data-type, similarly to a structure, is made up of component data-types
+ called fields. But unlike a structure, a union's fields all share the same underlying
+ storage. When the union is applied to a variable, each field potentially describes the whole variable.
+ At any given point where the variable is read or written, a different field may be in effect, even if the
+ underlying data hasn't changed. The Decompiler attempts to infer the particular field by following data-flow
+ to or from the point of use to determine which field best aligns with the specific operations being applied to the
+ variable. The name of this recovered field is then printed in Decompiler output using syntax similar to that
+ used for structure fields.
+
+ Depending on the number and variety of fields within the union, it may not be possible
+ to fully distinguish which field is being used in a specific context. In this situation,
+ the Decompiler chooses the first field from the list of best matches. The user has the
+ option of changing this choice with the Force Field action.
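As a source-level illustration of the ambiguity described above, here is a minimal C sketch. The union `Value` and the functions `next_id` and `half` are hypothetical names introduced for this example; they are not taken from Ghidra. Both fields cover the same four bytes, so only the operation applied to the variable hints at which field is in effect.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union: both fields overlay the same 4 bytes of storage. */
typedef union {
    uint32_t as_int;    /* integer view of the storage */
    float    as_float;  /* floating-point view of the same storage */
} Value;

/* The sketch assumes both views have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "fields must overlay exactly");

/* Integer arithmetic on the variable suggests the as_int field is in effect. */
uint32_t next_id(Value v) {
    return v.as_int + 1u;
}

/* Floating-point arithmetic suggests the as_float field is in effect. */
float half(Value v) {
    return v.as_float * 0.5f;
}
```

A plain copy of a `Value` from one variable to another would give no such hint, which is the kind of situation where a manual field choice may be needed.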
+
+ Typedef data-types are fully supported. The Decompiler does not automatically infer typedefs
+ when analyzing a function; it propagates them into the function from explicitly annotated sources.
+
+ A typedef is a copy of another data-type but with an alternate name.
+ In most cases it can be used interchangeably with the data-type it copies.
+ In general, the Decompiler treats a typedef as a distinct data-type, and it will maintain its identity
+ when it is assigned to variables and is propagated through data-flow.
+
+ Ghidra supports a specific set of attributes that can be placed directly on a typedef
+ that then distinguish it from the data-type it copies. This allows Ghidra to support some
+ non-standard data-types, although the typedef and its copy are no longer interchangeable.
+ The Decompiler supports the following typedef properties:
+
+ The Decompiler supports some specialized attributes that can be applied to pointer data-types, like offsets
+ and address spaces (See below). Ghidra implements these attributes on top of typedef data-types only. In
+ order to add attributes to pointers, a typedef of the underlying pointer data-type must be created first.
+ Attributes can then be placed directly on the typedef from the Data Type Manager window
+ (See Pointer-Typedef Settings).
+
+ An offset pointer points at a fixed offset relative to the start of its
+ underlying data-type. Typically the underlying data-type is a structure and the pointer points at a
+ specific field in the interior of the structure. But in general, the underlying data-type can be anything,
+ and the offset can point anywhere relative to that data-type, including either before or after.
+
+ An offset pointer is defined with all the same properties as a normal pointer. It has an underlying
+ data-type and a size. On top of this an offset is specified
+ as an integer attribute on the pointer (typedef). This is the number of bytes that need to be
+ added to the start of the underlying data-type to obtain the address actually being pointed at.
+
+ Because the underlying data-type does not start directly at the address
+ contained in the offset pointer, one can also refer to the offset pointer's
+ direct data-type, i.e. the data-type that is
+ directly at the address contained in the pointer. If the pointer's offset is positive (and small),
+ the direct data-type will generally be that of a field of the underlying
+ data-type. If the offset is bigger than the size of the underlying data-type or is negative,
+ the direct data-type will be undefined.
+
+ Offset pointers occur in code where the compiler has maintained knowledge of the position of
+ an underlying data-type relative to a pointer, even if the pointer no longer points directly at the data-type.
+ Because of this, the code may still access fields of the underlying data-type through the pointer.
+ Annotating a variable with an offset pointer allows the Decompiler to recover these accesses.
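The kind of source code that tends to produce offset pointers can be sketched in C as follows. This is an illustrative assumption about the original source, not Decompiler output: `struct Packet`, `packet_from_payload`, and `payload_owner_id` are hypothetical names. The pointer `p` addresses an interior field, and the code steps back by a fixed byte offset to reach the start of the underlying structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure playing the role of the underlying data-type. */
struct Packet {
    int  id;
    int  length;
    char payload[16];
};

/* p points at the payload field, i.e. at a fixed offset of
   offsetof(struct Packet, payload) bytes from the start of the structure.
   Subtracting that offset recovers the enclosing structure. */
struct Packet *packet_from_payload(char *p) {
    assert(p != NULL);
    return (struct Packet *)(p - offsetof(struct Packet, payload));
}

/* Accesses a field of the underlying structure through the interior pointer. */
int payload_owner_id(char *p) {
    return packet_from_payload(p)->id;
}
```

In a compiled binary, the subtraction and the field access are typically folded into a single displacement from `p`, which is why annotating `p` with the offset is needed to recover the field name.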
+
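+ Within the Decompiler's output, the token ADJ is used to indicate that the code is
+ accessing the underlying data-type through the offset pointer. The token uses functional syntax
+ to indicate the particular offset pointer. Then, once the ADJ token is
+ applied, additional pointer syntax is used, i.e. ->, to indicate what part
+ of the underlying data-type is being accessed.
+
+ If the offset pointer appears in Decompiler output without the ADJ token being
+ applied, it is being treated as if it were a normal pointer to its direct
+ data-type. This generally indicates the pointer is being used to access data outside the
+ underlying data-type.
+
+ An address space pointer is a normal pointer data-type with a specific
+ address space associated with it (See Address Space). It is created by setting
+ the Address Space attribute on a typedef of a pointer. The attribute value is the name of the specific
+ address space.
+
+ Address space pointers are useful when a program architecture supports more than one address space
+ containing addressable memory, such as separate code and data address spaces.
+ For a program and a specific section of its code that manipulates a pointer, it may not be easy to determine
+ which address space is being referred to. Address space pointers provide an additional annotation mechanism
+ to help the Decompiler identify the correct address space for a pointer in context.
+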
+ The Decompiler will automatically propagate an address space pointer data-type from parameters and
+ other annotated variables associated with a function. Any constant that the pointer reaches via propagation
+ is assumed to point into the address space associated with the pointer. The correct symbol can then
+ be looked up, further informing the Decompiler output.
+
@@ -1207,18 +1351,20 @@
The volatile mutability setting indicates that values within
the memory region may change unexpectedly, even if the code currently executing does not directly
- write to it. If a volatile variable is accessed in a function being analyzed by the Decompiler,
- each specific access is replaced with a built-in function call, which prevents constant propagation
- and other transforms across the access. The built-in functions are named based on
- whether the access is a read or write and on the size
- of the access. Within the Decompiler output, the first parameter to a built-in function is a symbol
- indicating the volatile variable. The function returns a value in the case of a volatile read or
- takes a second parameter in the case of a volatile write.
+ write to it. Accessing a variable within a volatile region, either reading or writing, can have other
+ side-effects on the machine state, and it cannot in general be treated as a normal variable.
+ If a volatile variable is accessed in a function being analyzed by the Decompiler,
+ each access is expressed as a copy statement on its own line, separated from other expressions,
+ so that its position within the code and any sequence of accesses is clearly indicated.
+ Any access, either read or write, will always be displayed, even if the value is not directly
+ used by the function. The token representing the variable will be displayed using the
+ Special color, highlighting that the access is volatile
+ (See Color for <token>).
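For comparison, the same idea at the C source level can be sketched as below. This is only an analogy under stated assumptions: `status_reg` is a hypothetical register modeled by an ordinary volatile variable, and `read_status_twice` and `set_flag` are made-up names. Every read and write of the volatile variable has to be kept, in order, even when the value read is never used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register, modeled here by an ordinary volatile variable. */
static volatile uint8_t status_reg;

/* Two volatile reads: both must be performed, in order, and cannot be merged. */
uint8_t read_status_twice(void) {
    uint8_t first = status_reg;
    uint8_t second = status_reg;
    return (uint8_t)(first | second);
}

/* A volatile write followed by a read whose value is discarded.
   Neither access may be removed. */
void set_flag(unsigned bit) {
    assert(bit < 8);
    status_reg = (uint8_t)(1u << bit);
    (void)status_reg;
}
```

Each assignment or read of `status_reg` here corresponds to the kind of standalone copy statement described above.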
diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html
index 8f0a6ba0ff..c8fb1f1e28 100644
--- a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html
+++ b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html
@@ -364,6 +364,8 @@
Types - names of data-types in variable declarations and casts
diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html
index f70b709740..40eec57c0f 100644
--- a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html
+++ b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html
@@ -18,7 +18,7 @@
function in the Code Browser, then select the
Exports the decompiled result of the current function to a file. A file chooser
@@ -265,7 +265,7 @@
Creates a new Snapshot window. The Snapshot window
@@ -282,7 +282,7 @@
Triggers a re-decompilation of the current function displayed in the window.
@@ -310,7 +310,7 @@
Copies the currently selected text in the Decompiler window to the clipboard.
@@ -349,12 +349,15 @@
window tool/title bar.
- Generate an data flow graph based upon the results in the active Decompiler window,
+ Generate a data flow graph based upon the results in the active Decompiler window,
and render it using the current Graph Service.
+
+
ADJ
is used to indicate that the code is
+ accessing the underlying data-type through the offset pointer. The token uses functional syntax
+ to indicate the particular offset pointer. Then, once the ADJ
token is
+ applied, additional pointer syntax is used, i.e. ->
, to indicate what part
+ of the underlying data-type is being accessed.
+
+ ADJ(structoffptr)->field1 = 2; // Accessing the underlying structure's field
+ iVar1 = *ADJ(intoffptr); // Accessing the underlying integer data-type
+ ADJ(arrayoffptr)[4] = iVar2; // Accessing the underlying array
+
+ ADJ
token being
+ applied, it is being treated as if it were a normal pointer to its direct
+ data-type. This generally indicates the pointer is being used to access data outside the
+ underlying data-type.
+ code
and data
address spaces.
+ For a program and a specific section of its code that manipulates a pointer, it may not be easy to determine
+ which address space is being referred to. Address space pointers provide an additional annotation mechanism
+ to help the decompiler identify the correct address space for a pointer in context.
+
- X = read_volatile_2(SREG);
- write_volatile_1(DAT_mem_002b,0x20);
+ X = SREG; // Reading volatile SREG
+ DAT_mem_002b = 0x20; // Writing volatile DAT
- �icon from the tool bar, or the
+ icon from the tool bar, or the
Decompile option from the
Window menu in the tool.
- �or selecting
+ or selecting
Decompile from the Window menu in the tool
brings up the main window. The main window always displays the function
at the current address within the Code Browser and follows as the user navigates
@@ -153,7 +153,7 @@
Pressing the
- �icon
+ icon
in any Decompiler window's toolbar causes a Snapshot window
to be created, which shows decompilation of the same function.
Unlike the main window however, the Snapshot window
@@ -240,7 +240,7 @@
- �- button
+ - button
- �- button
+ - button
- �- button
+ - button
- �- button
+ - button
+
Double-clicking a '{' or '}' token causes the window to navigate to the matching brace within the window. The cursor is set and the window view is adjusted if necessary to ensure that the matching brace is visible. +
++ Braces may also be navigated via the keyboard.
-- Braces may also be navigated - via the keyboard. -
-@@ -511,16 +513,80 @@
- All the actions described in this section can be activated from the menu that pops up + Almost all the actions described in this section can be activated from the menu that pops up when right-clicking on a token within the Decompiler window. The pop-up menu is context sensitive and the type of token in particular (see Display) determines what actions are available. The token clicked provides a local context for the action and may be used to pinpoint the exact - variable or operation affected. + variable or operation affected. +
++ Many of the actions have a key binding, which, if it exists, will be listed in the pop-up menu. + If the key binding is used to activate the action, local context is determined by the current cursor + position. Key bindings can be reassigned from the + Key Bindings + section of the Tool Options Dialog.
++ Create a pointer with an offset into its underlying data-type and apply it to the + selected variable. +
++ The action is available on variable tokens that the Decompiler has already determined are pointers, and + is designed to create an offset pointer, which is a typedef with its + component offset attribute set (See Offset Pointers). It + brings up a dialog that allows the user to select: +
++ If the Decompiler determines that the selected variable points into a known structure, + this information is used to prepopulate the dialog, with the structure used as the + underlying data-type. +
++ Clicking OK in the dialog immediately creates the new pointer data-type and attaches it to the selected variable, + triggering a new decompilation. The annotation may reveal new field accesses or other markup associated with the + underlying data-type, which will be immediately visible once decompilation completes. +
++ If the user selects the name of an offset pointer that already exists and its other properties match the dialog, + the existing pointer will be used to complete the action, rather than creating a new data-type. If an + unrelated data-type already exists with the selected name, the action cannot be completed and the dialog + will display a warning. +
+![]() |
++ |
---|---|
+ If the offset pointer already exists but its offset does not match the value specified + in the dialog, the user is given the option of changing the existing data-type's offset. Confirming + this change alters the existing data-type, which may affect analysis of other unrelated parts of + the program that use it. + |
+ Once an offset pointer exists, either by invoking this action or by creating it manually + (see Pointer-Typedef Settings), + it can be applied to variables like any other data-type, either through the Listing window or using the + Retype Variable action in the Decompiler window. +
+- The command brings up a dialog where a search pattern can entered as a raw string or regular expression. + The command brings up a dialog where a search pattern can be entered as a raw string or regular expression. The search is performed directly on the text of the decompiled function, from the current cursor; forward each time the Next button is hit, or backward for the Previous button. Any match is highlighted in the window, @@ -812,24 +878,58 @@
+ Force the Decompiler to assume a specific read or write to a variable, with a union data-type, + is referring to a user selected field. +
++ A union data-type overlays multiple other data-type interpretations + (fields) on top of one variable; any one of which may be active at + a point where the variable is accessed. The Decompiler automatically selects which field it thinks is being + referenced (See Unions). But this action allows the user to force + the Decompiler to use a particular field. +
+
+ The cursor must be on the token currently representing the union field to be changed.
+ The action brings up a dialog with a drop-down menu of the possible fields that can be forced.
+ A field must be compatible with the selected access or it will not be listed in the menu. This usually
+ means that the data-type of the field must be the same size as the read or write being performed.
+ The menu option (no field) may also be available, which if selected, indicates that
+ the Decompiler should treat the read or write as accessing the union as a whole.
+ It is also possible that no field other than the current one is compatible, in which case the dialog will
+ display a warning message and contain no drop-down menu.
+
+ Selecting a menu option and clicking OK completes the action. Subsequent decompilation should show + the new field at the selected access point, and the data-type associated with the selected field will + likely propagate further across the function, possibly triggering substantial changes to the output. +
++ If there is more than one access to the same variable with a union data-type, + each representative field token can be forced to a new field individually. +
+- These actions are available from the keyboard. + These actions are available from the keyboard, with no equivalent pop-up menu entry. Shift-Open Bracket will go to the previous enclosing open brace. Shift-Close Bracket will go to the - next enclosing closing brace. These key bindings can be changed via the tool options. + next enclosing closing brace. These key bindings can be changed via the + Tool Options Dialog.
- Paired braces can also be navigated by double-clicking. + Paired braces can also be navigated by double-clicking.
Display locations in the Program that reference the function or - global variable - associated with the token under the cursor. Results are collected in a + global variable associated with the token under the cursor. + Results are collected in a Location References Dialog, which displays a row for each reference with its address and other context.
@@ -1024,7 +1124,7 @@Display locations in the Program that reference the address @@ -1073,7 +1173,6 @@
-- A label can be removed by triggering this action while the corresponding label token is + A label can be removed by triggering this action while the corresponding label token is under the cursor. For more on removing labels, see - Removing a Label. + Removing a Label.
The change will be immediately visible across all references to the label @@ -1092,7 +1191,6 @@