diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerAnnotations.html b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerAnnotations.html index 78cdcdb43c..7bc42707b6 100644 --- a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerAnnotations.html +++ b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerAnnotations.html @@ -1,10 +1,10 @@ - + Program Annotations Affecting the Decompiler - + diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerConcepts.html b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerConcepts.html index 23afd21179..a1ac4908f9 100644 --- a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerConcepts.html +++ b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerConcepts.html @@ -1,10 +1,10 @@ - + Decompiler Concepts - + @@ -182,7 +182,7 @@

-

Table . P-code Operations

+

Table . P-code Operations

diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerIntro.html b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerIntro.html index df27356c0c..d6d603e624 100644 --- a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerIntro.html +++ b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerIntro.html @@ -1,10 +1,10 @@ - +Decompiler - + @@ -64,7 +64,7 @@
  • Press the -  icon +  icon in the tool bar, or
  • diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html index 1eb7f91fcf..2742292129 100644 --- a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html +++ b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerOptions.html @@ -1,10 +1,10 @@ - + Decompiler Options - + @@ -38,14 +38,14 @@ Decompiler - lists General Options that affect the engine behavior.
  • -   +   -  Analysis - lists Analysis Options that affect the Decompiler's transformation process. +  Analysis - lists Analysis Options that affect the Decompiler's transformation process.
  • -   +   -  Display - lists Display Options that affect the final presentation of Decompiler output. +  Display - lists Display Options that affect the final presentation of Decompiler output.
  • diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html index 7697d2d955..bfa016a42e 100644 --- a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html +++ b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/DecompilerWindow.html @@ -1,10 +1,10 @@ - +Decompiler Window - + @@ -18,7 +18,7 @@ function in the Code Browser, then select the -  icon from the tool bar, or the +  icon from the tool bar, or the Decompile option from the Window menu in the tool.

    @@ -91,7 +91,7 @@ Initially pressing -  or selecting +  or selecting Decompile from the Window menu in the tool brings up the main window. The main window always displays the function at the current address within the Code Browser and follows as the user navigates @@ -153,7 +153,7 @@ Pressing the -  icon +  icon in any Decompiler window's toolbar causes a Snapshot window to be created, which shows decompilation of the same function. Unlike the main window however, the Snapshot window @@ -240,7 +240,7 @@

    -  - button +  - button

    Exports the decompiled result of the current function to a file. A file chooser @@ -265,7 +265,7 @@

    -  - button +  - button

    Creates a new Snapshot window. The Snapshot window @@ -282,7 +282,7 @@

    -  - button +  - button

    Triggers a re-decompilation of the current function displayed in the window. @@ -310,7 +310,7 @@

    -  - button +  - button

    Copies the currently selected text in the Decompiler window to the clipboard. diff --git a/GhidraDocs/languages/html/additionalpcode.html b/GhidraDocs/languages/html/additionalpcode.html index f3de3eb25a..86dec10094 100644 --- a/GhidraDocs/languages/html/additionalpcode.html +++ b/GhidraDocs/languages/html/additionalpcode.html @@ -1,10 +1,10 @@ - + Additional P-CODE Operations - + @@ -16,9 +16,9 @@

    - - + +
    Additional P-CODE Operations
    -Prev   Next +Prev   Next
    @@ -437,15 +437,15 @@ to SLEIGH bitrange syntax such as out - - + + - + - +
    -Prev   Next +Prev   Next
    Pseudo P-CODE Operations Pseudo P-CODE Operations  Home Syntax Reference Syntax Reference
    diff --git a/GhidraDocs/languages/html/pcodedescription.html b/GhidraDocs/languages/html/pcodedescription.html index 760646ac02..c73f909037 100644 --- a/GhidraDocs/languages/html/pcodedescription.html +++ b/GhidraDocs/languages/html/pcodedescription.html @@ -1,10 +1,10 @@ - + P-Code Operation Reference - + @@ -16,9 +16,9 @@ P-Code Operation Reference -Prev  -  - Next +Prev  +  + Next @@ -121,7 +121,7 @@ input0 and output must be the same.

    This instruction loads data from a dynamic location into the output -variable by dereferencing a pointer. The “pointer” comes in two +variable by dereferencing a pointer. The “pointer” comes in two pieces. One piece, input1, is a normal variable containing the offset of the object being pointed at. The other piece, input0, is a constant indicating the space into which the offset applies. The data in input1 @@ -133,7 +133,7 @@ loaded by this instruction is determined by the size of the output variable. It is easy to confuse the address space of the output and input1 variables and the Address Space represented by the ID, which could all be different. Unlike many programming models, there are -multiple spaces that a “pointer” can refer to, and so an extra ID is +multiple spaces that a “pointer” can refer to, and so an extra ID is required.

    @@ -194,7 +194,7 @@ correct byte offset into the space. This instruction is the complement of LOAD. The data in the variable input2 is stored at a dynamic location by dereferencing a pointer. As -with LOAD, the “pointer” comes in two +with LOAD, the “pointer” comes in two pieces: a space ID part, and an offset variable. The size of input1 must match the address space specified by the ID, and the amount of data stored is determined by the size of input2. @@ -264,7 +264,7 @@ of the current machine instruction. This allows branching within the operations forming a single instruction. For example, if the BRANCH occurs as the pcode operation with index 5 for the instruction, it can branch to operation -with index 8 by specifying a constant destination “address” of +with index 8 by specifying a constant destination “address” of 3. Negative constants can be used for backward branches.

    @@ -1821,7 +1821,7 @@ sign-extended to the desired size. This is an unsigned integer division operation. Divide input0 by input1, truncating the result to the nearest integer, and store the result in output. Both inputs and output must be the same size. There -is no handling of division by zero. To simulate a processor’s handling +is no handling of division by zero. To simulate a processor’s handling of a division-by-zero trap, other operations must be used before the INT_DIV.

    @@ -1923,7 +1923,7 @@ This is a signed integer division operation. The resulting integer is the one closest to the rational value input0/input1 but which is still smaller in absolute value. Both inputs and output must be the same size. There is no handling of division by zero. To simulate a -processor’s handling of a division-by-zero trap, other operations must +processor’s handling of a division-by-zero trap, other operations must be used before the INT_SDIV.

    @@ -3024,15 +3024,15 @@ Input0 and output can be different sizes. - - + + - + - +
    -Prev   Next +Prev   Next
    P-Code Reference Manual P-Code Reference Manual  Home Pseudo P-CODE Operations Pseudo P-CODE Operations
    diff --git a/GhidraDocs/languages/html/pcoderef.html b/GhidraDocs/languages/html/pcoderef.html index 755bbda0c3..0f26ca1c75 100644 --- a/GhidraDocs/languages/html/pcoderef.html +++ b/GhidraDocs/languages/html/pcoderef.html @@ -1,10 +1,10 @@ - + P-Code Reference Manual - + @@ -13,9 +13,9 @@ - - - + +
    P-Code Reference Manual
       Next +   Next
    @@ -195,7 +195,7 @@ to as raw p-code. Raw p-code can be u instruction execution and generally follows the same control-flow, although it may add some of its own internal control-flow. The subset of opcodes that can occur in raw p-code is described in -the section called “P-Code Operation Reference” and in the section called “Pseudo P-CODE Operations”, making up +the section called “P-Code Operation Reference” and in the section called “Pseudo P-CODE Operations”, making up the bulk of this document.

    @@ -209,7 +209,7 @@ opcodes. Two of these, MULTIEQUAL and INDIRECT, are specific to the graph construction process, but other opcodes can be introduced during subsequent analysis and transformation of a graph and help hold recovered data-type relationships. -All of the new opcodes are described in the section called “Additional P-CODE Operations”, none of which can occur +All of the new opcodes are described in the section called “Additional P-CODE Operations”, none of which can occur in the original raw p-code translation. Finally, a few of the p-code operators, CALL, CALLIND, and RETURN, @@ -319,7 +319,7 @@ its opcode. For almost all p-code operations, only the output varnode can have its value modified; there are no indirect effects of the operation. The only possible exceptions are pseudo operations, -see the section called “Pseudo P-CODE Operations”, which are sometimes necessary when there +see the section called “Pseudo P-CODE Operations”, which are sometimes necessary when there is incomplete knowledge of an instruction's behavior.

    @@ -342,7 +342,7 @@ The list of possible opcodes are similar to many RISC based instruction sets. The effect of each opcode is described in detail in the following sections, and a reference table is given -in the section called “Syntax Reference”. In general, the size or +in the section called “Syntax Reference”. In general, the size or precision of a particular p-code operation is determined by the size of the varnode inputs or output, not by the opcode.

    @@ -353,15 +353,15 @@ of the varnode inputs or output, not by the opcode.
    - - - + + - - - + + +
       Next +   Next
       P-Code Operation Reference   P-Code Operation Reference
    diff --git a/GhidraDocs/languages/html/pseudo-ops.html b/GhidraDocs/languages/html/pseudo-ops.html index 00dc0da687..bbcc9f6516 100644 --- a/GhidraDocs/languages/html/pseudo-ops.html +++ b/GhidraDocs/languages/html/pseudo-ops.html @@ -1,10 +1,10 @@ - + Pseudo P-CODE Operations - + @@ -16,9 +16,9 @@ Pseudo P-CODE Operations -Prev  -  - Next +Prev  +  + Next @@ -104,7 +104,7 @@ parameter. Exact details are processor and specification dependent. Ideally, the output parameter is determined by the input parameters, and no variable is affected except the output parameter. But this is no longer a strict requirement, side-effects are possible. -Analysis should generally treat these instructions as a “black-box” which +Analysis should generally treat these instructions as a “black-box” which still have normal data-flow and can be manipulated symbolically.

    @@ -225,15 +225,15 @@ not modeled in these cases, so the operator serves as a placeholder to allow ana - - + + - + - +
    -Prev   Next +Prev   Next
    P-Code Operation Reference P-Code Operation Reference  Home Additional P-CODE Operations Additional P-CODE Operations
    diff --git a/GhidraDocs/languages/html/reference.html b/GhidraDocs/languages/html/reference.html index d009fef09d..0212d3ec62 100644 --- a/GhidraDocs/languages/html/reference.html +++ b/GhidraDocs/languages/html/reference.html @@ -1,10 +1,10 @@ - + Syntax Reference - + @@ -15,9 +15,9 @@ Syntax Reference -Prev  -  -  +Prev  +  + 
    @@ -515,14 +515,14 @@ - - +Prev  + + - + - +
    -Prev   Â Â 
    Additional P-CODE Operations Additional P-CODE Operations  Home Â 
    diff --git a/GhidraDocs/languages/html/sleigh.html b/GhidraDocs/languages/html/sleigh.html index afd7e24e6f..953a597d27 100644 --- a/GhidraDocs/languages/html/sleigh.html +++ b/GhidraDocs/languages/html/sleigh.html @@ -1,21 +1,21 @@ - + SLEIGH - + - +

    -1. Introduction to P-Code

    +1. Introduction to P-Code

    Although p-code is a distinct language from SLEIGH, because a major purpose of SLEIGH is to specify the translation from machine code to p-code, this document serves as a primer for p-code. The key concepts and terminology are presented in this section, and more detail is -given in Section 7.7, “The Semantic Section”. There is also a complete set +given in Section 7.7, “The Semantic Section”. There is also a complete set of tables which list syntax and descriptions for p-code operations in the Appendix.

    @@ -179,9 +179,9 @@ general purpose processor. Code for different processors can be translated in a straightforward manner into p-code, and then a single suite of analysis software can be used to do data-flow analysis and decompilation. In this way, the analysis software -becomes retargetable, and it isn’t necessary to +becomes retargetable, and it isn’t necessary to redesign it for each new processor being analyzed. It is only -necessary to specify the translation of the processor’s instruction +necessary to specify the translation of the processor’s instruction set into p-code.

    @@ -221,7 +221,7 @@ respectively.

    -1.1. Address Spaces

    +1.1. Address Spaces

    An address space for p-code is a generalization of the indexed memory (RAM) that a typical processor has access to, and @@ -261,7 +261,7 @@ Typically, a processor can be modeled with only two spaces, a ram address space that represents the main memory accessible to the processor via its data-bus, and a register address space that is used to -implement the processor’s registers. However, the specification +implement the processor’s registers. However, the specification designer can define as many address spaces as needed.

    @@ -272,14 +272,14 @@ semantics into individual p-code operations. It is called the unique space. There is also a special address space, called the const space, used as a placeholder for constant operands of p-code instructions. For the most -part, a SLEIGH specification doesn’t need to be aware of this space, +part, a SLEIGH specification doesn’t need to be aware of this space, but it can be used in certain situations to force values to be interpreted as constants.

    -1.2. Varnodes

    +1.2. Varnodes

    A varnode is the unit of data manipulated by p-code. It is simply a contiguous sequence of bytes in some address @@ -305,7 +305,7 @@ forces an interpretation on each varnode that it uses, as either an integer, a floating-point number, or a boolean value. In the case of an integer, the varnode is interpreted as having a big endian or little endian encoding, depending on the specification (see -Section 4.1, “Endianess Definition”). Certain instructions +Section 4.1, “Endianess Definition”). Certain instructions also distinguish between signed and unsigned interpretations. For a signed integer, the varnode is considered to have a standard twos complement encoding. For a boolean interpretation, the varnode must be @@ -322,7 +322,7 @@ must be provided and enforced by the specification designer.

    -1.3. Operations

    +1.3. Operations

    P-code is intended to emulate a target processor by substituting a sequence of p-code operations for each machine instruction. Thus every @@ -352,7 +352,7 @@ general purpose processor instruction sets. They break up into groups.

    -

    Table 1. P-code Operations

    +

    Table 1. P-code Operations

    @@ -414,7 +414,7 @@ general purpose processor instruction sets. They break up into groups.

    -We postpone a full discussion of the individual operations until Section 7.7, “The Semantic Section”. +We postpone a full discussion of the individual operations until Section 7.7, “The Semantic Section”.

    @@ -423,15 +423,15 @@ We postpone a full discussion of the individual operations until
    - - - + + - - - + + +
       Next +   Next
       2. Basic Specification Layout   2. Basic Specification Layout
    diff --git a/GhidraDocs/languages/html/sleigh_constructors.html b/GhidraDocs/languages/html/sleigh_constructors.html index 7582a38cef..0613c35bb7 100644 --- a/GhidraDocs/languages/html/sleigh_constructors.html +++ b/GhidraDocs/languages/html/sleigh_constructors.html @@ -1,24 +1,24 @@ - -7. Constructors + +7. Constructors - + - - + +

    -7. Constructors

    +7. Constructors

    Fields are the basic building block for family symbols. The mechanisms for building up from fields to the @@ -56,11 +56,11 @@ to think of a constructor as a kind of table in and of itself. But it is only the table that has an actual family symbol identifier associated with it. Most of this chapter is devoted to describing how to define a single constructor. The issues involved in combining -multiple constructors into a single table are addressed in Section 7.8, “Tables”. +multiple constructors into a single table are addressed in Section 7.8, “Tables”.

    -7.1. The Five Sections of a Constructor

    +7.1. The Five Sections of a Constructor

    A single complex statement in the specification file describes a constructor. This statement is always made up of five distinct @@ -92,12 +92,12 @@ in turn.

    -7.2. The Table Header

    +7.2. The Table Header

    Every constructor must be part of a table, which is the element with an actual family symbol identifier associated with it. So each constructor starts with the identifier of the table it belongs to -followed by a colon ‘:’. +followed by a colon ‘:’.

     mode1:           ...
    @@ -122,18 +122,18 @@ identifier.
     The identifier instruction is actually reserved
     for the root table, but should not be used in the table header as the
     SLEIGH parser uses the blank identifier to help distinguish assembly
    -mnemonics from operands (see Section 7.3.1, “Mnemonic”).
    +mnemonics from operands (see Section 7.3.1, “Mnemonic”).
     

    -7.3. The Display Section

    +7.3. The Display Section

    The display section consists of all characters -after the table header ‘:’ up to the SLEIGH -keyword is. The section’s primary +after the table header ‘:’ up to the SLEIGH +keyword is. The section’s primary purpose is to assign disassembly display meaning to the -constructor. The section’s secondary purpose is to define local +constructor. The section’s secondary purpose is to define local identifiers for the pieces out of which the constructor is being built. Characters in the display section are treated as literals with the following exceptions. @@ -151,7 +151,7 @@ the following exceptions.

  • - The character ‘^’ has special meaning. + The character ‘^’ has special meaning.
  • White space is trimmed from the beginning and end of the section. @@ -163,13 +163,13 @@ the following exceptions.

    -In particular, all punctuation except ‘^’ loses its special +In particular, all punctuation except ‘^’ loses its special meaning. Those identifiers that are not treated as literals are considered to be new, initially undefined, family symbols. We refer to these new symbols as the operands of the constructor. And for root constructors, these operands frequently correspond to the natural assembly operands. Thinking of it as a family symbol, the -constructor’s display meaning becomes the string of literals itself, +constructor’s display meaning becomes the string of literals itself, with each identifier replaced with the display meaning of the symbol corresponding to that identifier.

    @@ -182,11 +182,11 @@ mode1: ( op1 ),op2 is ... In the above example, a constructor for table mode1 is being built out of two pieces, symbol op1 and -symbol op2. The characters ‘(‘, ’)’, and ‘,’ +symbol op2. The characters ‘(‘, ’)’, and ‘,’ become literal parts of the disassembly display for symbol mode1. After the display strings for op1 and op2 are found, they are inserted into the -string of literals, forming the constructor’s display string. The +string of literals, forming the constructor’s display string. The white space characters surrounding the op1 identifier are preserved as part of this string.

    @@ -198,7 +198,7 @@ but only their identifiers are established in the display section.

    -7.3.1. Mnemonic

    +7.3.1. Mnemonic

    If the constructor is part of the root instruction table, the first string of characters in the display section that does not contain @@ -212,8 +212,8 @@ if it is legal.

    -In the above example, the string “var1” is treated as a symbol -identifier, but the string “and” is considered to be the mnemonic of +In the above example, the string “var1” is treated as a symbol +identifier, but the string “and” is considered to be the mnemonic of the instruction.

    @@ -230,10 +230,10 @@ no such requirement.

    -7.3.2. The '^' character

    +7.3.2. The '^' character

    -The ‘^’ character in the display section is used to separate -identifiers from other characters where there shouldn’t be white space +The ‘^’ character in the display section is used to separate +identifiers from other characters where there shouldn’t be white space in the disassembly display. This can be used in any manner but is usually used to attach display characters from a local symbol to the literal characters of the mnemonic. @@ -244,7 +244,7 @@ literal characters of the mnemonic.

    -In the above example, “bra” is treated as literal characters in the +In the above example, “bra” is treated as literal characters in the resulting display string followed immediately, with no intervening spaces, by the display string of the local symbol cc. Thus the whole constructor actually @@ -253,39 +253,39 @@ identifiers cc, op2.

    -If the ‘^’ is used as the first (non-whitespace) character in the +If the ‘^’ is used as the first (non-whitespace) character in the display section of a base constructor, this inhibits the first identifier in the display from being considered the mnemonic, as -described in Section 7.3.1, “Mnemonic”. This allows +described in Section 7.3.1, “Mnemonic”. This allows specification of less common situations, where the first part of the mnemonic, rather than perhaps a later part, needs to be considered as -an operand. An initial ‘^’ character can also facilitate certain +an operand. An initial ‘^’ character can also facilitate certain recursive constructions.

    -7.4. The Bit Pattern Section

    +7.4. The Bit Pattern Section

    Syntactically, this section comes between the keyword is and the delimiter for the -following section, either an ‘{‘ or an ‘[‘. The bit pattern +following section, either an ‘{‘ or an ‘[‘. The bit pattern section describes a -constructor’s pattern, the subset of possible +constructor’s pattern, the subset of possible instruction encodings that the designer wants to match the constructor being defined.

    -7.4.1. Constraints

    +7.4.1. Constraints

    The patterns required for processor specifications can almost always be described as a mask and value pair. Given a specific instruction encoding, we can decide if the encoding matches our pattern by looking at just the bits specified by the mask and seeing if they match a specific value. The fields, as -defined in Section 6.1, “Defining Tokens and Fields”, typically give us +defined in Section 6.1, “Defining Tokens and Fields”, typically give us our masks. So to construct a pattern, we can simply require that the field take on a specific value, as in the example below.

    @@ -294,9 +294,9 @@ field take on a specific value, as in the example below.

    Assuming the symbol opcode was defined as a field, this says that a -root constructor with mnemonic “halt” matches any instruction where +root constructor with mnemonic “halt” matches any instruction where the bits defining this field have the value 0x15. The equation -“opcode=0x15” is called a constraint. +“opcode=0x15” is called a constraint.

    The standard bit encoding of the integer is used when restricting the @@ -311,13 +311,13 @@ field.

    -7.4.2. The '&' and '|' Operators

    +7.4.2. The '&' and '|' Operators

    More complicated patterns are built out of logical operators. The meaning of these are fairly straightforward. We can force two or more constraints to be true at the same time, a logical -and ‘&’, or we can require that either one constraint or -another must be true, a logical or ‘|’. By using these with +and ‘&’, or we can require that either one constraint or +another must be true, a logical or ‘|’. By using these with constraints and parentheses for grouping, arbitrarily complicated patterns can be constructed.

    @@ -337,11 +337,11 @@ requires two or more mask/value style checks to correctly implement.

    -7.4.3. Defining Operands and Invoking Subtables

    +7.4.3. Defining Operands and Invoking Subtables

    The principle way of defining a constructor operand, left undefined from the display section, is done in the bit pattern section. If an -operand’s identifier is used by itself, not as part of a constraint, +operand’s identifier is used by itself, not as part of a constraint, then the operand takes on both the display and semantic definition of the global symbol with the same identifier. The syntax is slightly confusing at first. The identifier must appear in the pattern as if it @@ -390,13 +390,13 @@ parsers, a SLEIGH specification is in part a grammar specification. The terminal symbols, or tokens, are the bits of an instruction, and the constructors and tables are the non-terminating symbols. These all build up to the root instruction table, the -grammar’s start symbol. So this link from local to global is simply a +grammar’s start symbol. So this link from local to global is simply a statement of the grouping of old symbols into the new constructor.

    -7.4.4. Variable Length Instructions

    +7.4.4. Variable Length Instructions

    There are some additional complexities to designing a specification for a processor with variable length instructions. Some initial @@ -419,14 +419,14 @@ designer control over how tokens fit together.

    -7.4.4.1. The ';' Operator
    +7.4.4.1. The ';' Operator

    The most important operator for patterns defining variable length -instructions is the concatenation operator ‘;’. When building a +instructions is the concatenation operator ‘;’. When building a constructor with fields from two or more tokens, the pattern must explicitly define the order of the tokens. In terms of the logic of -the pattern expressions themselves, the ‘;’ operator has the same -meaning as the ‘&’ operator. The combined expression matches only if +the pattern expressions themselves, the ‘;’ operator has the same +meaning as the ‘&’ operator. The combined expression matches only if both subexpressions are true. However, it also requires that the subexpressions involve multiple tokens and explicitly indicates an order for them. @@ -456,7 +456,7 @@ corresponding encoding. The second instruction, add, uses fields op and reg, but it also uses field imm16 contained -in immtoken. The ‘;’ operator indicates that +in immtoken. The ‘;’ operator indicates that token base (via its fields) comes first in the encoding, followed by immtoken. The constraints on base will therefore correspond to constraints @@ -466,25 +466,25 @@ bytes. The length of the final encoding for add< will be 3 bytes, the sum of the lengths of the two tokens.

    -If two pattern expressions are combined with the ‘&’ or ‘|’ operator, -where the concatenation operator ‘;’ is also being used, the designer +If two pattern expressions are combined with the ‘&’ or ‘|’ operator, +where the concatenation operator ‘;’ is also being used, the designer must make sure that the tokens underlying each expression are the same and come in the same order. In the example add -instruction for instance, the ‘&’ operator combines the “op=3” and -“reg” expressions. Both of these expressions involve only the +instruction for instance, the ‘&’ operator combines the “op=3” and +“reg” expressions. Both of these expressions involve only the token base, so the matching requirement is -satisfied. The ‘&’ and ‘|’ operators can combine expressions built out +satisfied. The ‘&’ and ‘|’ operators can combine expressions built out of more than one token, but the tokens must come in the same -order. Also these operators have higher precedence than the ‘;’ +order. Also these operators have higher precedence than the ‘;’ operator, so parentheses may be necessary to get the intended meaning.

    -7.4.4.2. The '...' Operator
    +7.4.4.2. The '...' Operator

    -The ellipsis operator ‘...’ is used to satisfy the token matching -requirements of the ‘&’ and ‘|’ operators (described in the previous +The ellipsis operator ‘...’ is used to satisfy the token matching +requirements of the ‘&’ and ‘|’ operators (described in the previous section), when the operands are of different lengths. The ellipsis is a unary operator applied to a pattern expression that extends its token length before it is combined with another expression. Depending @@ -496,7 +496,7 @@ extension. addrmode: reg is reg & mode=0 { ... addrmode: #imm16 is mode=1; imm16 { ... -:xor “A”,addrmode is op=4 ... & addrmode { ... +:xor “A”,addrmode is op=4 ... & addrmode { ...

    @@ -527,7 +527,7 @@ whatever the length of addrmode turns out

    Since the op constraint occurs to the left of the ellipsis, it is considered left justified, and the matching -requirement for ‘&’ will insist that base is the +requirement for ‘&’ will insist that base is the first token in all forms of addrmode. This allows the xor instruction's constraint on op and the addrmode @@ -538,7 +538,7 @@ constraints on a single byte in the final encoding.

    -7.4.5. Invisible Operands

    +7.4.5. Invisible Operands

    It is not necessary for a global symbol, which is needed by a constructor, to appear in the display section of the definition. If @@ -549,15 +549,15 @@ operand. Such an operand behaves and is parsed exactly like any other operand but there is absolutely no visible indication of the operand in the final display of the assembly instruction. The one common type of instruction that uses this is the relative branch (see -Section 7.5.1, “Relative Branches”) but it is otherwise needed +Section 7.5.1, “Relative Branches”) but it is otherwise needed only in more esoteric instructions. It is useful in situations where you need to break up the parsing of an instruction along lines that -don’t quite match the assembly. +don’t quite match the assembly.

    -7.4.6. Empty Patterns

    +7.4.6. Empty Patterns

    Occasionally there is a need for an empty pattern when building tables. An empty pattern matches everything. There is a predefined @@ -567,9 +567,9 @@ to indicate an empty pattern.

    -7.4.7. Advanced Constraints

    +7.4.7. Advanced Constraints

    -A constraint does not have to be of the form “field = constant”, +A constraint does not have to be of the form “field = constant”, although this is almost always what is needed. In certain situations, it may be more convenient to use a different kind of constraint. Special care should be taken when designing these @@ -584,7 +584,7 @@ of parsing states for a single constraint. A constraint can actually be built out of arbitrary expressions. These pattern expressions are more commonly used in disassembly actions and are defined in -Section 7.5.2, “General Actions and Pattern Expressions”, but they can also be used in +Section 7.5.2, “General Actions and Pattern Expressions”, but they can also be used in constraints. So in general, a constraint is any equation where the left-hand side is a single family symbol, the right-hand side is an arbitrary pattern expression, and the constraint operator is one of @@ -592,7 +592,7 @@ the following:

    -

    Table 3. Constraint Operators

    +

    Table 3. Constraint Operators

    @@ -645,7 +645,7 @@ feature of clr from xor, are equal. The easiest way to specify this special case is with the general constraint, -“r2 = r1”, as in the second +“r2 = r1”, as in the second line of the example. The SLEIGH compiler will implement this by enumerating all the cases where r2 equals r1, creating as many states as there are @@ -655,7 +655,7 @@ registers. But the specification itself, at least, remains compact.

    -7.5. Disassembly Actions Section

    +7.5. Disassembly Actions Section

    After the bit pattern section, there can optionally be a section for doing dynamic calculations, which must be between square brackets. For @@ -671,11 +671,11 @@ time, usually for part of the disassembly display.

    -7.5.1. Relative Branches

    +7.5.1. Relative Branches

    The canonical example of an action at disassembly time is a branch relocation. A jump instruction encodes the address of where it jumps -to as a relative offset to the instruction’s address, for +to as a relative offset to the instruction’s address, for instance. But when we display the assembly, we want to show the absolute address of the jump destination. The correct way to specify this is to reserve an identifier in the display section which @@ -698,7 +698,7 @@ defined in the action section as the integer obtained by adding a multiple of simm8 to inst_next, a symbol predefined to be equal to the address of the following instruction (see -Section 5.2, “Predefined Symbols”). Now reloc +Section 5.2, “Predefined Symbols”). Now reloc is a specific symbol with both semantic and display meaning equal to the desired absolute address. This address is calculated separately, at disassembly time, for every instruction that this constructor @@ -707,7 +707,7 @@ matches.
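The relocation arithmetic described above can be modeled with a minimal Python sketch. The function names, the `scale` parameter, and the treatment of simm8 as a signed 8-bit field are assumptions made for illustration; the text only says that a multiple of simm8 is added to inst_next.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def resolve_reloc(inst_next: int, simm8_raw: int, scale: int = 4) -> int:
    """Model reloc = inst_next + scale * simm8.

    `scale` is a hypothetical multiplier; a real specification would use
    whatever multiple the processor's branch encoding requires.
    """
    return inst_next + scale * sign_extend(simm8_raw, 8)
```

In this model the result plays the role of reloc: one integer, computed once per matched instruction, that would serve as both the displayed address and the branch destination.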

    -7.5.2. General Actions and Pattern Expressions

    +7.5.2. General Actions and Pattern Expressions

    In general, the disassembly actions are encoded as a sequence of assignments separated by semicolons. The left-hand side of each @@ -719,7 +719,7 @@ is built up out of the following typical operators:

    -

    Table 4. Pattern Expression Operators

    +

    Table 4. Pattern Expression Operators

    @@ -816,7 +816,7 @@ local identifier before it can be used.

    The left-hand side of an assignment statement can be a context -variable (see Section 6.4, “Context Variables”). An +variable (see Section 6.4, “Context Variables”). An assignment to such a variable changes the context in which the current instruction is being disassembled and can potentially have a drastic effect on how the rest of the instruction is disassembled. An @@ -828,26 +828,26 @@ more globalset directives, which cause changes to context variables to become more permanent. This directive is distinct from the operators in a pattern expression and must be invoked as a separate statement. See -Section 8, “Using Context”, for a discussion of how to +Section 8, “Using Context”, for a discussion of how to effectively use context variables and -Section 8.3, “Global Context Change”, for details of +Section 8.3, “Global Context Change”, for details of the globalset directive.

    Note that there are two syntax forms for the logical operators in a pattern expression. When an expression is used as part of a -constraint, the “$and” and “$or” forms of the operators must be used +constraint, the “$and” and “$or” forms of the operators must be used in order to distinguish the bitwise operators from the special pattern -combining operators, ‘&’ and ‘|’ (as described in -Section 7.4.2, “The '&' and '|' Operators”). However inside the square braces -of the disassembly action section, ‘&’ and ‘|’ are interpreted as +combining operators, ‘&’ and ‘|’ (as described in +Section 7.4.2, “The '&' and '|' Operators”). However inside the square braces +of the disassembly action section, ‘&’ and ‘|’ are interpreted as the usual logical operators.

    -7.6. The With Block

    +7.6. The With Block

    To avoid tedious repetition and to ease the maintenance of specifications already having many, many constructors and tables, the with @@ -867,9 +867,9 @@ with op1 : mode=1 [ mode=2; ] { In the example, both constructors are added to the table identified by op1. Both require the context field mode to be equal to 1. The listed constraints take the -form described in Section 7.4, “The Bit Pattern Section”, and they are joined to -those given in the constructor statement as if prepended using ‘&’. Similarly, -the actions take the form described in Section 7.5, “Disassembly Actions Section” +form described in Section 7.4, “The Bit Pattern Section”, and they are joined to +those given in the constructor statement as if prepended using ‘&’. Similarly, +the actions take the form described in Section 7.5, “Disassembly Actions Section” and are prepended to the actions given in the constructor statement. Prepending the actions allows the statement to override actions in the with block. Both technically occur, but only the last one has a noticeable effect. The above @@ -894,12 +894,12 @@ yet exist, the table is created immediately. Inside a with block that has a table header, a nested with block may specify the instruction table by name, as in "with instruction : {...}". Inside such a block, the rule regarding mnemonic literals is restored (see -Section 7.3.1, “Mnemonic”). +Section 7.3.1, “Mnemonic”).

    -7.7. The Semantic Section

    +7.7. The Semantic Section

    The final section of a constructor definition is the semantic section. This is a description of how the processor would manipulate @@ -916,8 +916,8 @@ were varnodes.

    The semantic section for one constructor is surrounded by curly braces -‘{’ and ‘}’ and consists of zero or more statements separated by -semicolons ‘;’. Most statements are built up out of C-like syntax, +‘{’ and ‘}’ and consists of zero or more statements separated by +semicolons ‘;’. Most statements are built up out of C-like syntax, where the variables are the symbols visible to the constructor. There is a direct correspondence between each type of operator used in the statements and a p-code operation. The SLEIGH compiler generates @@ -939,29 +939,29 @@ varnode is r1.

    -7.7.1. Expressions

    +7.7.1. Expressions

    Expressions are built out of symbols and the binary and unary -operators listed in Table 5, “Semantic Expression Operators and Syntax” in the +operators listed in Table 5, “Semantic Expression Operators and Syntax” in the Appendix. All expressions evaluate to an integer, floating point, or boolean value, depending on the final operation of the expression. The value is then used depending on the kind of statement. Most of the operators require that their input and output varnodes all be the same -size (see Section 7.7.3, “Varnode Sizes”). The operators all +size (see Section 7.7.3, “Varnode Sizes”). The operators all have a precedence, which is used by the SLEIGH compiler to determine the ordering of the final p-code operations. Parentheses can be used within expressions to affect this order.

    -7.7.1.1. Arithmetic, Logical and Boolean Operators
    +7.7.1.1. Arithmetic, Logical and Boolean Operators

    For the most part these operators should be familiar to software developers. The only real differences arise from the fact that varnodes are typeless. So for instance, there have to be separate -operators to distinguish between dividing unsigned numbers ‘/’, -dividing signed numbers ‘s/’, and dividing floating point numbers -‘f/’. +operators to distinguish between dividing unsigned numbers ‘/’, +dividing signed numbers ‘s/’, and dividing floating point numbers +‘f/’.

    Carry, borrow, and overflow calculations are implemented with separate @@ -974,11 +974,11 @@ some people in this form (see the descriptions in the Appendix).

    -7.7.1.2. The '*' Operator
    +7.7.1.2. The '*' Operator

    The dereference operator, which generates LOAD operations (and STORE operations), has slightly -unfamiliar syntax. The ‘*’ operator, as is usual in many programming +unfamiliar syntax. The ‘*’ operator, as is usual in many programming languages, indicates that the affected variable is a pointer and that the expression is dereferencing the data being pointed to. Unlike most languages, in SLEIGH, it is not immediately @@ -989,16 +989,16 @@ the default space, as labeled in the defi of one of the address spaces with the default attribute. If that is not the space desired, the default can be overridden by putting the -identifier for the space in square brackets immediately after the ‘*’. +identifier for the space in square brackets immediately after the ‘*’.

    It is also frequently not clear what the size of the dereferenced data is because the pointer variable is typeless. The SLEIGH compiler can frequently deduce what the size must be by looking at the operation in the context of the entire statement (see -Section 7.7.3, “Varnode Sizes”). But in some situations, this +Section 7.7.3, “Varnode Sizes”). But in some situations, this may not be possible, so there is a way to specify the size -explicitly. The operator can be followed by a colon ‘:’ and an integer +explicitly. The operator can be followed by a colon ‘:’ and an integer indicating the number of bytes being dereferenced. This can be used with or without the address space override. We give an example of each kind of override in the example below. @@ -1017,7 +1017,7 @@ set to something other than one.

    -7.7.1.3. Extension
    +7.7.1.3. Extension

    Most processors have instructions that extend small values into big values, and many instructions do these minor data manipulations @@ -1039,15 +1039,15 @@ the sext operator.

    -7.7.1.4. Truncation
    +7.7.1.4. Truncation

    There are two forms of syntax indicating a truncation of the input -varnode. In one the varnode is followed by a colon ‘:’ and an integer +varnode. In one the varnode is followed by a colon ‘:’ and an integer indicating the number of bytes to copy into the output, starting with the least significant byte. In the second form, the varnode is followed by an integer, surrounded by parentheses, indicating the number of least significant bytes to truncate from the input. This -second form doesn’t directly specify the size of the output, which +second form doesn’t directly specify the size of the output, which must be inferred from context.
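The two truncation forms can be contrasted with a small Python sketch that treats a varnode as an unsigned integer. The function names are hypothetical, and the explicit `out_bytes` argument of the second function stands in for the output size that SLEIGH would infer from context.

```python
def truncate_keep_low(value: int, out_bytes: int) -> int:
    """Model the colon form: keep the `out_bytes` least significant bytes."""
    return value & ((1 << (8 * out_bytes)) - 1)


def truncate_drop_low(value: int, drop_bytes: int, out_bytes: int) -> int:
    """Model the parenthesis form: discard `drop_bytes` least significant bytes.

    `out_bytes` is supplied explicitly here; in SLEIGH it comes from context.
    """
    return (value >> (8 * drop_bytes)) & ((1 << (8 * out_bytes)) - 1)
```

Applied to the 4-byte value 0x11223344, the first function with a size of 2 yields the low half and the second with a drop count of 2 yields the high half.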

    @@ -1067,7 +1067,7 @@ half and hi receives the most significant
     
    -7.7.1.5. Bit Range Operator
    +7.7.1.5. Bit Range Operator

    A specific subrange of bits within a varnode can be explicitly referenced. Depending on the range, this may amount to just a @@ -1082,7 +1082,7 @@ restricted to byte alignment.

    A varnode, r2 in this example, is immediately -followed by square brackets ‘[’ and ‘]’ indicating a bit range, and +followed by square brackets ‘[’ and ‘]’ indicating a bit range, and within the brackets, there are two parameters separated by a comma. The first parameter is an integer indicating the least significant bit of the resulting bit range. The bits of the varnode @@ -1101,7 +1101,7 @@ with memory down to byte resolution. The bit range operator will generate some combination of INT_RIGHT, INT_AND, and SUBPIECE to simulate the extraction of -smaller or unaligned pieces. The “r2[3,1]” from the example generates +smaller or unaligned pieces. The “r2[3,1]” from the example generates the following p-code, for instance.

    @@ -1119,12 +1119,12 @@ these are automatically set to zero.
     

    This operator can also be used on the left-hand side of assignments -with similar behavior and caveats (see Section 7.7.2.8, “Bit Range Assignments”). +with similar behavior and caveats (see Section 7.7.2.8, “Bit Range Assignments”).
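As a rough model of bit range extraction, the Python sketch below shifts the value right and masks off the requested number of bits, which loosely mirrors the INT_RIGHT and INT_AND steps mentioned above. The function name is hypothetical, and the sketch does not reproduce the exact p-code sequence or the SUBPIECE size adjustment.

```python
def bit_range(value: int, lsb: int, nbits: int) -> int:
    """Extract `nbits` bits of `value` starting at bit `lsb` (bit 0 is least significant)."""
    shifted = value >> lsb               # corresponds loosely to INT_RIGHT
    return shifted & ((1 << nbits) - 1)  # corresponds loosely to INT_AND
```

For the r2[3,1] case, `bit_range(r2, 3, 1)` returns the single bit at position 3, with all higher bits of the result set to zero.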

    -7.7.1.6. Address-of Operator
    +7.7.1.6. Address-of Operator

    There is an address-of operator for generating the address offset of a selected varnode as an integer value for use @@ -1143,8 +1143,8 @@ specification may produce unexpected results.

    -The ‘&’ operator in front of a symbol invokes this function. The -ampersand can also be followed by a colon ‘:’ and an integer +The ‘&’ operator in front of a symbol invokes this function. The +ampersand can also be followed by a colon ‘:’ and an integer explicitly indicating the size of the resulting constant as a varnode.

    @@ -1164,12 +1164,12 @@ following r1 is copied
     into r1, even though it is not mentioned directly
     in the instruction. Notice that the address-of operator only produces
     the offset portion of the address, and to copy the desired value, the
    -‘*’ operator must have a register space override.
    +‘*’ operator must have a register space override.
     

    -7.7.1.7. Managed Code Operations
    +7.7.1.7. Managed Code Operations

    SLEIGH provides basic support for instructions where encoding and context don't provide a complete description of the semantics. This is the case @@ -1210,7 +1210,7 @@ of objects to allocate. It returns a pointer to the allocated object.

    -7.7.1.8. User-Defined Operations
    +7.7.1.8. User-Defined Operations

    Any identifier that has been defined as a new p-code operation, using the define pcodeop statement, can be @@ -1231,15 +1231,15 @@ define pcodeop arctan;

    -7.7.2. Statements

    +7.7.2. Statements

    We describe the types of semantic statements that are allowed in SLEIGH.

    -7.7.2.1. Assignment Statements and Temporary Variables
    +7.7.2.1. Assignment Statements and Temporary Variables

    -Of course SLEIGH allows assignment statements with the ‘=’ operator, +Of course SLEIGH allows assignment statements with the ‘=’ operator, where the right-hand side is an arbitrary expression and the left-hand side is the varnode being assigned. The assigned varnode can be any specific symbol in the scope of the constructor, either a global @@ -1256,8 +1256,8 @@ result of the expression. The new symbol becomes part of the local scope of the constructor, and can be referred to in the following semantic statements. The size of the new varnode is calculated by examining the statement in context (see -Section 7.7.3, “Varnode Sizes”). It is also possible to -explicitly indicate the size by using the colon ‘:’ operator followed +Section 7.7.3, “Varnode Sizes”). It is also possible to +explicitly indicate the size by using the colon ‘:’ operator followed by an integer size in bytes. The following examples demonstrate the temporary variable tmp being defined using both forms. @@ -1305,11 +1305,11 @@ and may be enforced in future compiler versions.

    -7.7.2.2. Storage Statements
    +7.7.2.2. Storage Statements

    SLEIGH supports fairly standard storage statement syntax to complement the load operator. The left-hand side of an -assignment statement uses the ‘*’ operator to indicate a dynamic +assignment statement uses the ‘*’ operator to indicate a dynamic storage location, followed by an arbitrary expression to calculate the location. This syntax of course generates the p-code STORE operator as the final step of the @@ -1323,10 +1323,10 @@ statement.

    -The same size and address space considerations that apply to the ‘*’ +The same size and address space considerations that apply to the ‘*’ operator when it is used as a load operator also apply when it is used as a store operator, see -Section 7.7.1.2, “The '*' Operator”. Unless explicit modifiers are +Section 7.7.1.2, “The '*' Operator”. Unless explicit modifiers are given, the default address space is assumed as the storage destination, and the size of the data being stored is calculated from context. Keep in mind that the address represented by the pointer is @@ -1336,9 +1336,9 @@ attribute is set to something other than one.

    -7.7.2.3. Exports
    +7.7.2.3. Exports

    -The semantic section doesn’t just specify how to generate p-code for a +The semantic section doesn’t just specify how to generate p-code for a constructor. Except for those constructors in the root table, this section also associates a semantic meaning to the table symbol the constructor is part of, allowing the table to be used as an operand in @@ -1374,32 +1374,32 @@ a reference to the varnode being exported copy of the value. If the table symbol is written to, as the left-hand side of an assignment statement, in some other constructor, the exported varnode is affected. A constant can be exported if its size -as a varnode is given explicitly with the ‘:’ operator. +as a varnode is given explicitly with the ‘:’ operator.

    It is not legal to put a full expression in an export statement; any expression -must appear in an earlier statement. However, a single ‘&’ +must appear in an earlier statement. However, a single ‘&’ operator is allowed as part of the statement and it behaves as it would in a normal expression (see -Section 7.7.1.6, “Address-of Operator”). It causes the address of the +Section 7.7.1.6, “Address-of Operator”). It causes the address of the varnode being modified to be exported as an integer constant.

    -7.7.2.4. Dynamic References
    +7.7.2.4. Dynamic References

    The only other operator allowed as part of -an export statement, is the ‘*’ +an export statement, is the ‘*’ operator. The semantic meaning of this operator is the same as if it were used in an expression (see -Section 7.7.1.2, “The '*' Operator”), but it is worth examining the +Section 7.7.1.2, “The '*' Operator”), but it is worth examining the effects of this form of export in detail. Bearing in mind that an export statement exports -a reference, using the ‘*’ operator in the +a reference, using the ‘*’ operator in the statement exports a dynamic reference. The -varnode being modified by the ‘*’ is interpreted as a pointer to +varnode being modified by the ‘*’ is interpreted as a pointer to another varnode. It is this varnode being pointed to which is exported, even though the address may be dynamic and cannot be determined at disassembly time. This is not the same as dereferencing @@ -1427,14 +1427,14 @@ calculated from a register reg and a fiel instruction off. The constructor does not export the resulting pointer ea, it exports the location being pointed to by ea. Notice the size of this -location (4) is given explicitly with the ‘:’ modifier. The ‘*’ +location (4) is given explicitly with the ‘:’ modifier. The ‘*’ operator can also be used on constant pointers. In the second example, the constant operand reloc is used as the offset portion of an address into the ram address space. The constant reloc is calculated at disassembly time from the instruction field abs. This is a very common construction for -jump destinations (see Section 7.5.1, “Relative Branches”) but +jump destinations (see Section 7.5.1, “Relative Branches”) but can be used in general. This particular combination of a disassembly time action and a dynamic export is a very general way to construct a family of varnodes. @@ -1447,10 +1447,10 @@ levels.

    -7.7.2.5. Branching Statements
    +7.7.2.5. Branching Statements

    This section discusses statements that generate p-code branching -operations. These are listed in Table 7, “Branching Statements”, in the Appendix. +operations. These are listed in Table 7, “Branching Statements”, in the Appendix.

    There are six forms covering the gamut of typical assembly language @@ -1492,12 +1492,12 @@ destination is the first p-code operation for the (translated) machine instruction at that address. For most cases, this is the only kind of branching needed. The rarer case of p-code relative branching is discussed in the following section -(Section 7.7.2.6, “P-code Relative Branching”), but for the remainder of +(Section 7.7.2.6, “P-code Relative Branching”), but for the remainder of this section, we assume the destination is ultimately given as an address.

    -There are two ways to specify a branching operation’s destination +There are two ways to specify a branching operation’s destination address: directly and indirectly. Where a direct address is needed, as for the BRANCH, CBRANCH, and CALL instructions, the specification can give @@ -1613,7 +1613,7 @@ must match the size of the destination space.

    -7.7.2.6. P-code Relative Branching
    +7.7.2.6. P-code Relative Branching

    In some cases, the semantics of an instruction may require branching within the semantics of a single @@ -1623,7 +1623,7 @@ branching. Individual p-code operations can be identified by a label, and this label can be used as the destination specifier, after the goto keyword. A label, within the semantic section, is -any identifier surrounded by the ‘<’ and ‘>’ characters. If this +any identifier surrounded by the ‘<’ and ‘>’ characters. If this construction occurs at the beginning of a statement, we say the label is defined, and that identifier is now associated with the first p-code operation corresponding to the following @@ -1649,7 +1649,7 @@ defined. Multiple references to the same label are allowed.

    -In the example above, the string “loopstart” is the label identifier +In the example above, the string “loopstart” is the label identifier which appears twice; once at the point where the label is defined at the top of the loop, after the initialization, and once as a reference where the conditional branch is made for the loop. @@ -1665,10 +1665,10 @@ instruction is not possible.

    Internally, branches to labels are encoded as a relative index. Each -p-code operation is assigned an index corresponding to the operation’s +p-code operation is assigned an index corresponding to the operation’s position within the entire translation of the instruction. Then the branch can be expressed as a relative offset between the branch -operation’s index and the destination operation’s index. The SLEIGH +operation’s index and the destination operation’s index. The SLEIGH compiler encodes this offset as a constant varnode that is used as input to the BRANCH, CBRANCH, @@ -1677,7 +1677,7 @@ or CALL operation.
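The index-based encoding of label branches can be sketched in Python as a two-pass resolution. The tuple representation of statements and the sign convention (destination index minus branch index) are assumptions made for this example, not a description of the SLEIGH compiler's internals.

```python
def resolve_labels(statements):
    """Replace label references with relative p-code indices.

    `statements` is a list of (kind, arg) tuples where kind is "label",
    "branch", or any other string for an ordinary operation.
    """
    labels = {}
    ops = []
    for kind, arg in statements:
        if kind == "label":
            # A label names the next operation to be emitted.
            labels[arg] = len(ops)
        else:
            ops.append((kind, arg))

    resolved = []
    for index, (kind, arg) in enumerate(ops):
        if kind == "branch":
            # Assumed convention: offset = destination index - branch index.
            resolved.append((kind, labels[arg] - index))
        else:
            resolved.append((kind, arg))
    return resolved
```

A loop whose label sits on the second operation and whose branch is the fourth operation resolves to an offset of -2 under this convention.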

    -7.7.2.7. Skip Instruction Branching
    +7.7.2.7. Skip Instruction Branching

    Many processors have a conditional-skip-instruction which must branch over the next instruction based upon some condition. The inst_next2 symbol has been provided for @@ -1698,15 +1698,15 @@ instruction when the condition is satisfied.

    -7.7.2.8. Bit Range Assignments
    +7.7.2.8. Bit Range Assignments

    The bit range operator can appear on the left-hand side of an -assignment. But as with the ‘*’ operator, its meaning is slightly +assignment. But as with the ‘*’ operator, its meaning is slightly different when used on this side. The bit range is specified in square brackets, as before, by giving the integer specifying the least significant bit of the range, followed by the number of bits in the range. In contrast with its use on the right however (see -Section 7.7.1.5, “Bit Range Operator”), the indicated bit range +Section 7.7.1.5, “Bit Range Operator”), the indicated bit range is filled rather than extracted. Bits obtained from evaluating the expression on the right are extracted and spliced into the result at the indicated bit offset. @@ -1729,7 +1729,7 @@ In terms of the rest of the assignment expression, the bit range operator is again assumed to have a size equal to the minimum number of bytes needed to hold the bit range. In particular, in order to satisfy size restrictions (see -Section 7.7.3, “Varnode Sizes”), the right-hand side must +Section 7.7.3, “Varnode Sizes”), the right-hand side must match this size. Furthermore, it is assumed that any extra bits in the right-hand side expression are already set to zero.
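The splice performed by a bit range assignment can be modeled in Python as clearing the target bits and then inserting the new ones. The function name is hypothetical. The sketch also masks the right-hand value defensively, whereas the text above states that SLEIGH assumes any extra bits are already zero.

```python
def assign_bit_range(dest: int, lsb: int, nbits: int, src: int) -> int:
    """Return `dest` with `nbits` bits starting at `lsb` replaced by `src`."""
    mask = (1 << nbits) - 1
    cleared = dest & ~(mask << lsb)        # zero out the target bit range
    return cleared | ((src & mask) << lsb)  # splice in the new bits
```

Bits of `dest` outside the indicated range are left unchanged, which is the key difference from the extracting use of the operator on the right-hand side.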

    @@ -1737,7 +1737,7 @@ right-hand side expression are already set to zero.

    -7.7.3. Varnode Sizes

    +7.7.3. Varnode Sizes

    All statements within the semantic section must be specified up to the point where the sizes of all varnodes are unambiguously @@ -1760,7 +1760,7 @@ The SLEIGH compiler does not make assumptions about the size of a constant variable based on the constant value itself. This is true of values occurring explicitly in the specification and of values that are calculated dynamically in a disassembly action. As described in -Section 7.7.2.1, “Assignment Statements and Temporary Variables”, temporary variables do not +Section 7.7.2.1, “Assignment Statements and Temporary Variables”, temporary variables do not need to have their size given explicitly.

    @@ -1768,7 +1768,7 @@ The SLEIGH compiler can usually fill in the required size by examining these situations in the context of the entire semantic section. Most p-code operations have size restrictions on their inputs and outputs, which when put together can uniquely determine the unspecified -sizes. Referring to Table 5, “Semantic Expression Operators and Syntax” in the +sizes. Referring to Table 5, “Semantic Expression Operators and Syntax” in the Appendix, all arithmetic and logical operations, both integer and floating point, must have inputs and outputs all of the same size. The only exceptions are as follows. The overflow @@ -1784,8 +1784,8 @@ a size of 1 byte.

    The operators without a size constraint are the load and store operators, the extension and truncation operators, and the conversion -operators. As discussed in Section 7.7.1.2, “The '*' Operator”, the -‘*’ operator cannot get size information for the dynamic (pointed-to) +operators. As discussed in Section 7.7.1.2, “The '*' Operator”, the +‘*’ operator cannot get size information for the dynamic (pointed-to) object from the pointer itself. The other operators by definition involve a change of size from input to output.

    @@ -1793,7 +1793,7 @@ involve a change of size from input to output. If the SLEIGH compiler cannot discover the sizes of constants and temporaries, it will report an error stating that it could not resolve variable sizes for that constructor. This can usually be fixed rapidly -by appending the size ‘:’ modifier to either the ‘*’ operator, the +by appending the size ‘:’ modifier to either the ‘*’ operator, the temporary variable definition, or to an explicit integer. Here are three examples of statements that generate a size resolution error, each followed by a variation which corrects the error. @@ -1823,7 +1823,7 @@ each followed by a variation which corrects the error.

    -7.7.4. Unimplemented Semantics

    +7.7.4. Unimplemented Semantics

    The semantic section must be present for every constructor in the specification. But the designer can leave the semantics explicitly @@ -1849,7 +1849,7 @@ nothing.

    -7.8. Tables

    +7.8. Tables

    A single constructor does not form a new specific symbol. The table that the constructor is @@ -1874,13 +1874,13 @@ exploit the similarity to produce an extremely concise description.

    -7.8.1. Matching

    +7.8.1. Matching

    If a table contains exactly one constructor, the meaning of the table as a specific symbol is straightforward. The display meaning of the symbol comes from the display section of the -constructor, and the symbol’s semantic meaning comes from the -constructor’s semantic section. +constructor, and the symbol’s semantic meaning comes from the +constructor’s semantic section.

     mode1: (r1) is addrmode=1 & r1 { export r1; }
    @@ -1890,8 +1890,8 @@ mode1: (r1) is addrmode=1 & r1 { export r1; }
     

    The table symbol in this example is mode1. Assuming this is the only constructor, -the display meaning of the symbol is the literal characters ‘(’, and -‘)’ concatenated with the display meaning of r1, +the display meaning of the symbol is the literal characters ‘(’, and +‘)’ concatenated with the display meaning of r1, presumably a register name that has been attached. The semantic meaning of mode1, because of the export statement, becomes whatever register is matched by @@ -1938,7 +1938,7 @@ symbol.

     zA: r1  is addrmode=3 & r1   { export r1; }
    -zA: “0” is addrmode=3 & r1=0 { export 0:4; } # Special case
    +zA: “0” is addrmode=3 & r1=0 { export 0:4; } # Special case
     

    @@ -1983,7 +1983,7 @@ should generally be avoided.

    -7.8.2. Specific Symbol Trees

    +7.8.2. Specific Symbol Trees

    When the SLEIGH parser analyzes an instruction, it starts with the root symbol instruction, and decides which of the @@ -1993,7 +1993,7 @@ parsing process recurses at this point. Each of the unresolved family symbols is analyzed in the same way to find the matching specific symbol. The matching is accomplished either with a table lookup, as with a field with attached registers, or with the matching algorithm -described in Section 7.8.1, “Matching”. By the end of the +described in Section 7.8.1, “Matching”. By the end of the parsing process, we have a tree of specific symbols representing the parsed instruction. We present a small but complete SLEIGH specification to illustrate this hierarchy. @@ -2049,10 +2049,10 @@ constructor is resolved in turn.

    Two Encodings and the Resulting Specific Symbol Trees
    -

    Figure 1. Two Encodings and the Resulting Specific Symbol Trees

    +

    Figure 1. Two Encodings and the Resulting Specific Symbol Trees


    -In Figure 1, “Two Encodings and the Resulting Specific Symbol Trees”, we can see the breakdown +In Figure 1, “Two Encodings and the Resulting Specific Symbol Trees”, we can see the breakdown of two typical instructions in the example instruction set. For each instruction, we see how the encodings split into the relevant fields and the resulting tree of specific symbols. Each node in the @@ -2066,7 +2066,7 @@ and p-code for these encodings by walking the trees.

    -7.8.2.1. Disassembly Trees
    +7.8.2.1. Disassembly Trees

    If the nodes of each tree are replaced with the display information of the corresponding specific symbol, we see how the disassembly @@ -2074,12 +2074,12 @@ statement is built.

    Two Disassembly Trees
    -

    Figure 2. Two Disassembly Trees

    +

    Figure 2. Two Disassembly Trees


    -Figure 2, “Two Disassembly Trees”, shows the resulting +Figure 2, “Two Disassembly Trees”, shows the resulting disassembly trees corresponding to the specific symbol trees in -Figure 1, “Two Encodings and the Resulting Specific Symbol Trees”. The display information comes +Figure 1, “Two Encodings and the Resulting Specific Symbol Trees”. The display information comes from constructor display sections, the names of attached registers, or the integer interpretation of fields. The identifiers in a constructor display section serve as placeholders for the subtrees below them. By @@ -2089,7 +2089,7 @@ statements corresponding to the original instruction encodings.

    -7.8.2.2. P-code Trees
    +7.8.2.2. P-code Trees

    A similar procedure produces the resulting p-code translation of the instruction. If each node in the specific symbol tree is replaced with @@ -2097,10 +2097,10 @@ the corresponding p-code, we see how the final translation is built.

    Two P-code Trees
    -

    Figure 3. Two P-code Trees

    +

    Figure 3. Two P-code Trees


    -Figure 3, “Two P-code Trees” lists the final p-code +Figure 3, “Two P-code Trees” lists the final p-code translation for our example instructions and shows the trees from which the translation is derived. Symbol names within the p-code for a particular node, as with the disassembly tree, are placeholders for @@ -2108,7 +2108,7 @@ the subtree below them. The final translation is put together by concatenating the p-code from each node, traversing the nodes in a depth-first order. Thus the p-code of a child tends to come before the p-code of the parent node (but see -Section 7.9, “P-code Macros”). Placeholders are filled in with the +Section 7.9, “P-code Macros”). Placeholders are filled in with the appropriate varnode, as determined by the export statement of the root of the corresponding subtree.
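The depth-first concatenation described above can be illustrated with a short Python sketch. The `Node` class and the string form of the p-code are hypothetical, and the sketch leaves out placeholder substitution, macros, and build directives. It only shows why a child's p-code tends to precede its parent's.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One specific symbol in the tree, with its own p-code and operand subtrees."""
    pcode: list
    children: list = field(default_factory=list)


def flatten(node: Node) -> list:
    """Concatenate p-code depth-first, emitting each child before its parent."""
    out = []
    for child in node.children:
        out.extend(flatten(child))
    out.extend(node.pcode)
    return out
```

Calling `flatten` on the root node returns the operations in the order they would appear in the final translation under these simplifying assumptions.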

    @@ -2117,11 +2117,11 @@ of the corresponding subtree.

    -7.9. P-code Macros

    +7.9. P-code Macros

    SLEIGH supports a macro facility for encapsulating semantic actions. The syntax, in effect, allows the designer to define p-code -subroutines which can be invoked as part of a constructor’s semantic +subroutines which can be invoked as part of a constructor’s semantic action. The subroutine is expanded automatically at compile time.

    @@ -2131,8 +2131,8 @@ anywhere in the file before its first use. This is followed by the global identifier for the new macro and a parameter list, comma separated and in parentheses. The body of the definition comes next, surrounded by curly braces. The body is a sequence of semantic -statements with the same syntax as a constructor’s semantic -section. The identifiers in the macro’s parameter list are local in +statements with the same syntax as a constructor’s semantic +section. The identifiers in the macro’s parameter list are local in scope. The macro can refer to these and any global specific symbol.

    @@ -2168,7 +2168,7 @@ directive however should not be used in a macro.
     

    -7.10. Build Directives

    +7.10. Build Directives

    Because the nodes of a specific symbol tree are traversed in a depth-first order, the p-code for a child node in general comes before @@ -2180,10 +2180,10 @@ used to affect these issues in the rare cases where it is necessary. The build directive occurs as another form of statement in the semantic section of a constructor. The keyword build is -followed by one of the constructor’s operand identifiers. Then, -instead of filling in the operand’s associated p-code based on an +followed by one of the constructor’s operand identifiers. Then, +instead of filling in the operand’s associated p-code based on an arbitrary traversal of the symbol tree, the directive specifies that -the operand’s p-code must occur at that point in the p-code for the +the operand’s p-code must occur at that point in the p-code for the parent constructor.

    @@ -2199,7 +2199,7 @@ efficient to treat the condition bit which distinguishes the variants as a special operand.

    -cc: “c” is condition=1 { if (flag==1) goto inst_next; }
    +cc: “c” is condition=1 { if (flag==1) goto inst_next; }
     cc:     is condition=0 { }
     
     :and^cc  r1,r2 is opcode=0x67 & cc & r1 & r2 {
    @@ -2210,7 +2210,7 @@ cc:     is condition=0 { }
     

    -In this example, the conditional variant is distinguished by a ‘c’ +In this example, the conditional variant is distinguished by a ‘c’ appended to the assembly mnemonic. The cc operand performs the conditional side-effect, checking a flag in one case, or doing nothing in the other. The two forms of the instruction can now @@ -2223,7 +2223,7 @@ normal action of the instruction.

    -7.11. Delay Slot Directives

    +7.11. Delay Slot Directives

    For processors with a pipe-lined architecture, multiple instructions are typically executing simultaneously. This can lead to processor @@ -2268,7 +2268,7 @@ Because the delayslot directive combines two or more instructions into one, the meaning of the symbols inst_next and inst_next2 becomes ambiguous. It is not -clear anymore what exactly the “next instruction” is. SLEIGH uses the +clear anymore what exactly the “next instruction” is. SLEIGH uses the following conventions for interpreting an inst_next symbol. If it is used in the semantic section, the symbol refers to the address of the instruction @@ -2289,15 +2289,15 @@ when computing the value of inst_next2. - - + + - + - +
    -Prev   Next +Prev   Next
    6. Tokens and Fields 6. Tokens and Fields  Home 8. Using Context 8. Using Context
    diff --git a/GhidraDocs/languages/html/sleigh_context.html b/GhidraDocs/languages/html/sleigh_context.html index a6da400f94..0239994cde 100644 --- a/GhidraDocs/languages/html/sleigh_context.html +++ b/GhidraDocs/languages/html/sleigh_context.html @@ -1,24 +1,24 @@ - -8. Using Context + +8. Using Context - + - - + +

    -8. Using Context

    +8. Using Context

    For most practical specifications, the disassembly and semantic meaning of an instruction can be determined by looking only at the @@ -77,7 +77,7 @@ necessary.

    SLEIGH solves these problems by introducing context variables. The syntax for defining these symbols was -described in Section 6.4, “Context Variables”. As mentioned +described in Section 6.4, “Context Variables”. As mentioned there, the easiest and most common way to use a context variable is as just another field to use in our bit patterns. It gives us the extra information we need to distinguish between different instructions @@ -85,7 +85,7 @@ whose encodings are otherwise the same.

    -8.1. Basic Use of Context Variables

    +8.1. Basic Use of Context Variables

    Suppose a processor supports the use of two different sets of registers in its main addressing mode, based on the setting of a @@ -149,12 +149,12 @@ although see the following sections.

    -8.2. Local Context Change

    +8.2. Local Context Change

    SLEIGH can make direct modifications to context variables through statements in the disassembly action section of a constructor. The left-hand side of an assignment statement in this section can be a context variable, -see Section 7.5.2, “General Actions and Pattern Expressions”. Because the result of this +see Section 7.5.2, “General Actions and Pattern Expressions”. Because the result of this assignment is calculated in the middle of the instruction disassembly, the change in value of the context variable can potentially affect any remaining parsing for that instruction. A modal variable is being @@ -193,7 +193,7 @@ use mode, its value will have reverted to original global state. The same holds for any context variable modified with this syntax. If an instruction needs to permanently modify the state of a context variable, the designer must use -constructions described in Section 8.3, “Global Context Change”. +constructions described in Section 8.3, “Global Context Change”.

    Clearly, the behavior of the above example could be easily replicated @@ -219,7 +219,7 @@ by build directives.

    -8.3. Global Context Change

    +8.3. Global Context Change

    It is possible for an instruction to attempt a permanent change to a context variable, which would then affect the parsing of other @@ -261,7 +261,7 @@ select r registers via rreg1, and smode sets mode to 1 in order to select s registers. As is described in -Section 8.2, “Local Context Change”, these assignments by themselves +Section 8.2, “Local Context Change”, these assignments by themselves cause only a local context change. However, the subsequent globalset directives make the change persist outside of the instructions @@ -276,7 +276,7 @@ of mode begins at the next address.

    -8.3.1. Context Flow

    +8.3.1. Context Flow

    A global change to context that affects instruction decoding is typically open-ended. I.e. once the mode switching instruction is executed, a permanent change @@ -290,7 +290,7 @@ is encountered.

    Flow following behavior can be overridden by adding the noflow -attribute to the definition of the context field. (See Section 6.4, “Context Variables”) +attribute to the definition of the context field. (See Section 6.4, “Context Variables”) In this case, a globalset directive only affects the context of a single instruction at the specified address. Subsequent instructions retain their original context. This can be useful in a variety of situations but is typically @@ -348,15 +348,15 @@ end and what to do if there are conflicts. - - + + - + - +
    -Prev   Next +Prev   Next
    7. Constructors 7. Constructors  Home 9. P-code Tables 9. P-code Tables
    diff --git a/GhidraDocs/languages/html/sleigh_definitions.html b/GhidraDocs/languages/html/sleigh_definitions.html index 43a2179522..58fecfb631 100644 --- a/GhidraDocs/languages/html/sleigh_definitions.html +++ b/GhidraDocs/languages/html/sleigh_definitions.html @@ -1,24 +1,24 @@ - -4. Basic Definitions + +4. Basic Definitions - + - - + +

    -4. Basic Definitions

    +4. Basic Definitions

    SLEIGH files must start with all the definitions needed by the rest of the specification. All definition statements start with the keyword -define and end with a semicolon ‘;’. +define and end with a semicolon ‘;’.

    -4.1. Endianess Definition

    +4.1. Endianess Definition

    The first definition in any SLEIGH specification must be for endianess. Either

    @@ -46,7 +46,7 @@ define endian=little; This defines how the processor interprets contiguous sequences of bytes as integers or other values and globally affects values across all address spaces. It also affects how integer fields -within an instruction are interpreted, (see Section 6.1, “Defining Tokens and Fields”), +within an instruction are interpreted, (see Section 6.1, “Defining Tokens and Fields”), although it is possible to override this setting in the rare case that endianess is different for data versus instruction encoding. The specification designer generally only needs to worry about @@ -56,7 +56,7 @@ otherwise the specification language hides endianess issues.

    -4.2. Alignment Definition

    +4.2. Alignment Definition

    An alignment definition looks like

    @@ -73,7 +73,7 @@ instruction as an error.

    -4.3. Space Definitions

    +4.3. Space Definitions

    The definition of an address space looks like

    @@ -115,7 +115,7 @@ and store from dynamic pointers into the space.

    A space of type register_space is -intended to model the processor’s general-purpose registers. In terms +intended to model the processor’s general-purpose registers. In terms of accessing and manipulating data within the space, SLEIGH and p-code make no distinction between the type ram_space or the @@ -157,8 +157,8 @@ At least one space needs to be labeled with the default attribute. This should be the space that the processor accesses with its main address bus. In terms of the rest of the specification file, this sets the default -space referred to by the ‘*’ operator (see -Section 7.7.1.2, “The '*' Operator”). It also has meaning to +space referred to by the ‘*’ operator (see +Section 7.7.1.2, “The '*' Operator”). It also has meaning to GHIDRA.

    @@ -184,7 +184,7 @@ bits).

    -4.4. Naming Registers

    +4.4. Naming Registers

    The general purpose registers of the processors can be named with the following define syntax: @@ -194,8 +194,8 @@ define spacename offset=stringlist is either a single string or a white -space separated list of strings in square brackets ‘[’ and ‘]’. A -string of just “_” indicates a skip in the sequence for that +space separated list of strings in square brackets ‘[’ and ‘]’. A +string of just “_” indicates a skip in the sequence for that definition. The offset corresponding to that position in the list of names will not have a varnode defined at it.

    @@ -228,7 +228,7 @@ define register offset=0 size=1

    -4.5. Bit Range Registers

    +4.5. Bit Range Registers

    Many processors define registers that either consist of a single bit or otherwise don't use an integral number of bytes. A recurring @@ -245,7 +245,7 @@ models because the smallest object they can manipulate directly is a byte. In order to manipulate single bits, p-code must use a combination of bitwise logical, extension, and truncation operations. So a register defined as a bit range is not really a -varnode as described in Section 1.2, “Varnodes”, but is +varnode as described in Section 1.2, “Varnodes”, but is really just a signal to the SLEIGH compiler to fill in the proper operators to simulate the bit manipulation. Using this feature may greatly increase the complexity of the compiled specification with @@ -265,7 +265,7 @@ register. In this example, statusreg is d first as a 4 byte register, and the bit registers themselves are built by the following define bitrange statement. A single bit register definition consists of an identifier -for the register, followed by ‘=’, then the name of the register +for the register, followed by ‘=’, then the name of the register containing the bits, and finally a pair of numbers in square brackets. The first number indicates the lowest significant bit in the containing register of the bit range, where bit 0 is the least @@ -282,11 +282,11 @@ bit of statusreg respectively.

    The syntax for defining a new bit register is consistent with the pseudo bit range operator, described in -Section 7.7.1.5, “Bit Range Operator”, and the resulting symbol +Section 7.7.1.5, “Bit Range Operator”, and the resulting symbol is really just a placeholder for this operator. Whenever SLEIGH sees this symbol it generates p-code precisely as if the designer had used the bit range operator -instead. Section 7.7.1.5, “Bit Range Operator”, provides some +instead. Section 7.7.1.5, “Bit Range Operator”, provides some additional details about how p-code is generated, which apply to the use of bit range registers.
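
    The shift, mask, and truncation work that p-code must perform to simulate a bit range can be illustrated with a short Python sketch. The function names and the (lsb, width) parameters are assumptions chosen for this illustration; they do not reproduce the p-code that the SLEIGH compiler actually emits.

```python
def read_bitrange(container: int, lsb: int, width: int) -> int:
    """Extract `width` bits starting at bit `lsb` (bit 0 is least significant)."""
    mask = (1 << width) - 1
    return (container >> lsb) & mask


def write_bitrange(container: int, lsb: int, width: int, value: int,
                   container_bytes: int = 4) -> int:
    """Return `container` with the selected bits replaced by `value`."""
    mask = ((1 << width) - 1) << lsb
    full = (1 << (8 * container_bytes)) - 1     # truncate to the register size
    return ((container & ~mask) | ((value << lsb) & mask)) & full


# A hypothetical 4-byte status register with single-bit flags at bits 0 and 3.
statusreg = 0b1000
bit0 = read_bitrange(statusreg, 0, 1)
bit3 = read_bitrange(statusreg, 3, 1)
updated = write_bitrange(statusreg, 0, 1, 1)
```

    Even a single-bit write needs a mask, a shift, and a merge with the untouched bits, which is why heavy use of bit range registers can inflate the compiled specification.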

    @@ -299,14 +299,14 @@ used as an alternate syntax for defining overlapping registers.

    -4.6. User-Defined Operations

    +4.6. User-Defined Operations

    The specification designer can define new p-code operations using a define pcodeop statement. This statement automatically reserves an internal form for the new p-code operation and associates an identifier with it. This identifier can then be used in semantic expressions (see -Section 7.7.1.8, “User-Defined Operations”). The following example defines a +Section 7.7.1.8, “User-Defined Operations”). The following example defines a new p-code operation arctan.

    @@ -338,15 +338,15 @@ actions that are too esoteric or too complicated to implement.
     
    -
    -
    +
    +
    -
    +
    -
    +
    -Prev   Next +Prev   Next
    3. Preprocessing 3. Preprocessing  Home 5. Introduction to Symbols 5. Introduction to Symbols
    diff --git a/GhidraDocs/languages/html/sleigh_layout.html b/GhidraDocs/languages/html/sleigh_layout.html index 878e71be69..2a1497c89f 100644 --- a/GhidraDocs/languages/html/sleigh_layout.html +++ b/GhidraDocs/languages/html/sleigh_layout.html @@ -1,24 +1,24 @@ - -2. Basic Specification Layout + +2. Basic Specification Layout - + - +

    -2. Basic Specification Layout

    +2. Basic Specification Layout

    A SLEIGH specification is typically contained in a single file, -although see Section 3.1, “Including Files”. The file must +although see Section 3.1, “Including Files”. The file must follow a specific format as parsed by the SLEIGH compiler. In this section, we list the basic formatting rules for this file as enforced by the compiler.

    -2.1. Comments

    +2.1. Comments

    -Comments start with the ‘#’ character and continue to the end of the +Comments start with the ‘#’ character and continue to the end of the line. Comments can appear anywhere except the display section of a -constructor (see Section 7.3, “The Display Section”) where the ‘#’ character will be +constructor (see Section 7.3, “The Display Section”) where the ‘#’ character will be interpreted as something that should be printed in disassembly.

    -2.2. Identifiers

    +2.2. Identifiers

    Identifiers are made up of letters a-z, capitals A-Z, digits 0-9 and -the characters ‘.’ and ‘_’. An identifier can use these characters in +the characters ‘.’ and ‘_’. An identifier can use these characters in any order and for any length, but it must not start with a digit.
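
    The identifier rule above maps directly onto a regular expression. The Python check below is a sketch of that rule as stated in this section only; it is not taken from the SLEIGH compiler's lexer, and the name is_identifier is made up for the example.

```python
import re

# Letters, digits, '.' and '_' in any order, but no leading digit.
IDENTIFIER = re.compile(r"[A-Za-z._][A-Za-z0-9._]*")


def is_identifier(text: str) -> bool:
    """Return True if `text` satisfies the identifier rule described above."""
    return IDENTIFIER.fullmatch(text) is not None
```

    For example, is_identifier("inst_next") and is_identifier("r0.lo") hold, while is_identifier("0reg") does not because of the leading digit.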

    -2.3. Strings

    +2.3. Strings

    String literals can be used, when specifying names and when specifying how disassembly should be printed, so that special characters are treated as literals. Strings are surrounded by the double quote -character ‘”’ and all characters in between lose their special +character ‘”’ and all characters in between lose their special meaning.

    -2.4. Integers

    +2.4. Integers

    Integers are specified either in a decimal format or in a standard C-style hexadecimal format by prepending the -number with “0x”. Alternately, a binary representation of an integer +number with “0x”. Alternately, a binary representation of an integer can be given by prepending the string of '0' and '1' characters with "0b".

    @@ -82,21 +82,21 @@ can be given by prepending the string of '0' and '1' characters with "0b".
     

    Numbers are treated as unsigned except when used in patterns where they are treated as signed (see -Section 7.4, “The Bit Pattern Section”). The number of bytes used to +Section 7.4, “The Bit Pattern Section”). The number of bytes used to encode the integer when specifying the semantics of an instruction is inferred from other parts of the syntax (see -Section 7.3, “The Display Section”). Otherwise, integers should +Section 7.3, “The Display Section”). Otherwise, integers should be thought of as having arbitrary precision. Currently, SLEIGH stores integers internally with 64 bits of precision.
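
    As a rough illustration of the three integer notations and the 64-bit internal limit, the following Python sketch parses a literal into an unsigned value. It is an assumption-laden model: the function name parse_integer is made up, only lowercase "0x" and "0b" prefixes are accepted, and rejecting values wider than 64 bits is a choice of this sketch, not documented compiler behavior.

```python
import re

_LITERAL = re.compile(r"0x[0-9a-fA-F]+|0b[01]+|[0-9]+")


def parse_integer(text: str) -> int:
    """Parse a decimal, 0x-prefixed hex, or 0b-prefixed binary literal."""
    if _LITERAL.fullmatch(text) is None:
        raise ValueError(f"not an integer literal: {text!r}")
    if text.startswith("0x"):
        value = int(text[2:], 16)
    elif text.startswith("0b"):
        value = int(text[2:], 2)
    else:
        value = int(text, 10)
    if value >= 1 << 64:
        # Assumption: treat anything beyond 64 bits as an error.
        raise ValueError(f"literal does not fit in 64 bits: {text!r}")
    return value
```

    With this sketch, parse_integer("0x67"), parse_integer("0b1100111"), and parse_integer("103") all yield the same value.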

    -2.5. White Space

    +2.5. White Space

    White space characters include space, tab, line-feed, vertical -line-feed, and carriage-return (‘ ‘, ‘\t’, ‘\r’, ‘\v’, -‘\n’). Variations in spacing have no effect on the parsing of the file +line-feed, and carriage-return (‘ ‘, ‘\t’, ‘\r’, ‘\v’, +‘\n’). Variations in spacing have no effect on the parsing of the file except in string literals.

    @@ -106,15 +106,15 @@ except in string literals. - - + + - + - +
    -Prev   Next +Prev   Next
    SLEIGH SLEIGH  Home 3. Preprocessing 3. Preprocessing
    diff --git a/GhidraDocs/languages/html/sleigh_preprocessing.html b/GhidraDocs/languages/html/sleigh_preprocessing.html index 7f38480482..3006f08f6e 100644 --- a/GhidraDocs/languages/html/sleigh_preprocessing.html +++ b/GhidraDocs/languages/html/sleigh_preprocessing.html @@ -1,24 +1,24 @@ - -3. Preprocessing + +3. Preprocessing - + - - + +

    -3. Preprocessing

    +3. Preprocessing

    SLEIGH provides support for simple file inclusion, macros, and other basic preprocessing functions. These are all invoked with directives -that start with the ‘@’ character, which must be the first character +that start with the ‘@’ character, which must be the first character in the line.

    -3.1. Including Files

    +3.1. Including Files

    In general a single SLEIGH specification is contained in a single file, and the compiler is invoked on one file at a time. Multiple @@ -54,7 +54,7 @@ own @include directives.

    -3.2. Preprocessor Macros

    +3.2. Preprocessor Macros

    SLEIGH allows simple (unparameterized) macro definitions and expansions. A macro definition occurs on one line and starts with @@ -62,7 +62,7 @@ the @define directive. This is followed by an identifier for the macro and then a string to which the macro should expand. The string must either be a proper identifier itself or surrounded with double quotes. The macro can then be -expanded with typical “$(identifier)” syntax at any other point in the +expanded with typical “$(identifier)” syntax at any other point in the specification following the definition.

    @@ -72,9 +72,9 @@ define endian=$(ENDIAN);
     

    This example defines a macro identified as ENDIAN -with the string “big”, and then expands the macro in a later SLEIGH +with the string “big”, and then expands the macro in a later SLEIGH statement. Macro definitions can also be made from the command line -and in the “.spec” file, allowing multiple specification variations to +and in the “.spec” file, allowing multiple specification variations to be derived from one file. SLEIGH also has an @undef directive which removes the definition of a macro from that point on in the file. @@ -85,7 +85,7 @@ definition of a macro from that point on in the file.
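
    The "$(identifier)" expansion can be modeled in a few lines of Python. This is a sketch, not the preprocessor's implementation: the function name expand_macros and the dictionary of definitions are hypothetical, and raising an error for an undefined macro is an assumption of the example.

```python
import re

_MACRO_USE = re.compile(r"\$\(([A-Za-z._][A-Za-z0-9._]*)\)")


def expand_macros(line: str, macros: dict) -> str:
    """Replace every $(NAME) in `line` with its definition from `macros`."""
    def substitute(match):
        name = match.group(1)
        if name not in macros:
            # Assumption: an undefined macro is treated as an error.
            raise KeyError(f"undefined macro: {name}")
        return macros[name]
    return _MACRO_USE.sub(substitute, line)


expanded = expand_macros("define endian=$(ENDIAN);", {"ENDIAN": "big"})
```

    Removing a key from the dictionary plays the role of @undef: later lines that still use the macro would no longer expand.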

    -3.3. Conditional Compilation

    +3.3. Conditional Compilation

    SLEIGH supports several directives that allow conditional inclusion of parts of a specification, based on the existence of a macro, or its @@ -103,7 +103,7 @@ and @endif.

    -3.3.1. @ifdef and @ifndef

    +3.3.1. @ifdef and @ifndef

    The @ifdef directive is followed by a macro identifier and evaluates to true if the macro is defined. @@ -129,14 +129,14 @@ or @elif directive (See below).

    -3.3.2. @if

    +3.3.2. @if

    The @if directive is followed by a boolean expression with macros as the variables and strings as the constants. Comparisons between macros and strings are currently limited to string equality or inequality. But individual comparisons can be combined arbitrarily using parentheses and the boolean -operators ‘&&’, ‘||’, and ‘^^’. These represent a logical +operators ‘&&’, ‘||’, and ‘^^’. These represent a logical and, a logical or, and a logical exclusive-or operation respectively. It is possible to test whether a particular macro is defined within the @@ -158,7 +158,7 @@ is defined.

    -3.3.3. @else and @elif

    +3.3.3. @else and @elif

    An @else directive splits the lines bounded by an @if directive and @@ -180,12 +180,12 @@ one @else, which must occur after all the @elif directives.

    -@if PROCESSOR == “mips”
    -@ define ENDIAN “big”
    -@elif ((PROCESSOR==”x86”)&&(OS!=”win”))
    -@ define ENDIAN “little”
    +@if PROCESSOR == “mips”
    +@ define ENDIAN “big”
    +@elif ((PROCESSOR==”x86”)&&(OS!=”win”))
    +@ define ENDIAN “little”
     @else
    -@ define ENDIAN “unknown”
    +@ define ENDIAN “unknown”
     @endif
     

    @@ -198,15 +198,15 @@ the @elif directives. - - + + - + - +
    -Prev   Next +Prev   Next
    2. Basic Specification Layout 2. Basic Specification Layout  Home 4. Basic Definitions 4. Basic Definitions
    diff --git a/GhidraDocs/languages/html/sleigh_ref.html b/GhidraDocs/languages/html/sleigh_ref.html index d5e608dccd..b21fc5f37d 100644 --- a/GhidraDocs/languages/html/sleigh_ref.html +++ b/GhidraDocs/languages/html/sleigh_ref.html @@ -1,34 +1,34 @@ - -9. P-code Tables + +9. P-code Tables - + - +

    -9. P-code Tables

    +9. P-code Tables

    We list all the p-code operations by name along with the syntax for invoking them within the semantic section of a constructor definition -(see Section 7.7, “The Semantic Section”), and with a +(see Section 7.7, “The Semantic Section”), and with a description of the operator. The terms v0 and v1 represent identifiers of individual input varnodes to the operation. In terms of syntax, v0 @@ -46,7 +46,7 @@ to lowest.

    -

    Table 5. Semantic Expression Operators and Syntax

    +

    Table 5. Semantic Expression Operators and Syntax

    @@ -459,7 +459,7 @@ The following table lists the basic forms of a semantic statement.

    -

    Table 6. Basic Statements and Associated Operators

    +

    Table 6. Basic Statements and Associated Operators

    @@ -538,7 +538,7 @@ The following table lists the branching operations and the statements which invo

    -

    Table 7. Branching Statements

    +

    Table 7. Branching Statements

    @@ -592,14 +592,14 @@ The following table lists the branching operations and the statements which invo
    - - +Prev  + + - + - +
    -Prev
    8. Using Context 8. Using Context  Home
    diff --git a/GhidraDocs/languages/html/sleigh_symbols.html b/GhidraDocs/languages/html/sleigh_symbols.html index df994eb7bd..2022555fb4 100644 --- a/GhidraDocs/languages/html/sleigh_symbols.html +++ b/GhidraDocs/languages/html/sleigh_symbols.html @@ -1,24 +1,24 @@ - -5. Introduction to Symbols + +5. Introduction to Symbols - + - - + +

    -5. Introduction to Symbols

    +5. Introduction to Symbols

    After the definition section, we are prepared to start writing the body of the specification. This part of the specification shows how @@ -61,7 +61,7 @@ Formally a Specific Symbol is defined as

    The named registers that we defined earlier are the simplest examples of specific symbols (see -Section 4.4, “Naming Registers”). The symbol identifier +Section 4.4, “Naming Registers”). The symbol identifier itself is the string that will get printed in disassembly and the varnode associated with the symbol is the one constructed by the define statement. @@ -79,7 +79,7 @@ instructions to specific symbols.

    The set of instruction encodings that map to a single specific symbol is called an instruction pattern and is described -more fully in Section 7.4, “The Bit Pattern Section”. In most cases, this +more fully in Section 7.4, “The Bit Pattern Section”. In most cases, this can be thought of as a mask on the bits of the instruction and a value that the remaining unmasked bits must match. At any rate, the family symbol identifier, when taken out of context, represents the entire @@ -98,14 +98,14 @@ that simulate the instruction.

    The symbol responsible for combining smaller family symbols is called a table, which is fully described in -Section 7.8, “Tables”. Any table symbol +Section 7.8, “Tables”. Any table symbol can be used in the definition of other table symbols until the root symbol is fully described. The root symbol has the predefined identifier instruction.

    -5.1. Notes on Namespaces

    +5.1. Notes on Namespaces

    Almost all identifiers live in the same global "scope". The global scope includes

    @@ -126,15 +126,15 @@ Almost all identifiers live in the same global "scope". The global scope include Names of registers
  • - Names of macros (see Section 7.9, “P-code Macros”) + Names of macros (see Section 7.9, “P-code Macros”)
  • - Names of tables (see Section 7.8, “Tables”) + Names of tables (see Section 7.8, “Tables”)
  • All of the names in this scope must be unique. Each -individual constructor (defined in Section 7, “Constructors”) +individual constructor (defined in Section 7, “Constructors”) defines a local scope for operand names. As with most languages, a local symbol with the same name as a global symbol hides the global symbol while that scope @@ -143,13 +143,13 @@ is in effect.

    -5.2. Predefined Symbols

    +5.2. Predefined Symbols

    We list all of the symbols that are predefined by SLEIGH.

    -

    Table 2. Predefined Symbols

    +

    Table 2. Predefined Symbols

    @@ -213,15 +213,15 @@ is the root instruction table.
    - - + + - + - +
    -Prev   Next +Prev   Next
    4. Basic Definitions 4. Basic Definitions  Home 6. Tokens and Fields 6. Tokens and Fields
    diff --git a/GhidraDocs/languages/html/sleigh_tokens.html b/GhidraDocs/languages/html/sleigh_tokens.html index 2239604d1b..90f10cfc67 100644 --- a/GhidraDocs/languages/html/sleigh_tokens.html +++ b/GhidraDocs/languages/html/sleigh_tokens.html @@ -1,24 +1,24 @@ - -6. Tokens and Fields + +6. Tokens and Fields - + - - + +

    -6. Tokens and Fields

    +6. Tokens and Fields

    -6.1. Defining Tokens and Fields

    +6.1. Defining Tokens and Fields

    A token is one of the byte-sized pieces that make up the machine code instructions being modeled. @@ -57,7 +57,7 @@ field and the range of bits within the token making up the field. The size of a field does not need to be a multiple of 8. The range is inclusive where the least significant bit in the token is labeled 0. When defining tokens that are bigger than 1 byte, the -global endianess setting (See Section 4.1, “Endianess Definition”) +global endianess setting (See Section 4.1, “Endianess Definition”) will affect this labeling. Although it is rarely required, it is possible to override the global endianess setting for a specific token by appending either the qualifier endian=little or endian=big @@ -88,11 +88,11 @@ different names.

    -6.2. Fields as Family Symbols

    +6.2. Fields as Family Symbols

    Fields are the most basic form of family symbol; they define a natural map from instruction bits to a specific symbol as follows. We take the -set of bits within the instruction as given by the field’s defining +set of bits within the instruction as given by the field’s defining range and treat them as an integer encoding. The resulting integer is both the display portion and the semantic meaning of the specific symbol. The display string is obtained by converting the integer into @@ -113,7 +113,7 @@ the dec attribute is not supported]

    -6.3. Attaching Alternate Meanings to Fields

    +6.3. Attaching Alternate Meanings to Fields

    The default interpretation of a field is probably the most natural but of course processors interpret fields within an instruction in a wide @@ -124,7 +124,7 @@ interpretations must be built up out of tables.

    -6.3.1. Attaching Registers

    +6.3.1. Attaching Registers

    Probably the most common processor interpretation of a field is as an encoding of a particular register. In SLEIGH this @@ -140,7 +140,7 @@ space separated list of field identifiers surrounded by square brackets. A registerlist must be a square bracket surrounded and space separated list of register identifiers as created with define statements (see -Section 4.4, “Naming Registers”). For each field in +Section 4.4, “Naming Registers”). For each field in the fieldlist, instead of having the display and semantic meaning of an integer, the field becomes a look-up table for the given list of registers. The original integer interpretation is @@ -152,7 +152,7 @@ display and semantic meaning of the field are now taken from the new register.

    -A particular integer can remain unspecified by putting a ‘_’ character +A particular integer can remain unspecified by putting a ‘_’ character in the appropriate position of the register list or also if the length of the register list is less than the integer. A specific integer encoding of the field that is unspecified like this @@ -163,7 +163,7 @@ of the instruction.
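
    The look-up behavior of an attached register list, including unspecified positions, can be pictured with a small Python sketch. The function name attached_register and the use of None to signal an unspecified encoding are assumptions made for the illustration; how SLEIGH itself reacts to such an encoding is described in the surrounding text, not here.

```python
from typing import Optional, Sequence


def attached_register(registers: Sequence[str], encoding: int) -> Optional[str]:
    """Map a field's integer encoding to a register name, or None if unspecified."""
    if encoding < 0:
        raise ValueError("field encodings are non-negative")
    if encoding >= len(registers):
        return None                      # list is shorter than the encoding
    name = registers[encoding]
    return None if name == "_" else name  # '_' marks an unspecified position


# A made-up register list with a gap at position 2.
regs = ["r0", "r1", "_", "r3"]
```

    With this list, encodings 0, 1, and 3 select r0, r1, and r3, while encoding 2 and anything past the end of the list are unspecified.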

    -6.3.2. Attaching Other Integers

    +6.3.2. Attaching Other Integers

    Sometimes a processor interprets a field as an integer but not the integer given by the default interpretation. A different integer @@ -180,12 +180,12 @@ register interpretation is assigned to fields with an attach variables statement, the integers in the list are assigned to each field specified in the fieldlist. [Currently SLEIGH does not support -unspecified positions in the list using a ‘_’] +unspecified positions in the list using a ‘_’]

    -6.3.3. Attaching Names

    +6.3.3. Attaching Names

    It is possible to just modify the display characteristics of a field without changing the semantic meaning. The need for this is rare, but @@ -196,7 +196,7 @@ appropriate to define overlapping fields, one of which is defined to have no semantic meaning. The most convenient way to break down the required disassembly may not be the most convenient way to break down the semantics. It is also possible to have symbols with semantic -meaning but no display meaning (see Section 7.4.5, “Invisible Operands”). +meaning but no display meaning (see Section 7.4.5, “Invisible Operands”).

    At any rate we can list the display interpretation of a field directly @@ -218,7 +218,7 @@ encodings.

    -6.4. Context Variables

    +6.4. Context Variables

    SLEIGH supports the concept of context variables. For the most part processor instructions can be @@ -254,12 +254,12 @@ By default, globally setting a context variable affects instruction decoding from the point of the change, forward, following the flow of the instructions, but if the variable is labeled as noflow, any change is limited to a -single instruction. (See Section 8.3.1, “Context Flow”) +single instruction. (See Section 8.3.1, “Context Flow”)

    Once the context variable is defined, in terms of the specification syntax, it can be treated as if it were just another field. See -Section 8, “Using Context”, for a complete discussion of how to +Section 8, “Using Context”, for a complete discussion of how to use context variables.

    @@ -269,15 +269,15 @@ use context variables. - - + + - + - +
    -Prev   Next +Prev   Next
    5. Introduction to Symbols 5. Introduction to Symbols  Home 7. Constructors 7. Constructors