Contents: Introduction, Define Section, Manager Section, Machine Section, Function Section, Operation Section, Action Ordering, Data Types, Basic Operators, Modifiers, OSM Actions, Annotation Syntax
Introduction

The Mescal Architecture Description Language (MADL) specifies the operation state machine (OSM) model for microprocessor modeling purposes. It supplies processor information to software tools including the instruction set simulator, the microarchitecture simulator, the assembler, the disassembler and various compiler optimizers.

MADL is composed of two parts: the core language and the annotation language. The core language describes the operation state machines of the OSM model, which have concrete executable semantics. The annotation language describes tool-dependent information. For any tool that utilizes MADL, an annotation description scheme can be created based on the generic annotation syntax. The annotation description supplements the core description with implementation-dependent or tool-dependent information, e.g. hints for the tool to analyze the core description. This document mainly describes the syntax of the core language. The generic syntax of the annotation language is described in Annotation Syntax.

Note that the hardware layer of the OSM model is currently not part of MADL. The execution model of the hardware units, including the token managers, is implemented in the general-purpose programming language C++, which is the target language into which MADL descriptions are translated for execution. MADL only declares the names and the types of the token managers. Description of the hardware layer is expected to be included in future versions of MADL.

MADL uses a hierarchical description structure called the and-or graph to minimize redundancy in descriptions. To integrate the OSM model with the and-or graph, MADL uses a dynamic version of the OSM model. In the dynamic model, the actions and computations are dynamically bound to the edges of the state diagram. A well-defined dynamic model can be transformed back to a static model.
Besides token managers, the entities in the dynamic OSM model include the skeleton and the syntax operation. A skeleton refers to the state diagram and the internal state variables associated with it. A syntax operation refers to a set of actions and computations, as well as assembly syntax and binary encoding. The syntax operations form an and-or graph. A skeleton and all syntax operations in an expansion of the and-or graph constitute the model of one operation in the instruction set. These syntax operations are dynamically bound to the skeleton during execution.

An MADL file may contain any number of the following sections: define, manager, machine, function and operation sections.

In addition, an MADL file may contain commands, such as the using command.
Apart from the using command, the sections of an MADL description are order-independent. For instance, a function section may appear anywhere in an MADL description, either before or after its caller(s).

Comments can be placed anywhere in an MADL description. Two types of comments are allowed: single-line comments and block comments. A single-line comment starts with a '#' and lasts until the end of the line. A block comment starts with a "##" and lasts until the next "##".
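The two comment forms can be illustrated with a small pre-processing sketch in Python. It is not part of MADL or its tools; the function name `strip_comments` is hypothetical, and the sketch assumes that '#' never occurs inside a string literal, which a real lexer would have to handle.

```python
def strip_comments(text: str) -> str:
    """Remove '##' ... '##' block comments and '#' single-line comments.

    Assumption (not from the MADL specification): '#' never appears inside
    a string literal. A real lexer would need to track quotes.
    """
    out = []
    i = 0
    n = len(text)
    while i < n:
        if text.startswith("##", i):
            # Block comment: skip up to and including the closing "##".
            end = text.find("##", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            i = end + 2
        elif text[i] == "#":
            # Single-line comment: skip to the end of the line,
            # keeping the newline character itself.
            end = text.find("\n", i)
            i = n if end == -1 else end
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Checking for "##" before '#' matters, since otherwise the opening of a block comment would be read as a single-line comment.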
Define Section

The define section declares a list of global constant variables and function prototypes. These variables and functions are in the global scope and can be accessed throughout an MADL description. The general syntax of a variable or function declaration is:

    define_section ::= "DEFINE" def_clause+
    def_clause     ::= identifier ':' data_type '=' data_value ';'
    def_clause     ::= identifier ':' data_type ';'
    def_clause     ::= identifier ':' func_type ';'

An example define section is as follows.

    DEFINE
      reg_names : string[16] = {"r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
                                "r8", "r9", "sl", "fp", "ip", "sp", "lr", "pc"};
      pred_table : uint<16>[4] = {0xf0f0, 0x0f0f, 0xcccc, 0x3333};
      epsilon : double;
      func1 : (string, uint<32>);
      func2 : (uint<32>*, uint<32>);

The above define section defines an array of string literals named "reg_names", an array of 16-bit unsigned integer constants named "pred_table", and a double-precision constant "epsilon" whose value is not given. Additionally, it declares two functions, "func1" and "func2".

Function arguments in MADL are passed by reference. Writable arguments are denoted by a "*" after the argument type, e.g. the first argument of "func2". The value of a writable argument may be changed by the function.

Variables without values and functions declared here must be defined in external C++ files for simulation purposes. These C++ files should be linked with the MADL-generated C++ files in simulators.

For the syntax of data and function types, refer to the Data Types section of the document. The restriction is that no void or tuple data types can be used in define sections.
Manager Section

In the OSM model, data or structural resources are modeled as tokens and are managed by token managers. A state machine transacts tokens with the token managers during its execution. To get a token, it typically presents an index to the token manager as a token identifier. The manager then returns the token if it is available. The state machine may also read values from and write values to the tokens that it can access. For a list of possible token transactions (also called actions), refer to the OSM Actions section of the document.

A manager section may contain a CLASS subsection and an INSTANCE subsection. The former declares token manager class names and their types, while the latter declares token manager instances. A type here is a pair of the token index type and the token value type. All data types except arrays can be used as the index or value type. The syntax of the section is shown below, followed by an example.

    manager_section     ::= "MANAGER" class_subsection instance_subsection
    class_subsection    ::= "CLASS" class_clause+
    class_clause        ::= identifier ':' data_type "->" data_type ';'
    instance_subsection ::= "INSTANCE" instance_clause+
    instance_clause     ::= identifier ':' identifier ';'

    MANAGER
      CLASS    fetch_manager : void -> (uint<32>,uint<32>);
               simple_manager: void -> void;
      INSTANCE mIF : fetch_manager;
               mEX : simple_manager;

This example declares a token manager class named "fetch_manager" with a void index type (no token identifier is needed since this manager has only one token) and a tuple value type. The example also declares a "simple_manager" class with a void index type and a void value type (it is simply a structural resource and has no value). Two token manager instances, one of each class, are then declared in the INSTANCE subsection.

An MADL description may contain one or more manager sections. All manager classes and instances declared in these sections are visible in the global scope.
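The token transaction idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the actual token managers are written in C++ and are outside MADL, and the class name `TokenManager`, its method names and the requester strings are all hypothetical. A void index is modeled as `None`.

```python
class TokenManager:
    """Hypothetical token manager: one token per index, each holding a value."""

    def __init__(self, initial_values):
        self._values = dict(initial_values)  # index -> token value
        self._owner = {}                     # index -> current holder

    def can_allocate(self, index):
        """A token is available if it exists and nobody holds it."""
        return index in self._values and index not in self._owner

    def allocate(self, index, requester):
        """Grant the token to the requester; return True on success."""
        if not self.can_allocate(index):
            return False
        self._owner[index] = requester
        return True

    def release(self, index, requester):
        """Give the token back; only the current holder may do so."""
        if self._owner.get(index) != requester:
            raise ValueError("requester does not hold this token")
        del self._owner[index]

    def read(self, index):
        return self._values[index]

    def write(self, index, value, requester):
        """In this sketch, writing requires holding the token."""
        if self._owner.get(index) != requester:
            raise ValueError("requester does not hold this token")
        self._values[index] = value


# A manager in the spirit of "simple_manager : void -> void":
# a single token with no index and no value.
mEX = TokenManager({None: None})
```

With this sketch, a second state machine asking for the token of `mEX` is refused until the first holder releases it, which is how a structural resource such as a pipeline stage can be modeled.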
Machine Section

A machine section describes a skeleton, which contains the state diagram and the variables visible to all syntax operations associated with it. A special type of variable is the token buffer, which stores allocated tokens so that they can be referred to conveniently. A machine section may contain the following subsections: INITIAL, STATE, EDGE, BUFFER and VAR.
There must be one and only one INITIAL state defined in each machine section. There can be any number of regular states as long as there is no naming conflict. The syntax of the machine section is shown below.

    machine_section    ::= "MACHINE" identifier initial_subsection
                           (state_subsection | edge_subsection)+
                           buffer_subsection var_subsection
    initial_subsection ::= "INITIAL" identifier ';'
    state_subsection   ::= "STATE" identifier_list ';'
    identifier_list    ::= (identifier ',')* identifier
    edge_subsection    ::= "EDGE" edge_clause+
    edge_clause        ::= identifier ':' identifier "->" identifier ';'
    buffer_subsection  ::= "BUFFER" buffer_clause+
    buffer_clause      ::= identifier ':' identifier ';'
    var_subsection     ::= "VAR" var_clause+
    var_clause         ::= identifier ':' basic_type ';'

The STATE subsection contains a list of state names separated by commas. The EDGE subsection contains a list of edge clauses. Each clause contains the edge name, followed by a ':', the source state name, "->" and the destination state name. The BUFFER subsection contains a list of token buffer clauses, each of which contains a buffer name, followed by ':' and the name of a token manager class. A buffer can only be used to temporarily store tokens obtained from managers of that class. The VAR subsection contains a list of variable declarations, each of which contains a variable name followed by ':' and a type. See the Data Types section for details about variable types.

An example machine section named "normal" is shown below.

    MACHINE normal
      INITIAL S_INIT;
      STATE   S_IF, S_EX;
      EDGE    e_in_if : S_INIT -> S_IF;
              e_if_ex : S_IF -> S_EX;
              e_ex_in : S_EX -> S_INIT;
      BUFFER  if_buffer : fetch_manager;
              ex_buffer : simple_manager;
      VAR     iw : uint<32>;
              pc : uint<32>;

The states and edges form the state diagram of the skeleton. The state diagram must be a strongly connected directed graph.
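The strong-connectivity requirement can be checked mechanically. The Python sketch below is an illustration, not part of any MADL tool; the function name `is_strongly_connected` is hypothetical. It uses the standard observation that a directed graph is strongly connected when every state is reachable from one chosen state and that state is reachable from every other state.

```python
def is_strongly_connected(states, edges):
    """Return True if the directed graph (states, edges) is strongly connected.

    edges is an iterable of (source, destination) state-name pairs.
    """
    def reachable(start, adjacency):
        # Iterative depth-first search from `start`.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in adjacency.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    states = set(states)
    if not states:
        return True
    forward, backward = {}, {}
    for src, dst in edges:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    root = next(iter(states))
    # Every state must be reachable from root, and root from every state.
    return (reachable(root, forward) == states
            and reachable(root, backward) == states)


# State diagram of the example skeleton "normal".
normal_states = ["S_INIT", "S_IF", "S_EX"]
normal_edges = [("S_INIT", "S_IF"), ("S_IF", "S_EX"), ("S_EX", "S_INIT")]
```

The example skeleton passes this check. Dropping the edge from S_EX back to S_INIT would make it fail, because S_INIT could then no longer be reached from the other states.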
Function Section

A function section defines an internal MADL function. This is different from the external functions declared in the DEFINE section: the body of an internal function is part of the MADL description, while the bodies of external functions are in external C++ source files.

A function section contains a function name, a list of arguments, an optional variable (VAR) subsection and an evaluation (EVAL) subsection. The variable subsection defines the local variables. Its syntax is the same as that of the variable subsection of the MACHINE section. The evaluation subsection contains a sequential list of statements. See the Basic Operators section for information about statements. The statements may access the arguments, the local variables and the global constant variables from define sections.

Unlike C functions, MADL functions do not have a return value. The result of a function's computation can be returned through writable arguments. See the Data Types section for more information about writable arguments. The syntax of the function section is shown below.

    function_section ::= "FUNCTION" identifier '(' arg_list ')' var_subsection? eval_subsection
    arg_list         ::= (arg ',')* arg
    arg              ::= identifier ':' basic_type '*'?
    eval_subsection  ::= "EVAL" eval_clause+
    eval_clause      ::= statement ';'

An example function section is given below. The "result" argument is writable and is used to return the computed value.

    FUNCTION eval_pred(result:uint<1>*, cond:uint<2>, flags:uint<4>)
      VAR  temp : uint<4>;
      EVAL temp = pred_table[cond] >> flags;
           result = (uint<1>)temp;

Like external functions, internal functions are visible in the global name scope. A function can be called throughout an MADL description, regardless of the location of the caller. Recursion is allowed.
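For readers more familiar with general-purpose languages, the "eval_pred" example can be approximated in Python as below. This is a sketch under one assumption that the text above does not state: narrowing to uint<4> or uint<1> keeps the low-order bits. The helper `truncate` is hypothetical, and the writable "result" argument is modeled as the Python return value.

```python
# Same constants as "pred_table" in the example define section.
PRED_TABLE = [0xF0F0, 0x0F0F, 0xCCCC, 0x3333]


def truncate(value, width):
    """Keep the low `width` bits (assumed narrowing semantics)."""
    return value & ((1 << width) - 1)


def eval_pred(cond, flags):
    """Approximation of the MADL function eval_pred.

    cond is a 2-bit index (0..3) and flags a 4-bit shift amount (0..15).
    """
    temp = truncate(PRED_TABLE[cond] >> flags, 4)  # temp : uint<4>
    return truncate(temp, 1)                       # result = (uint<1>)temp
```

Under this reading, each 16-bit table entry acts as a truth table: "flags" selects one bit of the entry chosen by "cond".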
Operation Section

An operation section defines a syntax operation. It must be defined based on a skeleton. The skeleton for a syntax operation is specified by the "USING" command. The subsections in an operation section may access the local variables declared in the skeleton and the global constant variables in the define sections. An operation section contains a name and the following subsections.
An example operation named "mvn" is shown below.

    OPERATION mvn
      VAR    v_rs : uint<32>;
             v_rn : uint<32>;
      SYNTAX "mvn" reg_names[rd] "," reg_names[rs];
      CODING 10111 rd rs ----;
      TRANS  e_id_ex: {v_rs = *mRF[rs], ex_buf = mEX[], !id_buf, *mRF[rs] = v_rd};
                      v_rd = -v_rs;
             e_ex_bf: {bf_buf = mBF[], !ex_buf};
Action Ordering

An OSM is formed by one skeleton and one or more syntax operations. The skeleton mainly specifies the state diagram, while the syntax operations specify the actions and computations occurring on the edges. More than one syntax operation may annotate its actions and statements onto the same edge of the skeleton.

By OSM rules, when an edge is evaluated, the OSM first tests whether all actions on the edge can be fired. If and only if all actions are firable, the OSM fires the actions and evaluates the computation statements. They are then fired in a fixed order: category 1 OSM actions first, followed by category 3 actions, then the statements, then category 4 actions, and finally category 2 actions. The general rule for action ordering is allocation/inquiry first, read second, write third and release/discard last. This order enables data flow between token managers to occur within a single control step. According to these rules, the actions associated with the edge "e_id_ex" in the above example follow the order:
All these actions occur within one control step. One value is read from token manager mRF, then negated and written back to token manager mRF.

Note that there should be no explicit control dependency among the actions on one edge. The reason is that the firing of the actions depends on the outcome of the condition tests: only when all conditions test true can the actions be fired. If the firing condition of one action depended on the firing result of another action, there would be a cyclic dependency between the test and the firing. The code below shows examples of such control dependencies. The first three edges are illegal since, in each case, the second action depends on the first one.

    edge1: {ind = *m1[], *m2[ind]};    #illegal
    edge2: {buf1 = m3[], !buf1};       #illegal
    edge3: {v1 = *m4[], v1>10};        #illegal
    edge4: {v2 = *m5[], *m6[] = v2};   #legal, data dependency is fine
    edge5: {buf2 = m6[], !!buf1};      #legal, since discard is unconditional

Also note that an edge may contain actions annotated by multiple syntax operations at a time. The ordering rule and the control dependency rule apply to all actions across operation boundaries. The category-based ordering rule guarantees that data flow is preserved regardless of which syntax operation an action comes from.

The statements from different syntax operations are fired according to the binding order of the statements. Recall that binding occurs at decoding time. So for the example operation below, if its decoding statement on edge "e_if_id" resolves to an "mvn" operation as shown in the previous example, "mvn" will annotate its actions on the skeleton. This annotation occurs later than that of its parent "dpi". So when edge "e_id_ex" is evaluated, the statement "foo=10" will precede "v_rd = -v_rs".

    OPERATION dpi
      VAR  oper: {mov, mvn};
           iw : uint<32>;
           foo : uint<32>;
      EVAL e_if_id: {iw = *mIF[]} +oper = iw;
           e_id_ex: foo = 10;
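The test-then-fire rule and the category order described in this section can be sketched in Python. This is an illustration only, not how an MADL simulator is implemented: the names `FIRING_ORDER` and `evaluate_edge` and the item representation are hypothetical. Each item on an edge is a tuple of a kind ("cat1" to "cat4" for OSM actions, "stmt" for statements), a test callable and a fire callable. Statements are assumed to be always firable.

```python
# Position of each kind in the firing sequence:
# category 1, category 3, statements, category 4, category 2.
FIRING_ORDER = {"cat1": 0, "cat3": 1, "stmt": 2, "cat4": 3, "cat2": 4}


def evaluate_edge(items):
    """Evaluate one edge; return True if the edge fired.

    items: list of (kind, test, fire) tuples in binding order.
    Phase 1 tests every item without side effects. Phase 2 runs only
    if all tests pass, firing items in category order.
    """
    if not all(test() for _kind, test, _fire in items):
        return False
    # sorted() is stable, so items of the same kind (for example
    # statements bound by different syntax operations) keep their
    # binding order.
    for _kind, _test, fire in sorted(items, key=lambda it: FIRING_ORDER[it[0]]):
        fire()
    return True


def always():
    """Test used for items that are unconditionally firable."""
    return True
```

Separating the test phase from the fire phase is what makes control dependencies between actions on one edge impossible to honor: every test must be answerable before any action has fired.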
Data Types

MADL supports the following basic types: signed integers of a given width (int<n>), unsigned integers of a given width (uint<n>), float, double and string.

MADL supports the following complex types: arrays of a basic type and tuples of basic types.

The syntax of data types is shown below.

    data_type    ::= "void" | basic_type | complex_type
    basic_type   ::= "int" '<' integer '>' | "uint" '<' integer '>' | "float" | "double" | "string"
    complex_type ::= basic_type '[' integer ']' | '(' (basic_type ',')* basic_type ')'
    func_type    ::= '(' (basic_type '*'? ',')* basic_type '*'? ')'

Implicit conversion between types is supported by MADL. The following implicit conversions are valid:
Basic Operators

The basic operators are grouped according to their precedence levels listed in the table below. Highest precedence operators appear first.
Operator precedence is similar to that of ANSI C operators. Parentheses can be used to group expressions and take the highest precedence. An MADL statement is either an assignment or a function call. Arithmetic and comparison operators can be applied to numerical types, including integer and floating-point types. Logical and bit operators can be applied to integer types only. Addition (meaning concatenation) and comparison of string-typed operands are supported. For details about the modifier operators, see the Modifiers section below.
Modifiers
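Among the modifiers covered in this section, the literal conversions ".flt" and ".bit" are the easiest to confuse with ordinary arithmetic conversion. The Python sketch below illustrates the difference for 32-bit and 64-bit values using the standard `struct` module. The function names are hypothetical and only mirror the modifier names; this is not MADL syntax.

```python
import struct


def flt32(bits):
    """Reinterpret a 32-bit unsigned integer as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bit32(value):
    """Reinterpret a single-precision float as a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def flt64(bits):
    """Reinterpret a 64-bit unsigned integer as a double-precision float."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def bit64(value):
    """Reinterpret a double-precision float as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# Literal conversion keeps the bit pattern: 0x3F800000 is the IEEE 754
# single-precision encoding of 1.0.
literal = flt32(0x3F800000)

# Arithmetic conversion keeps the numeric value instead.
arithmetic = float(0x3F800000)
```

Here `literal` is 1.0 while `arithmetic` is 1065353216.0, and `bit32(1.0)` gives back 0x3F800000.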
Modifiers can be used to refer to the syntax and encoding of any or-node variable. For assembly syntax, use "var_name.syn". For encoding, use "var_name.cod". The result has the same width as the variable.

Modifiers can also be used to convert numerical variables or expressions to string type. An integer variable or expression can be appended with ".hex", ".dec", ".oct" or ".bin" (hexadecimal, decimal, octal, binary) so that it is converted to a formatted string. Similarly, a floating-point variable or expression can be appended with ".sci" or ".fix" (scientific, fixed) for the same purpose.

Finally, modifiers can be used to convert literally between integer and floating-point values. ".flt" converts a 32-bit or 64-bit integer to a float or double typed value, respectively. ".bit" does the reverse. Note that this is a literal conversion, which differs from a normal arithmetic conversion: all bit values remain the same after the conversion.

OSM Actions
Allocate' in the above table means temporary allocate. It is equivalent to an allocate followed by a discard in one cycle. It is syntactic sugar for the convenience of model specification.

The comparison operators are the same as the C comparison operators. Note that, except for assignment, basic operators are not supported in the OSM action specification. Computation can always be moved into the statements.

Implicit type conversion is allowed in OSM actions. This includes type conversion for both indexes and values.

It is valid to combine read and write in ways such as "*manager1[index1] = *manager2[index2];".
Annotation Syntax

Annotations appear as paragraphs in an MADL description. Below is the syntax of an annotation paragraph in Backus-Naur Form.

    annot_paragraph ::= claus*
                      | ':' identifier ':' claus_list     //with namespace
    claus ::= decl | stmt
    decl  ::= "var" identifier ':' type ';'               //variable
            | "define" identifier value ';'               //macro
    stmt  ::= identifier '(' arg_list ')' ';'             //command
            | val op val                                  //relationship
    arg   ::= identifier = value
    val   ::= identifier | number | string
            | '(' (val ',')+ val ')'                      // tuple
            | '{' (val ',')* val '}'                      // set
    typ   ::= "int" '<' integer '>' | "uint" '<' integer '>' | "string"
            | '(' (typ ',')+ typ ')'                      // tuple type
            | '{' (typ ',')* typ '}'                      // set type

An annotation paragraph contains an optional namespace label and a list of declarations and statements. The label specifies the tool scope of the paragraph and can be used to filter out irrelevant annotations. Paragraphs without a label belong to the global namespace.

In an MADL description, an annotation paragraph can be written either in a single-line format or in a block format. The former is preceded by a "$" and runs to the end of the line, while the latter is enclosed within a pair of "$$"s.

An annotation paragraph can be attached to any command, newly defined skeleton name, state, edge, variable, buffer, manager class, manager instance, syntax operation name, function name, statement, action, SYNTAX subsection, CODING subsection, and edge name reference in a TRANS subsection.
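A tool that consumes annotations first has to locate the paragraphs and their namespace labels. The Python sketch below shows one way to do that with regular expressions. It is an assumption-laden illustration: the function name `extract_annotations` is hypothetical, '$' is assumed never to occur inside string literals, and the clauses themselves are returned as raw text rather than parsed.

```python
import re

# A block paragraph "$$ ... $$" is tried first, then a single-line
# paragraph "$ ..." running to the end of the line.
_PARAGRAPH = re.compile(r"\$\$(.*?)\$\$|\$([^\n]*)", re.DOTALL)

# Optional namespace label of the form ":identifier:" at the start.
_LABEL = re.compile(r"\s*:\s*([A-Za-z_]\w*)\s*:(.*)", re.DOTALL)


def extract_annotations(text):
    """Return a list of (namespace, body) pairs found in `text`.

    namespace is None for paragraphs in the global namespace.
    """
    found = []
    for match in _PARAGRAPH.finditer(text):
        body = match.group(1) if match.group(1) is not None else match.group(2)
        label = _LABEL.match(body)
        if label:
            found.append((label.group(1), label.group(2).strip()))
        else:
            found.append((None, body.strip()))
    return found


def for_tool(annotations, namespace):
    """Keep global paragraphs and those labeled for the given tool."""
    return [body for ns, body in annotations if ns is None or ns == namespace]
```

For example, a hypothetical simulator-only annotation written as `$ :sim: latency(cycles=2);` would be returned with the namespace "sim". The label "sim" and the command "latency" are made up for illustration and are not defined by MADL.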