An IRD declares the types of the nodes of a graph. Nodes contain fields: some fields represent links between nodes, and the others store attributes of arbitrary types. To make the internal representation easier to describe, IRD explicitly supports single inheritance of node types and can also model multiple inheritance. An internal representation description can be split into several levels kept in separate files. The nodes of one level refer to the nodes of previous levels, so each successive level enriches the internal representation of the source program.
A special language is used to describe the internal representation. An internal representation description has the following layout, which is similar to that of a YACC file.
DECLARATIONS
%%
TYPES OF NODES
%%
ADDITIONAL C/C++ CODE
The `%%' separates the sections of the description. All sections
are optional. The first `%%' starts the section describing the types
of internal representation nodes and is obligatory even if that
section is empty; the second `%%' may be omitted if the section of
additional C/C++ code is absent too.
The section of declarations may contain names of predefined types of fields of internal representation nodes and names of types of double linked nodes. If the given file extends an internal representation description, the section also contains the name of the original description. Finally, the section may contain sections of C/C++ code.
The next section contains the descriptions of the types of internal representation nodes.
The additional C/C++ code section can contain any C/C++ code you want to use. Functions which are not generated by the translator but are needed to work with the internal representation often go here. This code is placed unchanged at the end of the file generated by the translator.
The section of declarations may contain the following construction.
%type IDENTIFIER ...
All predefined types must be defined in constructions of this kind.
The same name can be defined repeatedly. All references to a node
whose type is a sub-type of a type named in the following
construction will be double linked.
%double IDENTIFIER ...
It means that SPI will generate functions (macros) which make it
possible to examine all fields (possibly in other nodes) which refer
to a node of the given type (or its sub-type), i.e. fields which are
described as being of the given node type (or its sub-type) will be
double linked (see below). SPI maintains such double links
automatically. The simplest way to describe a double linked graph is
to insert the construction
%double %root
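The idea of a double linked reference can be pictured with a small C sketch. It is not the code that SPI generates; the names `ref_field', `set_link' and `count_referrers' are hypothetical and only illustrate how every node can keep a list of the fields that currently refer to it, so that those fields can be examined later.

```c
#include <stddef.h>

struct node;

/* Hypothetical reference field that also records who refers to the target. */
struct ref_field {
    struct node *target;        /* node this field refers to, or NULL */
    struct node *owner;         /* node that contains this field */
    struct ref_field *next_in;  /* next field referring to the same target */
    struct ref_field *prev_in;  /* previous field referring to the same target */
};

/* Hypothetical node with one double linked reference field. */
struct node {
    int id;
    struct ref_field link;      /* outgoing reference */
    struct ref_field *incoming; /* head of the list of fields referring here */
};

void node_init(struct node *n, int id) {
    n->id = id;
    n->link.target = NULL;
    n->link.owner = n;
    n->link.next_in = NULL;
    n->link.prev_in = NULL;
    n->incoming = NULL;
}

/* Change a reference and keep the back-link lists consistent. */
void set_link(struct ref_field *f, struct node *new_target) {
    if (f->target != NULL) {
        /* Unlink the field from the list of its old target. */
        if (f->prev_in != NULL)
            f->prev_in->next_in = f->next_in;
        else
            f->target->incoming = f->next_in;
        if (f->next_in != NULL)
            f->next_in->prev_in = f->prev_in;
    }
    f->target = new_target;
    f->prev_in = NULL;
    f->next_in = NULL;
    if (new_target != NULL) {
        /* Push the field onto the front of the new target's list. */
        f->next_in = new_target->incoming;
        if (new_target->incoming != NULL)
            new_target->incoming->prev_in = f;
        new_target->incoming = f;
    }
}

/* Examine all fields referring to n; here they are simply counted. */
int count_referrers(const struct node *n) {
    int count = 0;
    const struct ref_field *f;
    for (f = n->incoming; f != NULL; f = f->next_in)
        count++;
    return count;
}
```

In this sketch a reference may only be changed through `set_link', since a plain assignment to `target' would leave the back-link lists stale.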
Another construction in the section of declarations, of the kind
%extend IDENTIFIER
defines the name (without suffix) of the file containing the internal
representation description which is extended by the given file. A
file contains an original internal representation description if it
has no such construction. The original description file has level 0,
and the extensions have levels 1, 2, and so on. This feature makes it
possible to develop an internal representation step by step and to
save and restore any of its levels (see SPI) for additional tools,
e.g. browsers. For example, there may be three levels of source
program internal representation: the zero level may be the internal
representation for semantic analysis, the first level may be a low
level machine-dependent internal representation, and the second level
may be used to generate object code. Only the first such construction
in a file is significant; all subsequent ones are ignored.
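The level numbering can be illustrated with a minimal C sketch. The structure `description' and the function `description_level' are hypothetical and are not part of SPI; the sketch only assumes that each description either names the description it extends or names none.

```c
#include <stddef.h>

/* Hypothetical record for one description file. */
struct description {
    const char *name;                  /* file name without suffix */
    const struct description *extends; /* description named in %extend, or NULL */
};

/* The level is the number of %extend steps back to the original description. */
int description_level(const struct description *d) {
    int level = 0;
    while (d->extends != NULL) {
        level++;
        d = d->extends;
    }
    return level;
}
```

With three descriptions chained as in the example above, the function yields 0 for the original description, 1 for its extension, and 2 for the extension of that extension.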
The declaration section may also contain the following constructions
%local {
C/C++ DECLARATIONS
}
%import {
C/C++ DECLARATION
}
and
%export {
C/C++ DECLARATION
}
which contain any C/C++ declarations (types, variables, macros, and so
on) used in the description sections.
The local C/C++ declarations are inserted at the beginning of the generated implementation file (see SPI description), after the include-directive for the interface file.
C/C++ declarations which start with `%import' are inserted at the beginning of the generated interface file. For example, such C/C++ code may contain C/C++ definitions of predefined types which are used in field declarations of node types.
C/C++ declarations which start with `%export' are inserted at the end of the generated interface file. For example, such C/C++ code may contain definitions of external variables and functions which refer to the node type representation (see type `IR_node_t' in SPI description).
All C/C++ declarations can redefine all type specific and internal macros (see SPI) because they are placed in the implementation file. C/C++ declarations are placed in the same order as they appear in the section of declarations. C/C++ declarations from an IRD file with a smaller level number (see construction `%extend') are placed in the interface or implementation files first.
The section of declarations is followed by the section defining internal representation node types. An internal representation node type is described by the following construction
%abstract
IDENTIFIER :: IDENTIFIER (or %root)
CLASS FIELDS
SKELETON FIELDS
OTHER FIELDS
Keyword `%abstract' is optional. A node type description which starts
with this keyword denotes an abstract node type, i.e. nodes of such a
type do not exist in the internal representation of any source
program. Abstract node types serve only to describe fields common to
several node types.
The first identifier defines the name of the internal representation node type; the second defines the name of the node type whose field declarations are all inherited by the given node type. In this construction the second node type is the so-called immediate super-type of the first, and the first is an immediate sub-type of the second.
Node type A is a super-type of node type B (and node type B is a sub-type of node type A) iff node type A is the immediate super-type of node type B or of a super-type of node type B. All node types are sub-types of the implicitly declared node type with name `%root'. There are also nodes of a special type (error nodes). The type of these nodes is considered to be a sub-type of all declared node types. The definition of the type of error nodes is absent from any internal representation description.
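The sub-type relation described above can be sketched in C as a walk along the chain of immediate super-types. The node types `NT_EXPR', `NT_BINARY' and `NT_STMT', the table `immediate_super' and the function `is_subtype' are hypothetical examples, not names generated by SPI. The sketch assumes that the type of error nodes is also treated as a sub-type of `%root'.

```c
#include <stdbool.h>

/* Hypothetical node types; NT_ROOT stands for %root, NT_ERROR for error nodes. */
enum node_type { NT_ROOT, NT_EXPR, NT_BINARY, NT_STMT, NT_ERROR, NT_COUNT };

/* Immediate super-type of each type. The entries for NT_ROOT and NT_ERROR
   are placeholders: the root has no super-type and the error type is
   handled separately in is_subtype. */
static const enum node_type immediate_super[NT_COUNT] = {
    [NT_ROOT] = NT_ROOT,
    [NT_EXPR] = NT_ROOT,
    [NT_BINARY] = NT_EXPR,
    [NT_STMT] = NT_ROOT,
    [NT_ERROR] = NT_ROOT
};

/* True iff sub is a (proper) sub-type of super. */
bool is_subtype(enum node_type sub, enum node_type super) {
    if (sub == NT_ERROR)
        return super != NT_ERROR; /* sub-type of every declared type */
    if (super == NT_ERROR)
        return false;             /* nothing is a sub-type of the error type */
    while (sub != NT_ROOT) {
        sub = immediate_super[sub]; /* step to the immediate super-type */
        if (sub == super)
            return true;
    }
    return false;
}
```

Here `is_subtype(NT_BINARY, NT_ROOT)' holds through two steps, while `is_subtype(NT_BINARY, NT_STMT)' does not hold because the two types only share `%root'.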
The identifier of the immediate super-type (with `::') can be absent. In this case the construction is a continuation of the declaration of the given node type. There can be only one main declaration of a node type and any number of continuations. The main declaration and its continuations can appear in any order. A continuation cannot start with keyword `%abstract'.
A construction of the following kind
A, B, ... :: C
DECLARATIONS OF FIELDS
is an abbreviation of the following constructions
A :: C
DECLARATIONS OF FIELDS
B :: C
DECLARATIONS OF FIELDS
...
The fields are divided into three kinds.
%class LIST OF DECLARATIONS OF FIELDS
%skeleton LIST OF DECLARATIONS OF FIELDS
%other LIST OF DECLARATIONS OF FIELDS
Each of these keywords is followed by a possibly empty list of
declarations of fields. The list elements are field declarations or
target code. A field declaration is described by the following
construction:
IDENTIFIER : FIELD %double TYPE CONSTRAINTS ACTIONS
The identifier is the name of the given field. All field declarations
of one node type must have unique names. However, field declarations
of different node types can have the same name if the types of the
fields are the same, the fields are either all double linked or all
not double linked, and the fields are either all class fields or all
non-class fields. This feature makes it possible to model multiple
inheritance. The colon `:' is followed by the field type. There are
the following types of node fields:
The field type may be followed by constraints and actions in any order. A constraint is usually given for a node reference that must be non-null. A constraint is a boolean C/C++ expression in brackets `[' and `]'. The constraints are tested by generated functions in the same order as they appear in the corresponding node type declaration. An action can contain any C/C++ statements in braces `{' and `}' (the braces are also output, so actions may contain C/C++ declarations). The actions for skeleton and other fields are executed at node creation time in the same order as they appear in the corresponding node type declaration. The actions for class fields are executed at internal representation initialization time.
The constructions `$$' and `$' in constraints and actions denote the current node and the previous field, respectively. However, `$' is not set by the previous field if that field is of a different kind than the constraint or action (e.g. a field of skeleton kind and a constraint of class kind), or if no such field exists in the current node type declaration. Note also that `$' cannot be used on the left hand side of an assignment if it denotes a double linked field.
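The ordering of actions and constraints can be pictured with a minimal C sketch. It is an illustration only and does not reproduce the functions generated by SPI: the node type `expr_node', its fields, and the functions `create_expr' and `check_expr' are hypothetical. The sketch assumes a node type with two skeleton fields, each followed by a constraint of the form `[$ != NULL]', and one other field followed by an action of the form `{ $ = 0; }'.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type. */
struct expr_node {
    struct expr_node *left;   /* skeleton field */
    struct expr_node *right;  /* skeleton field */
    int use_count;            /* other field */
};

/* Node creation: the skeleton fields are set from the arguments, then the
   actions run in declaration order. Here `node' plays the role of `$$'
   and the field preceding an action plays the role of `$'. */
struct expr_node *create_expr(struct expr_node *left, struct expr_node *right) {
    struct expr_node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->left = left;
    node->right = right;
    node->use_count = 0;      /* action `{ $ = 0; }' after use_count */
    return node;
}

/* Constraint check: the constraints are tested in declaration order.
   Returns 0 if all hold, otherwise the 1-based number of the first
   violated constraint. */
int check_expr(const struct expr_node *node) {
    if (!(node->left != NULL))
        return 1;             /* `[$ != NULL]' after left */
    if (!(node->right != NULL))
        return 2;             /* `[$ != NULL]' after right */
    return 0;
}
```

A caller would create a node with `create_expr', fill in the graph, and then call `check_expr' to find the first constraint that does not hold.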
The construction `IDENTIFIER : FIELD TYPE' may be absent. This is convenient for defining additional constraints and actions for fields which are declared in a super-type of the given node type.
A construction of the kind
A, B, ... : C ...
is an abbreviation of the following constructions
A : C ...
B : C ...
The full YACC syntax of the internal representation description
language is given in Appendix 1.