A specification as described in the previous section is translated by MSTA into an optional interface file and an implementation file. They have the same name as the specification file, with the suffix `.h' for the interface file and `.c' (C code) or `.cpp' (C++ code) for the implementation file. By default the interface file is not generated.
The interface and implementation files contain the following generated macros, types, and functions. Unless special information for the MSTA scanner is given, MSTA scanner objects have the same meaning and the same names with an additional `s' or `S' after the prefix `yy' or `YY':
By default YYSTYPE is a macro whose value is the type used to represent the parser attributes. By default the macro is defined as `int'. You can redefine it by placing your own definition before the standard definition of the macro.
If a `%union' construction is present in the specification file, YYSTYPE is a type definition of a union containing the code written inside the `%union' construction.
The definition of YYSTYPE is placed in the interface file if option `-d' is given on the MSTA command line. Otherwise the definition is placed in the implementation file. YYSTYPE is a part of the YACC POSIX standard.
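The redefinition rule can be illustrated with a small sketch. It assumes that the standard definition is guarded by `#ifndef', which is the usual YACC convention but is not confirmed here for MSTA; the type `struct attr' and the helper `attr_size' are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical attribute type chosen by the user. */
struct attr {
  int line;      /* source line of the token */
  double value;  /* numeric value of the token, if any */
};

/* User definition: it must appear before the standard definition. */
#define YYSTYPE struct attr

/* Assumed shape of the standard definition in the generated code. */
#ifndef YYSTYPE
#define YYSTYPE int
#endif

/* The attribute variable now has the user type, not `int'. */
YYSTYPE yylval;

/* Hypothetical helper: reports the size of the attribute type in use. */
size_t attr_size(void)
{
  return sizeof(yylval);
}
```

Because the user definition comes first, the guarded default is skipped and `yylval' gets the type `struct attr'.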
The variable `yychar' contains the code of the current token. The current token is not necessarily the latest token read because MSTA can look far ahead. The codes are returned by the scanner function `yylex'. If option `-d' is present on the command line (see MSTA usage), an external definition of the variable is also placed in the interface file. The variable is a part of the YACC POSIX standard.
The variable `yylval' is used to exchange information between the parser and the scanner. The scanner must return the attribute of the latest token read in this variable. After that the variable contains the attribute of the current token, i.e. the token whose code is in the variable `yychar'. The variable `yylval' is declared with type YYSTYPE. If option `-d' is present on the command line (see MSTA usage), an external definition of the variable is also placed in the interface file. The variable is a part of the YACC POSIX standard.
The parser generated by MSTA contains code for diagnostics. Compilation of the run-time debugging code is controlled by the preprocessor symbol YYDEBUG. If YYDEBUG has a nonzero value, the debugging code is included. If its value is zero, the code is not included. The macro is a part of the YACC POSIX standard.
In a parser where the debugging code has been included (see macro YYDEBUG), the variable `yydebug' can be used to turn debugging on (nonzero value) and off (zero value) at run time. The initial value of `yydebug' is zero. If option `-d' is present on the command line (see MSTA usage), an external definition of the variable is also placed in the interface file. The variable is a part of the YACC POSIX standard.
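The two levels of control, compile time through YYDEBUG and run time through `yydebug', can be modelled as follows. This is only a sketch of the gating scheme, not the code MSTA generates; the function `trace_shift' is a hypothetical stand-in for the parser's diagnostic output.

```c
#include <stdio.h>

/* Compile-time switch: set to 0 to remove the debugging code entirely. */
#ifndef YYDEBUG
#define YYDEBUG 1
#endif

#if YYDEBUG
/* Run-time switch: initially zero, so tracing is off. */
int yydebug = 0;
#endif

/* Hypothetical trace point.  Returns 1 if a trace line was printed. */
int trace_shift(int token_code)
{
#if YYDEBUG
  if (yydebug) {
    fprintf(stderr, "shifting token %d\n", token_code);
    return 1;
  }
#endif
  (void) token_code;  /* unused when debugging is off */
  return 0;
}
```

With YYDEBUG nonzero, setting `yydebug' to 1 before parsing enables the trace; with YYDEBUG zero the trace code is not compiled at all.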
The function `yyparse' is the main function of the MSTA parser. It parses the token sequence whose codes are returned by the user-defined function `yylex' and whose attributes, if any, are placed in the variable `yylval'. The function returns 0 if the parser finished successfully. A nonzero return status means that the parser found unrecoverable errors (or that the macro YYABORT was executed explicitly). This function is a part of the YACC POSIX standard.
For the MSTA scanner this function has the name `yylex'. It scans the character (token in the terminology of the MSTA specification file) sequence whose codes are returned by the function `yyslex' and whose attributes, if any, are placed in the variable `yyslval'. The function returns 0 if the scanner finished successfully and reached the end of the input stream. A negative return status means that the scanner found unrecoverable errors (or that the macro YYSABORT was executed explicitly). The function can be called many times to get the next token. The code of the next token should be returned by `return' statements in the actions. The input stream (look-ahead characters) is preserved from one call of `yylex' to the next.
The function `yylex' is external to the MSTA parser. The user must provide it. Each call of the function should return the code of the next input token. If the end of input is reached, the function should return zero (-1 for `yyslex'). The attribute of the token whose code is returned should be passed through the variable `yylval'. In the case of the MSTA scanner, the function `yyparse' has the name `yylex'.
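A hand-written `yylex' that follows this contract might look like the sketch below. The token code NUMBER, its value 258, the input buffer, and the helper `set_input' are assumptions made for illustration; in a real parser the token codes come from the specification file.

```c
#include <ctype.h>
#include <stdlib.h>

#define NUMBER 258     /* hypothetical token code */
#define YYSTYPE int    /* default attribute type */

YYSTYPE yylval;        /* attribute of the token just returned */

static const char *input = "";  /* hypothetical input buffer */

/* Hypothetical helper: selects the string to be scanned. */
void set_input(const char *s)
{
  input = s;
}

/* Returns the code of the next token, or 0 at the end of input. */
int yylex(void)
{
  while (isspace((unsigned char) *input))
    input++;
  if (*input == '\0')
    return 0;                      /* end of input */
  if (isdigit((unsigned char) *input)) {
    char *end;
    yylval = (int) strtol(input, &end, 10);  /* attribute via yylval */
    input = end;
    return NUMBER;
  }
  return (unsigned char) *input++; /* single character as its own code */
}
```

For the input "12 + 7" successive calls return NUMBER (with `yylval' equal to 12), the code of `+', NUMBER (with `yylval' equal to 7), and finally zero.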
The function `yylex_start' is generated only for the MSTA scanner. The function should be used to initialize the scanner. A nonzero value returned through the parameter means that there was a memory allocation error for the scanner (this is a fatal error). The function is not a part of the YACC POSIX standard.
Its value is the code of the latest shifted token (character). Usually the value is used to form the internal representation of tokens (e.g. the internal representation of an identifier or the value of a number). The variable is not a part of the YACC POSIX standard.
The macro YYACCEPT causes the parser to return with the value zero. This indicates normal completion of parsing. The macro is a part of the YACC POSIX standard.
The macro YYABORT causes the parser to return with a nonzero value (1 for the MSTA parser and -1 for the macro YYSABORT of the MSTA scanner). This indicates abnormal termination of parsing. The macro is a part of the YACC POSIX standard.
When the parser detects a syntax error in its normal state, it normally calls the external function `yyerror' with a string argument whose value is defined by the macro YYERROR_MESSAGE. The user must provide the function `yyerror' in order to build the parser program. After that the parser switches to recovery mode. The parser is considered to be recovering from a previous error until it has shifted over at least YYERR_RECOVERY_MATCHES normal input tokens since the last error was detected, or until a semantic action has executed the macro `yyerrok'. The function is a part of the YACC POSIX standard.
Recovery mode consists of one or more steps. Each recovery step starts with a search for the topmost stack state in which a shift on the special symbol `error' is possible. This state becomes the top stack state, and the shift on `error' is made. After that the parser discards all tokens which cannot follow the symbol `error' in this state, until a token that can follow it (a so-called stop symbol) is met. After that any recognized syntax error results in a new error recovery step. This is the standard YACC error recovery technique. It may result in infinite looping of the parser or in discarding all input tokens if no stop symbol is met.
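The two phases of one recovery step can be sketched in isolation. The code below is a simplified model with hypothetical names and data layout (an array of flags instead of real parser tables); it is not taken from the generated parser.

```c
/* Phase 1: return the index of the topmost stack state in which a shift
   on `error' is possible, or -1 if there is no such state.
   shifts_error[i] is nonzero if state i of the stack can shift `error'. */
int find_error_state(const int *shifts_error, int depth)
{
  int i;

  for (i = depth - 1; i >= 0; i--)
    if (shifts_error[i])
      return i;
  return -1;
}

/* Phase 2: starting at position pos, discard tokens until a stop symbol
   is found.  Returns the position of the stop symbol, or n if the input
   is exhausted first (all remaining tokens are discarded). */
int skip_to_stop_symbol(const int *tokens, int n, int pos,
                        const int *stop, int n_stop)
{
  int j;

  for (; pos < n; pos++)
    for (j = 0; j < n_stop; j++)
      if (tokens[pos] == stop[j])
        return pos;
  return n;
}
```

The second function shows why all input may be discarded: if no stop symbol occurs in the remaining input, the scan runs to the end.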
By default MSTA generates the standard YACC error recovery. There are two additional error recovery methods which MSTA can generate.
The first one is a local error recovery, which does not permit infinite parser looping and uses the context after several errors as stop symbols. In this method the look-ahead set also includes the look-ahead tokens after the token `error' in states in which the `error' token is acceptable and which are lower in the parser stack than the first state with an acceptable token `error'. In this case feedback from the parser to the scanner may not work correctly: although rule actions are executed, the parser reads the tokens only once.
The second one is a minimal cost error recovery, where the cost is the overall number of tokens ignored. Feedback from the parser to the scanner does not work correctly, so you should not use this method when there is such feedback. Calling `yyerrok' is meaningless for this method because the parser in this recovery mode never executes the rule actions. This method gives the best quality of error recovery, although it may be expensive because in the worst case it might save all input tokens.
The macro value is used as the parameter of the function `yyerror' when a syntax error occurs. The default value of the macro is "syntax error" ("lexical error" for a scanner). You can redefine its value, but in any case the value must be a string. The macro is not a part of the YACC POSIX standard.
The parser is considered to be recovering from a previous error until it has shifted over at least YYERR_RECOVERY_MATCHES normal input tokens since the last error was detected, or until a semantic action has executed the macro `yyerrok'. The default value of the macro is 3. You can redefine its value, but in any case the value must be positive. This macro is not a part of the YACC POSIX standard.
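The rule for leaving recovery mode amounts to a small counter. The following model, with a hypothetical structure and function names, is meant only to illustrate the rule stated above and does not reproduce the generated parser.

```c
#define YYERR_RECOVERY_MATCHES 3  /* default value given in the text */

/* Hypothetical model of the recovery state; not MSTA's internal data. */
struct recovery {
  int shifts_left;  /* tokens still to be shifted before recovery ends */
};

/* An error was detected: (re)start counting from the last error. */
void model_error_detected(struct recovery *r)
{
  r->shifts_left = YYERR_RECOVERY_MATCHES;
}

/* A normal input token was shifted. */
void model_token_shifted(struct recovery *r)
{
  if (r->shifts_left > 0)
    r->shifts_left--;
}

/* A semantic action executed `yyerrok': recovery ends at once. */
void model_yyerrok(struct recovery *r)
{
  r->shifts_left = 0;
}

/* Nonzero while the parser is still considered to be recovering. */
int model_recovering(const struct recovery *r)
{
  return r->shifts_left > 0;
}
```

After an error the model stays in recovery for two shifts and leaves it on the third, or leaves it immediately when `yyerrok' is executed.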
This macro is generated only when the local error recovery mode is used. The default value is 7. The value cannot be less than 1. See the description below.
This macro is generated only when the local error recovery mode is used. The default value is 3. The value cannot be less than 0. See the description below.
This macro is generated only when the local error recovery mode is used. The default value is 2. The value cannot be less than 0. See the description below.
This macro is generated only when the local error recovery mode is used. The default value is 3. The value cannot be less than 0. See the description below.
This and the previous macros (YYERR_MAX_LOOK_AHEAD_CHARS - YYERR_DISCARDED_CHARS) are used only when the local error recovery is generated. Before describing the local error recovery, recall how YACC error recovery works. When the parser recognizes a syntax error, it switches into error recovery mode. Error recovery itself consists of one or more steps. Each step consists of finding the top state on the stack with a possible shift on the pseudo-token `error', discarding all states above that state, and making the shift on the `error' token. After that all tokens are discarded until a token that can follow the pseudo-token `error' (a so-called stop symbol) is read. After that any recognized error results in a new error recovery step. Finally, error recovery is switched off only when YYERR_RECOVERY_MATCHES (by default 3) tokens have been shifted without a syntax error occurring.
The local error recovery differs from the classic YACC error recovery in the following:
#define YYERR_END_RECOVERY() yyerr_end_recovery()
...
program :
        | program function
        ...
function : ...
         | error END FUNCTION
             {yyerror ("error in function");}
         ...
statement : ...
          | error
              {
                yyerror ("error in statement");
                ...
              }
          ...
expression : ...
           | error
               {
                 yyerror ("error in expression");
                 ...
               }
           ...
yyerror (char *s)
{
  /* save string s */
}
yyerr_end_recovery ()
{
  /* print last saved error message. */
}
Note that the action of the error rule for `function' does not use the
macro `yydeeper_error_try'. This guarantees that the whole
program will be processed.
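The two functions left as comments in the example above could be filled in along the following lines. This is one possible sketch, not the author's implementation: the buffer size, the counter, and the accessor functions `pending_message' and `printed_count' are assumptions, and the parameter is declared `const char *' rather than `char *'.

```c
#include <stdio.h>

static char saved_message[256];  /* last message passed to yyerror */
static int printed_messages;     /* hypothetical counter, for illustration */

/* Save the message instead of printing it at once. */
void yyerror(const char *s)
{
  snprintf(saved_message, sizeof saved_message, "%s", s);
}

/* Called through YYERR_END_RECOVERY(): print the last saved message. */
void yyerr_end_recovery(void)
{
  if (saved_message[0] != '\0') {
    fprintf(stderr, "%s\n", saved_message);
    saved_message[0] = '\0';     /* nothing pending any more */
    printed_messages++;
  }
}

/* Hypothetical accessors used to inspect the state. */
const char *pending_message(void)
{
  return saved_message;
}

int printed_count(void)
{
  return printed_messages;
}
```

If several error rules fire during one recovery, only the message saved last is printed when the parser returns to its normal state.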
The macro YYRECOVERING reports whether the parser is currently recovering from an error. The macro returns 1 if a syntax error has been detected and the parser has not yet fully recovered from it. Otherwise, zero is returned. The macro is a part of the YACC POSIX standard.
The parser detects a syntax error when it is in a state where the action associated with the lookahead symbol(s) is error. A semantic action can cause the parser to initiate error handling by executing the macro YYERROR. When YYERROR is executed, the semantic action passes control back to the parser. YYERROR can be placed only in the semantic action itself (not in a function called from the semantic action). The only difference between an error detected in the parser input and an error caused by the macro YYERROR is that the function `yyerror' is not called in the second case. The macro is a part of the YACC POSIX standard.
Actually this variable contains the number of times the parser has switched from the normal state to error recovery. The switch is caused by detecting an error in the input or by executing the macro YYERROR. The variable is not a part of the YACC POSIX standard. In the case of the MSTA scanner, the variable accumulates the number over all calls of `yylex'.
The macro `yyerrok' can be used only in a semantic action itself. The macro causes the parser to act as if it has fully recovered from any previous errors. The macro is a part of the YACC POSIX standard. The macro is meaningless for the minimal cost error recovery method because the parser in this recovery mode never executes the rule actions.
This macro is called when the parser switches from the recovery state into the normal state. By default the macro does nothing. You can redefine this macro, e.g. to output the last error buffered by your `yyerror' function in order to implement better error diagnostics in the local recovery mode. The macro is not a part of the YACC POSIX standard, and it is not generated when YACC error recovery is used.
The token `error' is reserved for error handling. The name `error' can be used in grammar rules. It indicates places where the parser can recover from a syntax error. The default value of `error' shall be 256. Its value can be changed using a %token declaration. In any case the code of the token `error' is the value of the macro YYERRCODE.
This macro causes the parser to discard the current lookahead token. If the current lookahead token has not yet been read, yyclearin has no effect. The macro is a part of the YACC POSIX standard.
MSTA uses memory allocation for the state and attribute stacks. Moreover, the stacks can be expandable (with the aid of YYREALLOC). The macro values are used for allocating, reallocating, and freeing the stack memory. The default values of the macros are the standard C functions malloc, realloc, and free. You can redefine these values. The macros are not a part of the YACC POSIX standard.
The macro value is the initial size of the state and attribute stacks of the parser. If a stack overflows, the macro YYABORT is executed when option `-no-expand' is used. Otherwise, the stacks are expanded. It is better to use left recursion in grammar rules in order to avoid stack overflow. The default value of the macro is 500. You can redefine this value. The macro is not a part of the YACC POSIX standard.
The macro value is the maximal size of the state and attribute stacks of the parser. The macro is used when the stacks are expandable. If a stack size would exceed this value (possibly after several stack expansions), the macro YYABORT is executed. Otherwise, the stacks are expanded. The default value of the macro is 5000. You can redefine this value. The macro is not a part of the YACC POSIX standard.
The macro value is the step of expansion of the state and attribute stacks. The macro is used only when the stacks are expandable. The default value of the macro is 100. You can redefine this value. The macro is not a part of the YACC POSIX standard.
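How the initial size, the expansion step, and the maximal size interact can be seen in a small model of an expandable stack. Everything except the macro names YYALLOC, YYREALLOC, YYFREE, YYSTACK_SIZE, and YYMAX_STACK_SIZE is hypothetical: the structure, the functions, and the name STACK_STEP, which merely stands in for the expansion step macro. A failed push corresponds to the point where the parser would execute YYABORT.

```c
#include <stdlib.h>

/* Default values described in the text. */
#define YYALLOC malloc
#define YYREALLOC realloc
#define YYFREE free
#define YYSTACK_SIZE 500        /* initial stack size */
#define YYMAX_STACK_SIZE 5000   /* maximal stack size */
#define STACK_STEP 100          /* hypothetical name for the expansion step */

/* Hypothetical model of a parser state stack. */
struct stack {
  int *data;
  size_t size;   /* allocated elements */
  size_t depth;  /* used elements */
};

/* Allocate the stack with its initial size.  Returns 0 on failure. */
int stack_init(struct stack *s)
{
  s->data = (int *) YYALLOC(YYSTACK_SIZE * sizeof(int));
  s->size = YYSTACK_SIZE;
  s->depth = 0;
  return s->data != NULL;
}

/* Push a state, expanding by one step when the stack is full.
   Returns 0 where the parser would execute YYABORT. */
int stack_push(struct stack *s, int state)
{
  if (s->depth == s->size) {
    size_t new_size = s->size + STACK_STEP;
    int *p;

    if (new_size > YYMAX_STACK_SIZE)
      return 0;                   /* maximal size would be exceeded */
    p = (int *) YYREALLOC(s->data, new_size * sizeof(int));
    if (p == NULL)
      return 0;                   /* no memory for the expansion */
    s->data = p;
    s->size = new_size;
  }
  s->data[s->depth++] = state;
  return 1;
}

/* Release the stack memory. */
void stack_free(struct stack *s)
{
  YYFREE(s->data);
  s->data = NULL;
  s->size = s->depth = 0;
}
```

With the default values the model grows from 500 to 5000 elements in 45 steps and refuses the push that would require a further expansion.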
This macro, defined as 1, is generated in order to distinguish a parser generated by MSTA from one generated by YACC or BISON. Naturally the macro is not a part of the YACC POSIX standard.
This macro returns a printable representation of the token with the given code. The macro is not a part of the YACC POSIX standard.
This macro value is the maximal code of tokens. The macro is not a part of the YACC POSIX standard.
The major advantage of C++ code is that it is quite easy to create many parsers of one language (and consequently a reentrant parser). This is useful for implementing languages with modules and languages with macro directives such as the C include directive.
Generated C++ code differs from C code in the following features:
yyscanner (int &)
replaces the function.
Usually the parser (scanner) itself is implemented as a subclass of the class `yyparser' (`yyscanner'). This subclass contains the definitions of the functions `yylex' (`yyslex') and `yyerror' (`yyserror').