
2. Keywords description language

A keywords description mainly describes the keywords (or standard identifiers) of a language and, optionally, the frequencies of the keywords in a typical program. The frequencies are used to generate a pruned O-trie for faster keyword recognition. The default action, which returns the code of the recognized keyword, can be replaced by a user-defined action, e.g. one returning a pointer to a structure describing the given keyword.

2.1 Layout of keywords description

A keywords description has the following layout, which is similar to that of a YACC file.

     DECLARATIONS
     %%
     KEYWORDS
     %%
     ADDITIONAL C/C++ CODE
The `%%' separates the sections of the description. All sections are optional. The first `%%' starts the section of keywords and is obligatory even if that section is empty; the second `%%' may be absent if the section of additional C/C++ code is absent too.

The section of declarations contains an optional declaration of the type returned by the generated keyword recognition function. If the type is given, then actions should be given for all keywords, and the construction `%other' with an action should be given as well. Otherwise the function returns the codes of recognized keywords (of type `int' or `enum' -- see SHILKA Usage). The section of declarations may also contain subsections of C/C++ code.

The next section contains the list of keywords. Each entry describes the keyword itself and, optionally, the code name, frequency, and action of the keyword.

The additional C/C++ code can contain any C/C++ code you want to use. Functions which are not generated by the translator but are related to keyword recognition often go here. This code is placed unchanged at the end of the file generated by the translator.

The full YACC syntax of the SHILKA description file is given in Appendix 1.

2.2 Declarations

The section of declarations may contain the following construction:

     %type IDENTIFIER
Only the first such construction is taken into account. All subsequent `%type' constructions are ignored.

Such a construction declares the type of the data returned by the function `KR_find_keyword'. E.g. the function can return a pointer to a structure which contains additional information about the recognized keyword. By default the function returns only the code of the recognized keyword (of integer type or enumeration -- see SHILKA Usage). If the construction `%type' is present, you should also put the construction `%other' in the section of keywords and give each keyword definition an action returning data of the declared type. Otherwise SHILKA will generate warning messages.

The section of declarations may also contain the following constructions

     %local {
        C/C++ DECLARATIONS
     }

     %import {
        C/C++ DECLARATION
     }

     and

     %export {
        C/C++ DECLARATION
     }
which contain arbitrary C/C++ declarations (types, variables, macros, and so on) used in the sections of the description.

The local C/C++ declarations are inserted at the beginning of the generated implementation file (see section `generated code') but after the include-directive for the interface file (if present -- see SHILKA Usage).

C/C++ declarations which start with `%import' are inserted at the beginning of the generated interface file. If the interface file is not generated, the code is inserted at the beginning of the part of the implementation file which would correspond to the interface file.

C/C++ declarations which start with `%export' are inserted at the end of the generated interface file. For example, such exported C/C++ code may contain definitions of external variables and functions which refer to definitions generated by SHILKA. If the interface file is not generated, the code is inserted at the end of the part of the implementation file which would correspond to the interface file.

All C/C++ declarations are placed in the same order as in the section of declarations.

2.3 Keywords

The section of declarations is followed by the section of keyword definitions. Usually a keyword definition is described by the following construction

     IDENTIFIER = IDENTIFIER frequency action
The identifiers, the frequency, and the action are explained in the following paragraphs.

Here the first identifier is the keyword itself. The second identifier determines the keyword name, which is generated as a macro definition or enumeration constant (see SHILKA Usage) for the given keyword. The name is formed from the prefix `KR_' and the second identifier. The second identifier together with `=' may be absent. In this case the keyword name is formed from the prefix `KR_' and the keyword itself (i.e. the first identifier).
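
As a rough illustration of the naming rule, assume two hypothetical keyword definitions, `if = IF' and `while'. Under the rule above their names would be `KR_IF' and `KR_while'. The sketch below writes these names by hand as an enumeration. The enumeration tag, the numeric values, and the helper `keyword_code_name' are assumptions made for the example and are not SHILKA output.

```c
/* Hand-written stand-in for the names that would be generated from the
   hypothetical definitions
       if = IF
       while
   The real numeric values are chosen by SHILKA; the ones here are assumed. */
enum keyword_code_sketch {
    KR_IF,      /* prefix KR_ + second identifier IF */
    KR_while    /* no `= IDENTIFIER' part: prefix KR_ + the keyword itself */
};

/* Hypothetical helper: maps a code back to its name, e.g. for diagnostics. */
const char *keyword_code_name(int code)
{
    switch (code) {
    case KR_IF:    return "KR_IF";
    case KR_while: return "KR_while";
    default:       return "unknown";
    }
}
```

Giving an explicit second identifier is mostly a matter of taste here: it lets the generated names follow the upper-case convention commonly used for token codes.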

The optional frequency of the keyword's occurrence in a program (in some abstract units) is an integer value and is used for generation of a minimal cost pruned O-trie. If the frequency is absent, its value is taken to be 1.
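
To see why frequencies can matter, consider four hypothetical keywords with frequencies `if' 100, `int' 40, `else' 30, and `enum' 5. The hand-written sketch below is not SHILKA output. It only illustrates the general idea of a pruned trie that may test character positions in any order and finishes with one full comparison, and it assumes that the cost being minimized is the frequency-weighted number of character tests. Testing position 1 first isolates `if' and `else' after a single test (cost 100*1 + 30*1 + 40*2 + 5*2 = 220), whereas testing position 0 first would need two tests for every keyword (cost 175*2 = 350). The names `find_keyword_sketch' and `confirm' are invented for the example.

```c
#include <assert.h>
#include <string.h>

enum { KR_IF, KR_INT, KR_ELSE, KR_ENUM, KR__not_found };

/* Final full comparison: pruning stops testing single characters as soon
   as one candidate is left, so that candidate still has to be verified. */
static int confirm(const char *s, int len, const char *kw, int code)
{
    return ((int) strlen(kw) == len && memcmp(s, kw, (size_t) len) == 0)
               ? code : KR__not_found;
}

int find_keyword_sketch(const char *s, int len)
{
    assert(s != NULL && len >= 0);
    if (len < 2)                  /* every keyword here has at least 2 chars */
        return KR__not_found;
    switch (s[1]) {               /* position 1 is tested first */
    case 'f': return confirm(s, len, "if", KR_IF);      /* after 1 test */
    case 'l': return confirm(s, len, "else", KR_ELSE);  /* after 1 test */
    case 'n':                     /* `int' or `enum': one more test */
        switch (s[0]) {
        case 'i': return confirm(s, len, "int", KR_INT);
        case 'e': return confirm(s, len, "enum", KR_ENUM);
        }
        break;
    }
    return KR__not_found;
}
```

With all frequencies left at the default value 1, the same tree would still be the cheaper one in this tiny example; frequencies start to matter when different test orders favor different keywords.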

The optional action (any C/C++ block) is executed when the keyword is recognized. Usually the action only returns a pointer to the structure describing the recognized keyword. The default action simply returns the keyword name, whose value is the keyword code defined as a macro or enumeration constant. If you are going to return data distinct from the keyword code, you should use the declaration `%type' (see above) and define actions for all keywords and for the special keyword definition `%other' (see below). Otherwise SHILKA will generate warning messages.

An identifier in SHILKA is the same as one in C. If you also want to recognize strings which are not SHILKA identifiers, you should use the following construction

     STRING = IDENTIFIER frequency action
The string here is a C string. The identifier after `=' is obligatory in such a construction. Everything else is the same as in the previous construction.
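
A plausible reason for the obligatory identifier is that a string such as "+=" cannot be appended to `KR_' to form a valid C name. The small hand-written sketch below assumes two hypothetical definitions, `"+=" = PLUS_ASSIGN' and `"->" = ARROW'. The function name and the enumeration values are likewise assumptions and not SHILKA output.

```c
#include <assert.h>
#include <string.h>

/* Names corresponding to the hypothetical definitions
       "+=" = PLUS_ASSIGN
       "->" = ARROW
   The numeric values are assumed for the example. */
enum { KR_PLUS_ASSIGN, KR_ARROW, KR__not_found };

/* Hypothetical recognizer for the two strings above. */
int find_operator_sketch(const char *s, int len)
{
    assert(s != NULL && len >= 0);
    if (len == 2 && memcmp(s, "+=", 2) == 0)
        return KR_PLUS_ASSIGN;
    if (len == 2 && memcmp(s, "->", 2) == 0)
        return KR_ARROW;
    return KR__not_found;   /* nothing matched */
}
```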

The following construction determines an optional action which is executed when no keyword is recognized.

     %other action
The default action only returns the special code `KR__not_found'. If you are going to return data distinct from the keyword code, you should probably return the value NULL here.

Only the first such construction is taken into account. All subsequent such constructions are ignored.
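
The combination of `%type', per-keyword actions, and `%other' can be pictured with a hand-written sketch. It assumes a hypothetical description that declares `%type keyword_info_ptr', gives the keywords `if' and `int' actions such as `{return &if_info;}', and ends with `%other {return NULL;}'. Since `%type' takes a single identifier, the pointer type is wrapped in a typedef. All type, variable, and function names below are invented for the example, and the function body is not what SHILKA generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure describing one keyword. */
typedef struct keyword_info {
    const char *spelling;
    int is_type_name;        /* 1 if the keyword names a type */
} keyword_info;

/* One identifier for the pointer type, usable after `%type'. */
typedef const keyword_info *keyword_info_ptr;

static const keyword_info if_info  = { "if", 0 };
static const keyword_info int_info = { "int", 1 };

/* Stand-in for a recognizer whose actions return structure pointers. */
keyword_info_ptr find_keyword_info_sketch(const char *s, int len)
{
    assert(s != NULL && len >= 0);
    if (len == 2 && memcmp(s, "if", 2) == 0)
        return &if_info;     /* plays the role of the action for `if' */
    if (len == 3 && memcmp(s, "int", 3) == 0)
        return &int_info;    /* plays the role of the action for `int' */
    return NULL;             /* plays the role of the `%other' action */
}
```

A caller then only has to compare the result with NULL to find out whether a keyword was recognized, without consulting any keyword codes.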

