8. Predeclared identifiers

Dino has quite a lot of predeclared identifiers. The section Declarations and Scope Rules lists them in alphabetical order. Here they are described according to the declaration category to which they belong.

8.1 Predeclared variables

Dino has some predeclared variables which contain useful information or can be used to control the behaviour of the Dino interpreter.

Arguments and environment

To access arguments to the program and the environment, the following variables can be used:

Version

As Dino is a living programming language, both the language and its interpreter are under continuous development. To access the version number of the Dino interpreter, and consequently the language version, the final variable version can be used. The value of the variable is the Dino version as a floating point number. For example, if the current Dino version is 0.54, the value of the variable is 0.54.
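A minimal sketch of how a program might check the interpreter version before relying on newer features. The threshold 0.54 is simply the version number used in the example above, and the warning text is only illustrative.

```dino
// Print the interpreter version (a floating point number).
putln ("Running Dino version ", version);

// Warn if the interpreter is older than the version this
// program was written for (0.54 here, purely as an example).
if (version < 0.54)
  putln ("warning: this program expects Dino 0.54 or later");
```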

Threads

To access some information about threads in a Dino program, the following variables can be used.

All these variables are final, so you cannot change their values.

Exceptions

When it is necessary to create an exception which is an object of a class declared inside the class except, or to refer to a class inside the class except, the following variables can be used. For example, instead of typing catch (except().signal().sigint), you can type catch (signals.sigint).

All these variables are final, so you cannot change their values.
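A minimal sketch of the shorter form in use. It assumes the usual Dino try/catch statement, which is not described in this section; the endless loop is only a placeholder for real work.

```dino
try {
  // Placeholder for long-running work; runs until interrupted.
  for (;;) {
  }
} catch (signals.sigint) {
  // Reached when the interrupt signal arrives during the loop.
  putln ("interrupted");
}
```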

Files

To output something into standard streams or to input something from the standard input stream, the following variables can be used:

All these variables are final, so you cannot change their values.

Miscellaneous variables

Values of the following variables are used by some predeclared functions:

8.2 Predeclared classes

Most of the predeclared classes describe exceptions which may be generated in a Dino program.

File

Dino has a predeclared final class file. Work with files in a Dino program is done through objects of this class. All declarations inside the class are private. Objects of the class can be created only by the predeclared functions open or popen. If you try to create an object of the class by calling the class, the exception callop is generated.

Exception classes

All Dino exceptions are represented by objects of the predeclared class except or of a class declared inside the class except. The class except has no parameters, therefore all arguments in a call of the class are ignored. There is one predeclared class error inside the class except. Classes corresponding to user-defined exceptions should be declared in the class except, not in the class error, because all other exceptions (e.g. those generated by the Dino interpreter itself or by predeclared functions) are objects of the class error or of predeclared classes inside the class error. The class error and all classes inside it have one parameter msg which contains a readable message about the exception. The following classes are declared in the class error:

Earley parser classes

Dino has the following three classes which are used by the Earley parser embedded into the Dino interpreter.

Parser.

Dino has a predeclared final class parser which implements the Earley parser. The Earley parser is a very powerful tool for implementing serious language compilers, processors, or translators. The implementation of the Earley parser used in Dino has the following features:

The following public functions and variables are declared in the class parser:

A call of the class parser itself can generate the exception pmemory if there is not enough memory for the internal parser data.

Token.

Dino has a predeclared class token. Objects of this class should be the input of the Earley parser (see function parse in class parser). The resulting abstract tree representing the translation will have the input tokens as leaves. The class token has one public variable code whose value should be the code of the corresponding terminal described in the grammar. You can extend the class description, e.g. by adding variables whose values are attributes of the token (e.g. source line number, name of an identifier, or value of a number).

Anode.

Dino has a predeclared class anode whose objects are nodes of the abstract tree representing the translation (see function parse of class parser). Objects of this class are generated by the Earley parser. The class has two public variables: name, whose value is a string representing the name of the abstract node as it is given in the grammar, and transl, whose value is an array with the abstract node fields as its elements. There are a few node types which have a special meaning:

Nil_anode and error_anode.

There is only one instance of anode which represents empty (nil) nodes. The same is true for error nodes. The final variables nil_anode and error_anode refer to these nodes respectively.

Example of Earley parser usage.

Let us write a program which transforms an expression into postfix (reverse Polish) form. Please read the program comments to understand what the code does. The program should output the string "abcda*+*+", which is the postfix form of the input string "a+b*(c+d*a)".

          // The following is the expression grammar:
          var grammar = "E : E '+' T   # plus (0 2)\n\
                           | T         # 0\n\
                           | error     # 0\n\
                         T : T '*' F   # mult (0 2)\n\
                           | F         # 0\n\
                         F : 'a'       # 0\n\
                           | 'b'       # 0\n\
                           | 'c'       # 0\n\
                           | 'd'       # 0\n\
                           | '(' E ')' # 1";
          // Create parser and set up grammar.
          var p = parser ();
          p.set_grammar (grammar, 1);

          // Add attribute repr to token:
          ext token { var repr; }
          // The following code forms input tokens from string:
          var str = "a+b*(c+d*a)";
          var i, inp = [#str : nil];
          for (i = 0; i < #str; i++) {
            inp [i] = token (str[i] + 0);
            inp [i].repr = str[i];
          }
          // The following function outputs messages about syntax errors
          // and syntax error recovery:
          func error (err_start, err_tok,
                      start_ignored_num, start_ignored_tok_attr,
                      start_recovered_num, start_recovered_tok) {
            put ("syntax error on token #", err_start,
                 " (" @ err_tok.code @ ")");
            putln (" -- ignore ", start_recovered_num - start_ignored_num,
                   " tokens starting with token #", start_ignored_num);
          }

          var root = p.parse (inp, error); // parse

          // Output translation in reverse Polish (postfix) form
          func pr (r) {
            var i, n = r.name;

            if (n == "$term")
              put (r.transl.repr);
            else if (n == "mult" || n == "plus") {
              for (i = 0; i < #r.transl; i++)
                pr (r.transl [i]);
              put (n == "mult" ? "*" : "+");
            }
            else if (n != "$error") {
              putln ("internal error");
              exit (1);
            }
          }

          pr (root);
          putln ();

8.3 Predeclared functions

The predeclared functions expect a given number of actual parameters (possibly a variable number). If the number of actual parameters is not the expected one, the exception parnumber is generated. The predeclared functions assume that the actual parameters (possibly after implicit conversions) are of the required type. If this is not true, the exception partype is generated. To show how many parameters a function requires, the descriptions of the functions give the names of the parameters and use brackets [ and ] around optional parameters.

Example: the following description

          strtime ([format [, time]])
means that the function accepts zero, one, or two parameters. If only one parameter is given, it is the parameter format.
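The three call forms allowed by this description could look as follows. This is only a sketch: it assumes that format uses strftime-like directives such as %Y and that a predeclared function time returns the current time. Neither detail is stated in this section.

```dino
// No parameters: default format, current time.
putln (strtime ());

// One parameter: it is taken as format.
putln (strtime ("%Y-%m-%d"));

// Two parameters: format and an explicit time value
// (time () is assumed to return the current time).
putln (strtime ("%H:%M:%S", time ()));
```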

If nothing is said about the returned result, the function returns the default value nil.

Mathematical functions

The following functions perform implicit arithmetic conversion of their parameters. After the conversions the parameters are expected to be of integer or floating point type. The result is always a floating point number.

Pattern matching functions

Dino has predeclared functions which are used for pattern matching. The pattern is described by a regular expression (regex) with the syntax of extended POSIX (1003.2) regular expressions, i.e. the pattern has the following syntax:

          Regex = Branch {"|" Branch}
A regex matches anything that matches one of the branches.
          Branch = {Piece}
A branch matches a match for the first piece, followed by a match for the second piece, etc. If the pieces are omitted, the branch matches the null string.
          Piece = Atom ["*" | "+" | "?" | Bound]

          Bound = "{" Min ["," [Max]] "}" | "{" "," Max "}"

          Min = <unsigned integer between 0 and 255 inclusive>

          Max = <unsigned integer between 0 and 255 inclusive>
An atom followed by * matches a sequence of 0 or more matches of the atom. An atom followed by + matches a sequence of 1 or more matches of the atom. An atom followed by ? matches a sequence of 0 or 1 matches of the atom.

There is a more general construction (a bound) for describing repetitions of an atom. An atom followed by a bound containing only one integer min matches a sequence of exactly min matches of the atom. An atom followed by a bound containing one integer min and a comma matches a sequence of min or more matches of the atom. An atom followed by a bound containing a comma and one integer Max matches at most Max repetitions of the atom. An atom followed by a bound containing two integers min and max matches a sequence of min through max (inclusive) matches of the atom.
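The repetition rules can be tried out with a small helper. This is a sketch which assumes a predeclared function match (regex, string) that returns nil when nothing matches; that function is not described in this section, and the helper name check is hypothetical.

```dino
// Report whether str matches regex (check is a helper for this example only).
func check (regex, str) {
  putln (regex, " against ", str, ": ",
         match (regex, str) == nil ? "no match" : "match");
}

check ("^ab*c$", "ac");         // * allows zero repetitions of b
check ("^ab+c$", "ac");         // + needs at least one b
check ("^ab?c$", "abbc");       // ? allows at most one b
check ("^ab{2}c$", "abbc");     // bound {2}: exactly two
check ("^ab{2,}c$", "abbbbc");  // bound {2,}: two or more
check ("^ab{,2}c$", "abbbc");   // bound {,2}: at most two
check ("^ab{1,3}c$", "abbc");   // bound {1,3}: one through three
```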

          Atom = "(" Regex ")"
               | "(" ")"
               | "."
               | "^"
               | "$"
               | BracketedList
               | "\^"
               | "\["
               | "\$"
               | "\("
               | "\)"
               | "\*"
               | "\+"
               | "\?"
               | "\{"
               | "\."
               | <any pair whose first character is \ and whose second is any
                  character except for ^.[$()|*+? >
               | <any character except for ^.[$()|*+? >
A regular expression enclosed in () can be an atom. In this case it matches a match for the regular expression in the parentheses. The atom () matches the null string. The atom . matches any single character. The atoms ^ and $ match the null string at the beginning of a line and the null string at the end of a line, respectively.

An atom which is \ followed by one of the characters ^.[$()|*+?{\ matches that character taken as an ordinary character. An atom which is \ followed by any other character matches the second character taken as an ordinary character, as if the \ had not been present. So you should use \\ to match a single \. An atom which is any other single character matches that character. It is illegal to end a regular expression with \. There is an exception which is not described by the atom syntax: a { followed by a character other than a digit or comma is an ordinary character, not the beginning of a bound, and matches the character {.
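Because \ is also the escape character in Dino string literals, a regex backslash has to be doubled when the pattern is written as a string. A sketch, assuming a predeclared function match (regex, string) that returns nil on failure (the function is not described in this section):

```dino
// The regex \+ (a literal plus sign) is written "\\+" in Dino source.
if (match ("1\\+1", "1+1=2") != nil)
  putln ("literal + matched");

// The regex \\ (a literal backslash) needs four backslashes in source;
// the subject string "a\\b" contains a single backslash.
if (match ("a\\\\b", "a\\b") != nil)
  putln ("literal backslash matched");
```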

          BracketedList = "[" List "]"

          List = FirstChar ["-" Char] {Char ["-" Char]}

          FirstChar = <any character except for ^ - and ]>
                    | CollatingElement

          Char = FirstChar
               | "^"

          CollatingElement = "[:" Class ":]"

          Class = "alnum"
                | "alpha"
                | "blank"
                | "cntrl"
                | "digit"
                | "graph"
                | "lower"
                | "print"
                | "punct"
                | "space"
                | "upper"
                | "xdigit"
An atom can be a bracket expression which is a list of characters enclosed in []. Usually it is used to match any single character from the list. If the list begins with ^, it matches any single character (but see below) not in the list. If two characters in the list are separated by -, this is shorthand for the full range of characters between those two (inclusive) in the collating sequence of ASCII codes, e.g. [0-9] matches any decimal digit. It is illegal for two ranges to share an endpoint, e.g. a-c-e.

There are exceptions which are not described by the atom syntax. To include a literal ] in the list, make it the first character (following a possible ^). To include a literal -, make it the first or the last character, or the second endpoint of a range. As you can see from the syntax, all special characters (except for [) described in an atom lose their special significance within a bracket expression.
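A few bracket expressions illustrating these rules. As before, this is a sketch that assumes a predeclared function match (regex, string) returning nil when there is no match.

```dino
// [0-9] is a range: any decimal digit.
if (match ("^[0-9]+$", "2024") != nil)
  putln ("only digits");

// A leading ^ negates the list: any character that is not a digit.
if (match ("^[^0-9]+$", "abc") != nil)
  putln ("no digits");

// ] placed first in the list is a literal ].
if (match ("^[]x]$", "]") != nil)
  putln ("literal ]");

// - placed last in the list is a literal -.
if (match ("^[a-]$", "-") != nil)
  putln ("literal -");
```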

A collating element is a name of a character class enclosed in [: and :]. It denotes the list of all characters belonging to that class. Standard character class names are:

       alnum       digit       punct
       alpha       graph       space
       blank       lower       upper
       cntrl       print       xdigit
These names stand for the character classes defined in the ANSI C include file ctype.h. There is an exception not described by the syntax: a character class cannot be used as an endpoint of a range.

Regular expressions used by Dino have one extension relative to those defined in POSIX 1003.2: no particular limit is imposed on the length of the regular expression.

There are the following Dino pattern matching functions:

If the regular expression is incorrect, the functions generate one of the following predeclared exceptions (see predeclared classes):

File functions

Dino has some predeclared functions to work on files and directories.

Functions for access to file/directory information

The following predeclared functions can be used for accessing file or directory information. The functions may generate an exception declared in the class syserror (e.g. eaccess, enametoolong, enfile and so on) in addition to the standard partype and parnumber. The functions expect one parameter which should be a file instance (see the predeclared class file) or the path name of a file represented by a string (the functions perform implicit string conversion of the parameter). The single exception to this is isatty, which expects a file instance.

The following functions can be used to change the access rights of a file (directory) for different users. Each function expects two strings (after implicit string conversion). The first one is the path name of the file (directory). The second one describes the rights. For instance, if the string contains the character 'r', it denotes the right to read (see the characters used to denote different rights in the description of the function fumode). The functions always return the value nil.

Functions for work with directories

The following functions work with directories. The functions may generate an exception declared in the class syserror (e.g. eaccess, enametoolong, enotdir and so on) in addition to the standard partype and parnumber.

Functions for work with files.

The following functions (besides the input/output functions) work with OS files. The functions may generate an exception declared in the class syserror (e.g. eaccess, enametoolong, eisdir and so on) in addition to the standard partype and parnumber. The function rename can be used for renaming a directory, not only a file.

File output functions

The following functions are used to output something into opened files. All the functions always return the value nil. The functions may generate an exception declared in the class syserror (e.g. eio, enospc and so on) in addition to the standard partype and parnumber.

File input functions

The following functions are used to input something from opened files. All the functions always return the value nil. The functions may generate an exception declared in the class syserror (e.g. eio, enospc and so on) or eof, in addition to the standard partype and parnumber.

Time functions

The following functions can be used to get information about real time.

Functions for access to process information

There are Dino predeclared functions which are used to get information about the current OS process (the Dino interpreter which executes the program). Each OS process has a unique identifier. Usually an OS process is started by a particular user and group and is executed on behalf of a particular user and group (the so-called effective identifiers). The following functions return such information. On some OSes the functions may return the string "Unknown" as a name if there are no notions of user and group identifiers.

Miscellaneous functions

There are the following miscellaneous functions:

