CHAPTER 1: Language Syntax

The SPP language is based on the Ratfor language. Ratfor, in turn, is based on Fortran, with extensions for structured control flow, etc. The lexical form, operators, and control flow constructs are identical to those provided by Ratfor. The major differences are the data types, the form of a procedure, the addition of inline strings and character constants, the use of square brackets for arrays, and the task statement. In addition, the SPP I/O facilities provided are quite different and are tailored to the IRAF environment.

The syntax of the SPP language is fairly straightforward and fundamentally similar to most other high-level languages. While it is based on the Ratfor language, there are elements of C as well as elements of Fortran.

SPP is a preprocessed language. That is, there is no SPP compiler per se; instead, SPP source is translated into another compilable language. In fact, SPP is first translated into Ratfor, which is processed into Fortran. The xc compiler performs all preprocessing, compilation, and linkage.

This chapter describes the language in detail. Chapter 2 describes the procedure libraries available to connect a program to the outside world, and Chapter 4 describes how to compile an application as well as how it fits into the IRAF environment. Appendix B presents some basic examples and hints for writing real software.

Lexical Form

An SPP program consists of a sequence of lines of text. The length of a line is arbitrary, but SPP is guaranteed to handle only lines of up to 160 characters. The end of each line is marked by a "newline" character.

Character set

SPP uses the extended ASCII character set, which includes the characters listed in Table 1.1.

Characters        Type
---------------------------------------------
a-z               All lower case letters
A-Z               All upper case letters
0-9               All digits
# _ &, etc.       Special characters
[tab], [space]    White space
---------------------------------------------
Table 1.1: SPP Character Set.

Some of these may be used in identifier names and numeric constants. The remaining ones have specific meaning within the language. SPP does not distinguish between lower case and upper case except for literal strings (inside double quotes). Any character may be used in a literal string. The specific meaning of special characters is described in the appropriate section.

White Space

White space is defined as one or more tabs or spaces. A newline normally marks the end of a statement, and is not considered to be white space. White space always delimits tokens, the smallest recognized elements of the language. Keywords and operators will not be recognized as such if they contain embedded white space. However, the absolute amount of white space is not relevant and there is no enforced structure of text on the line. Indentation and judicious use of white space greatly improve readability. Note, however, that spaces, including trailing blanks, are significant in literal quoted strings such as text to be written to standard output.

Comments

Comments begin with the # character and end at the end of the line. That is, anything after a # is ignored by the preprocessor until the next end of line. Thus, in-line comments may follow SPP statements.

Continuation

Statements may span several lines. A line that ends with an operator (excluding /) or punctuation character (comma or semicolon) is automatically understood to be continued on the following line.

Constants

SPP supports several types of constants. These are described below. (Predefined constants are described in Appendix A.)

Integer Constants

An integer constant is a sequence of one or more of the digits in the range 0 through 9. An octal constant is a sequence of one or more of the digits in the range 0 through 7, followed by the letter b or B.
A hexadecimal constant is one of the digits in the range 0 through 9, followed by zero or more of the digits 0 through 9, the letters in the range a through f, or the letters A through F, followed by the letter x or X. Note that a hexadecimal constant must begin with a decimal digit (zero through nine) to distinguish it from an identifier. The notation shown in Table 1.2 summarizes these definitions more concisely.

Integer Type   Definition                      Examples
---------------------------------------------------------------------------
Decimal        [+|-][0-9]+                     42, -999, 0
Octal          [+|-][0-7]+[b|B]                42b, 777B
Hexadecimal    [+|-][0-9][0-9a-fA-F]*[x|X]     0ffx, 0123ABCx
---------------------------------------------------------------------------
Table 1.2: Integer Constant Notation.

In the notation used above, + means one or more, * means zero or more, - implies a range, and | means "or". Brackets ([...]) define a class of characters. Thus, "[0-9]+" reads "one or more of the characters in the range 0 through 9."

An integer constant has the same range as the underlying Fortran constant. Since this changes from machine to machine, SPP has the predefined constant MAX_INT as the maximum allowable integer (see Appendix A).

Floating Point Constants

A floating point constant (type real or double) consists of a decimal integer, optionally preceded by a sign (+ or -), followed by a decimal point, optionally followed by a decimal fraction, optionally followed by an exponent: one of the characters e, E, d, or D and a decimal integer, which may be negative. Either the decimal integer or the decimal fraction part must be present. The number must contain either the decimal point or the exponent (or both). Embedded white space is not permitted. The following are all legal floating point numbers:

    .01, 100., 100.01, 1E5, 1e-5, -1.00D5, 1.0d0
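The notation in Table 1.2 and the floating point rules above translate fairly directly into regular expressions. The following Python sketch is illustrative only and is not part of SPP or xc. The function names parse_integer_constant and is_float_constant are hypothetical, and the exponent is assumed to be optional, as the legal examples suggest.

```python
import re

# Patterns transcribed from Table 1.2 (the optional sign follows the table).
OCTAL = re.compile(r"[+-]?[0-7]+[bB]")
HEX = re.compile(r"[+-]?[0-9][0-9a-fA-F]*[xX]")
DECIMAL = re.compile(r"[+-]?[0-9]+")

# Assumed reading of the floating point rules: integer and/or fraction part,
# then an optional exponent introduced by e, E, d, or D.
FLOAT = re.compile(
    r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)(?:[eEdD][+-]?[0-9]+)?"
)


def parse_integer_constant(text):
    """Return the value of an integer constant, or None if it is not one."""
    if OCTAL.fullmatch(text):
        return int(text[:-1], 8)    # drop the trailing b or B
    if HEX.fullmatch(text):
        return int(text[:-1], 16)   # drop the trailing x or X
    if DECIMAL.fullmatch(text):
        return int(text)
    return None


def is_float_constant(text):
    """True if text has the floating point form described in the text."""
    if not FLOAT.fullmatch(text):
        return False
    # A decimal point or an exponent (or both) must be present.
    return "." in text or any(c in text for c in "eEdD")
```

Under these assumptions, parse_integer_constant("0ffx") yields 255, while "ffx" is rejected because it does not begin with a decimal digit and would therefore be read as an identifier.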
A complex constant consists of two floating point constants separated by a comma and enclosed in parentheses, representing the real and imaginary parts, (1.0,0.0) for example.

A floating constant may also be given in sexagesimal, i.e., in hours and minutes, or in hours, minutes, and seconds, or any other units in which places of the number vary by a factor of sixty. Numerical fields are separated by colon characters (:) and there must be either two or three fields. The number of decimal digits in the second field and in the integer part of the third field is limited to exactly two. The decimal point and any fraction are optional. The low level procedures that parse input recognize this syntax as well, making it convenient for users to enter values in a natural format (time or equatorial coordinates).

Coordinate     Floating Point
-----------------------------------------
00:01          0.017
00:00:01       0.00028
01:00:00       1.0
01:00:00.00    1.0
01:30.7        1.5116
-----------------------------------------
Table 1.3: Coordinate and Floating Point Equivalents.

The last example has only two fields, with the last including a fraction. These two fields are then the largest and next largest fields, such as hours and minutes of time or degrees and minutes of arc. Note that there may be some problems in rounding, however.

The predefined constants MAX_REAL and MAX_DOUBLE contain the host-dependent maximum permissible values for real and double constants, respectively.

Character Constants

A character constant consists of from one to four characters delimited at front and rear by the single quote (') (as opposed to the double quotes used to delimit string constants). A character constant is numerically equivalent to the corresponding decimal integer, and may be used wherever an integer constant would be used. On most systems, characters are represented in ASCII, so the character values are the ASCII values.

Character Constant   Decimal Value   Interpretation
------------------------------------------------------------
'\007'               7               The integer 7, CTRL G (BEL)
'a'                  97              The character a
'\n'                 10              The newline character
'\\'                 92              The character \
------------------------------------------------------------
Table 1.4: Character Constants.

The backslash character (\) is used to form escape sequences, which are special non-printed characters. SPP recognizes the following escape sequences:

Escape   Interpretation     Decimal Value   Control Sequence   ASCII Mnemonic
----------------------------------------------------------------------
\b       Backspace          8               CTRL H             BS
\f       Form feed          12              CTRL L             FF
\n       Newline            10              CTRL J             LF
\r       Carriage return    13              CTRL M             CR
\t       Horizontal tab     9               CTRL I             HT
----------------------------------------------------------------------
Table 1.5: Character Constant Escape Sequences.

String Constants

A string constant is a sequence of characters enclosed in double quotes ("), "image" for example. The double quote itself may be included in the string by escaping it with a backslash ("abc\"xyz"). All of the escape sequences given above are recognized. The backslash character itself must be escaped to be included in the string.

A string constant may not span lines of text. For example,

    call strcpy ("This is a long character string with
    an embedded newline.", outstr, SZ_LINE)

would result in the error "Newline while processing string." However, you may include a newline in a string explicitly with the newline character, for example:

    call strcpy ("A string\nwith a newline.", outstr, SZ_LINE)

Identifiers

An identifier is the name used to refer to a variable or a procedure. Identifiers are constructed of an upper or lower case letter, followed by zero or more upper or lower case letters, digits, or the underscore character. Identifiers may be as long as desired, but only the first five characters and the last character are significant.
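The significance rule can be made concrete with a short model. This Python sketch is not the preprocessor's actual code; the name map_identifier is hypothetical, and it assumes, as the Identifiers section goes on to describe, that underscores are removed before the first five characters and the last character are taken. It does not model case folding or the digit substitution used to resolve conflicts.

```python
def map_identifier(name):
    """Model the mapping of an SPP identifier to a six-character name.

    Assumption: underscores are dropped first, then the first five
    characters and the last character are kept.
    """
    stripped = name.replace("_", "")
    if len(stripped) <= 6:
        # Short names already fit and are kept whole.
        return stripped
    return stripped[:5] + stripped[-1]


# Two distinct identifiers that collapse to the same mapped name.
first = map_identifier("longname_first")
second = map_identifier("longname_test")
```

Here both first and second come out as "longnt", which is the kind of conflict the preprocessor has to resolve by altering one of the mapped names.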
Identifiers are used for variable names and procedure names, including built-in, intrinsic functions, as well as other language constructs. SPP maps all identifiers to a Fortran identifier that conforms to Fortran 66 standards. That is, they must be six characters or fewer and may not include underscores. SPP performs the mapping by first removing underscores and then taking up to the first five characters and the last character. If there is a conflict between two SPP identifiers that map to the same Fortran identifier, the last character of the mapped name is replaced with a digit in one of the names. It may be instructive to see the mappings. The mapped SPP and Fortran identifiers are listed as comments in the Fortran output by xc (using the -f option) at the end of the translated source.

The definition of an identifier may be summarized using the following rule:

    [a-zA-Z][a-zA-Z_0-9]*

See "Constants" on page 3 for an explanation of the syntax of this shorthand. The following example illustrates valid and invalid SPP identifiers:

Figure 1.1: Identifier Syntax.

Note that the last two map to the same Fortran variable. Therefore, if they were in the same source file, SPP would change the mapping of one to make them unique.

The identifiers in Figure 1.2 are reserved. That is, do not use them as variable or procedure names. Note that not all of them are actually used at present.

Figure 1.2: Reserved Identifiers.

Fortran statements

Fortran statements may be used in SPP source by preceding the statement with a percent character, %. The xc compiler then passes this statement through unchanged. Remember that Fortran, unlike SPP, does require specific positioning of the text on the line. So you must include the necessary spaces between the % escape character and the beginning of the Fortran statement.
For example:

    # Fortran follows, note
    # 6 spaces after %
    %      INTEGER INTF

Also keep in mind that while most SPP data types are the same as Fortran, character strings are not. See "Calling Fortran Subprograms" on page 38 and "Fortran Strings" on page 125 for more details.

Data Types

The subset preprocessor language supports a fairly wide range of data types. The actual mapping of an SPP data type into a Fortran data type depends on what the target compiler has to offer. SPP supports the usual fundamental data types: integer, floating point, complex, boolean, and character. Some of these have more than one subtype, varying by the size of each value. The actual size in bytes of a particular data type depends on the host system. IRAF maintains a structure containing these definitions, available to the applications programmer.

Declaration   Data Type                   Fortran Equivalent
-----------------------------------------------------------------------
bool          Boolean                     LOGICAL
char          Character                   short INTEGER
short         Short integer               short INTEGER
int           Integer                     INTEGER
long          Long integer                long INTEGER
real          Single precision floating   REAL
double        Double precision floating   DOUBLE PRECISION
complex       Single precision complex    COMPLEX
char[]        String (character array)    short INTEGER array
pointer       Pointer to memory           INTEGER
extern        External function           EXTERNAL
-----------------------------------------------------------------------
Table 1.6: Data Types.

Note that the size of the variable depends on its hardware implementation, which in turn depends on the combination of the Fortran compiler and the host operating system. For example, in VAX Fortran, short integers are implemented as INTEGER*2, including char and strings (char arrays), and long integers are implemented as INTEGER*4, which is the same size (four bytes) as INTEGER, by default.

In addition to the seven primitive data types, the SPP language provides the abstract type pointer.
The SPP language makes no distinction between pointers to different types of objects, unlike more strongly typed languages such as C. The extern type is also available to declare a function as a variable, as in the Fortran EXTERNAL statement.

Integer

SPP has three signed integer data types. There is no byte or unsigned integer data type.

o short - The smallest integer type, usually two bytes.
o int - A signed integer having the size of the fundamental host system word size, usually 32 bits or four bytes. This is equivalent to the Fortran INTEGER declaration.
o long - The largest integer type, usually the same as int.

Character

The char data type belongs to the family of integer data types, i.e., a char variable or array behaves like an integer variable or array. The char and short data types are signed integers (i.e., they may take on negative values).

String

A string is an array of type char terminated by an end of string character (EOS). Strings may contain only character data (values 0 through 127 decimal), and must be delimited by EOS. A character string may be declared in either of two ways, depending on whether initialization is desired:

    char input_file[SZ_FNAME]
    string legal_codes "efgdox"
    char x[15]

The preprocessor automatically adds one to the declared array size, to allow space for the EOS marker. However, the space used by the EOS marker is not considered part of the string. Thus, the char array x[15] will contain 16 elements: space for up to 15 characters, plus the EOS marker.

It is probably a good idea to use an odd number for the string size declaration so that the resulting array contains an even number of elements. Since char is implemented as a Fortran integer type (see Table 1.6), this permits alignment of strings on long word boundaries, a long word being the size of a Fortran INTEGER, usually four bytes.
Access to memory is usually more efficient if variables are placed to match the addressable pieces.

Note that the string value need not fill the declared size. The EOS character signals the end of the string. This is in contrast to Fortran strings, which do not include a terminator character and thus have an implicit size equal to the declared size and are padded with trailing blanks to the string length. SPP strings are instead practically identical to strings in C. Therefore, it is not possible to directly call a Fortran subroutine that expects a string in the calling sequence. However, there are procedures that convert between SPP and Fortran strings. (See "Calling Fortran Subprograms" on page 38.) Note that in most procedures that take a string argument, there is also an argument that specifies the maximum string size. See Chapter 2 for specific library procedures.

Floating point

Floating point variables may be single precision (real), double precision (double), or complex (complex) and behave as the equivalent Fortran floating point variables.

o real - A single precision value equivalent to the Fortran REAL data type.
o double - A double precision floating point value, equivalent to the Fortran DOUBLE PRECISION data type.
o complex - A pair of single precision floating point values equivalent to the Fortran COMPLEX data type.

Boolean

The only permissible values for a boolean variable are true and false. Boolean variables are used as flags or in the test expressions of constructs such as if and while. Note the distinction between boolean variables and the integer constant parameters YES and NO; the latter are sometimes used as flags.

Pointer

Pointers are used to reference dynamically allocated memory. See "Memory Allocation - memio" on page 53 for a more complete discussion of dynamically allocated memory.
More abstractly, pointers may be used to reference "structures," allocated memory with a particular arrangement of variables of differing data types and having a specific structure in memory.

Declarations

All SPP variables must be declared. This includes scalars and arrays, as well as functions. All declarations must precede the body of the procedure. That is, they must be between the procedure statement and the begin statement. Although the language does not require that procedure arguments be declared before local variables and functions, it is customary and a good practice. The syntax of a type declaration is the same for parameters, variables, and procedures.

    type_spec object [, object [,... ]]

Here, type_spec may be any of the seven fundamental data types, a derived type such as pointer, or extern. A list of one or more data objects follows. An object may be a variable, array, or procedure. The declaration for each type of object has a unique syntax, as follows:

    procedure    identifier()
    variable     identifier
    array        identifier[dimension_list]

Note that all declaration statements must begin at the first character of the line. That is, there may be no white space between the beginning of the line and the beginning of the declaration.

Scalar Variables

Scalar variables are declared with the data type statements and the name of the variable. For example:

    int rows        # Number of rows
    int cols        # Number of columns
    real x, y       # Coordinates
    bool verbose    # Print verbose output?

Customarily, most variables are described by an in-line comment.

Arrays

Arrays are declared similarly to scalars, with the array size appended to the variable name and enclosed in square brackets ([ and ]). The sizes of each dimension are separated by commas within the brackets.

    type_spec object[dim[,dim,... ]]

Note that here the outer square brackets are required; the inner ones represent optional multiple dimensions. Arrays may have up to seven dimensions and are one-indexed by default.
That is, the first element is numbered one. Multiply dimensioned arrays are ordered such that the leftmost dimensions vary the fastest, as in Fortran arrays. Arrays are referenced using the variable name with the element number(s) in square brackets ([]). As many dimensions must be used in the reference as in the declaration. It is not permitted to address an array outside its declared bounds, but such references are not detected by the compiler. The following examples illustrate how to declare subscripted variables in SPP:

Example 1.1: Declaring Subscripted Variables.

The last example declares image to be 100 by 100 elements in size. The first element would be specified as image[1,1], followed by image[2,1], image[3,1], ... image[1,2], image[2,2], ... image[100,100].

The size of each dimension of an array may be specified by any compile time constant expression, or by an integer parameter or parameters, if the array is a formal parameter to the procedure. If the array is declared as a formal procedure argument and the size of the highest (rightmost, or most slowly varying) dimension is unknown, the size of that dimension should be given as ARB (for arbitrary). The declared dimensionality of an array passed as a formal parameter to a procedure may be less than or equal to the actual dimensionality of the array. For example, the following example declares several arrays and uses some of them as arguments to functions.

Example 1.2: Declaring Arrays and Using as Arguments to Functions.

Note that the integer array intarr is declared as two-dimensional but referenced in the procedure as one-dimensional. The short array 3darray is declared as three-dimensional in both the calling and called procedure. However, in the called procedure, the last dimension is declared as ARB, while the others are declared with passed arguments. The lower dimensions must be declared explicitly in order for the function to compute the index of the elements.
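The storage order described above explains why only the last dimension may be left as ARB: locating an element requires the size of every dimension except the rightmost one. The following Python sketch models that index arithmetic for one-indexed, leftmost-fastest arrays. The function name element_offset is hypothetical, and the sketch illustrates the ordering rule rather than the code generated by the preprocessor.

```python
def element_offset(subscripts, lower_dims):
    """Zero-based storage offset of a one-indexed array element.

    subscripts -- one index per dimension, leftmost first
    lower_dims -- declared sizes of all dimensions except the last
                  (the last may be ARB, so it is never needed)
    """
    if len(lower_dims) != len(subscripts) - 1:
        raise ValueError("need the size of every dimension but the last")
    offset = 0
    stride = 1
    for position, subscript in enumerate(subscripts):
        offset += (subscript - 1) * stride
        if position < len(lower_dims):
            # Each lower dimension multiplies the step of the next one.
            stride *= lower_dims[position]
    return offset
```

For the 100 by 100 image discussed above, element_offset((2, 1), (100,)) is 1 and element_offset((1, 2), (100,)) is 100, matching the order image[1,1], image[2,1], ..., image[1,2].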
It is highly recommended to use defined (macro) constants instead of absolute constants to declare array sizes. This makes maintenance much easier in that the value is declared only once. If the constant is defined outside of a procedure, then any procedure in the same file may access the same constant, eliminating the need to pass a dimension to the functions. In addition, if the constants are defined in an include file they are available to procedures in more than one file.

Functions

External functions, whether supplied by the programmer or part of a library package, must be declared in a manner similar to variables. This does not include intrinsic functions such as sin(), abs(), etc. (see "Intrinsic Functions" on page 36). Functions may be declared to be any valid SPP data type. For example, if the program includes a real valued function named myfunc, its declaration and invocation might appear as in Example 1.3.

Example 1.3: Invoking External Functions.

External Functions

The extern data type declares a variable as a function. The name of the function may then be passed as an actual argument in a procedure call. In the formal procedure (dummy) arguments, the same argument must also be declared extern.

Example 1.4: Declaring and Using the extern Data Type.

Common

Global common provides a means for sharing data between separately compiled procedures. The common statement is a declaration, and must be used only in the declarations section of a procedure. Each procedure referencing the same common must declare that common in the same way.

    common /identifier/ object [, object [, ... ]]

For example,

    common /vfnxtn/ nextn, iraf, os, map

To avoid the possibility of two procedures declaring the same common area differently, the common declaration should be placed in an include file (see "Include Files" on page 39). This makes maintenance considerably easier and more reliable, since a change cannot be made in one procedure without also reaching the others.

Initialization

The data Statement

Local variables, arrays, and character strings may be initialized at compile time with the data statement. Data in a global common may not be initialized at compile time. If initialization of data in a global common is required, it must be done at run time by an initialization procedure. The syntax of the data statement is defined identically to the standard Fortran 77 DATA statement. Some simple examples follow.

    real x, y[2]
    char ch[2]
    data x/0/, y/1.0,2.0/, ch/'a','b',EOS/

Any data statements must follow all declarations. Note that variables initialized by data are not guaranteed to have that value except the first time the task is executed from the cl. IRAF tasks executed from the cl may be cached or stored in the process cache. That is, they are not restarted from the main procedure except the first time they are executed and after the process cache is flushed (using the cl task flprcache). Therefore, a variable modified in a task procedure will not have the initialized value the next time the task is executed, but will have the modified value. It is always safer to initialize variables with symbolic constants (macro define statements) or explicit assignment statements.

The string Statement

Character strings may be declared and initialized with the string statement. This consists of the keyword string followed by the identifier name, followed by the initialization value enclosed in double quotes. Note that there is no explicit string size. A char array is implicitly declared with the size of the initialization string.

    string errmsg "Could not open input"

Macro Definitions

An SPP macro assigns a symbol or identifier to arbitrary text, implementing string substitution. This enables any piece of code to be hidden by using its defined symbol rather than the text itself. Upon precompilation, the macro symbol is replaced by its assigned text.
The primary uses of macros are defining symbolic constants, such as mathematical constants whose value will not change at run time, implementing in-line or statement functions, and creating data structures. Macro definitions allow hiding certain information and can do much to enhance the ease of modifying and maintaining a program. By convention, the names of macros are upper case, to distinguish the names from variables, functions, and other identifiers and to make it clear that a macro is being used.

Macros are created by using the define command. If the macro is defined after the procedure statement, it must be defined before the begin statement, and only that procedure may use it. That is, its scope is within a single procedure. If a macro is defined before the procedure statement, it is available to any procedure in the source file. Macros that are shared by several procedures should be defined in an include file, particularly if the source is in different files (see "Include Files" on page 39).

Macros may or may not have arguments. An argument is declared in a macro definition by using a dollar character ($) and a numeral indicating the argument number. In the macro invocation, arguments are passed in parentheses, (). Multiple arguments are separated by commas. Macros without arguments are used primarily to turn explicit constants into symbolic parameters. Examples are shown throughout this text. Macros with arguments are used as statement functions and data structure elements. Macros incorporating expressions should be enclosed in parentheses to ensure that the expression is evaluated with the intended precedence.

Macro definitions may not include string constants. You may use the string statement to declare string constants. All other types of constants, constant expressions, and array and procedure references are allowed, however.

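Because a macro is pure text substitution, the advice about parentheses is easiest to see by performing the substitution by hand. The following Python sketch models only the replacement of $1, $2, ... by argument text; it is not the SPP preprocessor, and the function name expand_macro and the two macro bodies are hypothetical.

```python
import re


def expand_macro(body, args):
    """Replace $1, $2, ... in a macro body with the argument text."""
    def substitute(match):
        # Argument numbers in the body start at 1.
        return args[int(match.group(1)) - 1]
    return re.sub(r"\$([0-9]+)", substitute, body)


# Hypothetical macro bodies for squaring an argument.
bare_body = "$1*$1"
safe_body = "(($1)*($1))"

bare_text = expand_macro(bare_body, ["a+b"])   # text becomes a+b*a+b
safe_text = expand_macro(safe_body, ["a+b"])   # text becomes ((a+b)*(a+b))

# Evaluate both expansions with a = 2 and b = 3 to compare the results.
values = {"a": 2, "b": 3}
bare_value = eval(bare_text, {}, values)
safe_value = eval(safe_text, {}, values)
```

With these values the unparenthesized body evaluates to 11, because the multiplication binds before the additions, while the parenthesized body gives the intended 25.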
The domain of definition of a macro extends from the line following the macro to the end of the file (except for include files). Macros may be recursive, and may be redefined without any warning from the compiler.

Macro definitions are frequently shared among procedures in several source files by putting them in an include file. This is another source file, but it has the extension .h and is included in any source by using the include statement (see "Include Files" on page 39). There are many examples of macro definitions and structures using them in the IRAF sources, both the system code and the applications. Look in the lib$ and hlib$ directories for the include files for the IRAF system. In addition, each applications package usually contains one or more header include files containing numerous examples.

Symbolic Constants

Constants may be declared as variables, initialized with an assignment statement or by using a data statement. Alternatively, a symbolic constant may be declared as a macro, using a define statement. Each time the macro is used in the code, its name is replaced by the text specified in the define statement when the code is compiled. No data storage is allocated and no assignment is executed at run time. It becomes easy to change the value of a constant by changing it once in the define statement rather than throughout the code. The meaning of the code frequently becomes clearer by referring to constants by name (PI) rather than by value (3.14159). There are many constants defined automatically as well as several include files available defining many frequently used constants. See Appendix A for a description of these. The following example illustrates the use of macros as symbolic constants:

Example 1.5: Using Symbolic Constants.

Data Structures

A data structure allows a set of variables to be treated as a group. These may include variables of different data types, arrays, strings, pointers, etc.
See "Data Structures" on page 58 for more details and additional examples.

Example 1.6: Using Data Structures.

In this example the macros define a simple structure that permits a different way of using an array. Instead of accessing the array by numeric element numbers, it permits a different name to be defined for each array element that may contain inherently different entities. The array coeff[] is redefined as a simple structure containing the fields I_TYPE, I_NPIX, ..., and I_COEFF. Defining a structure enhances the readability of a program by permitting reference to the fields of the structure by name, rather than the array element (coeff[2]), and furthermore makes it easier to modify the structure. The same code could be written without using macros, referencing coeff as elements of the array or declaring the equivalent elements as separate variables. Note that parentheses are used to refer to elements of the structure, as opposed to square brackets, which refer to array elements.

The equivalent implementation without using macros would use an array and reference the elements of the array by their number. This simple example is straightforward. However, for a complicated example, it is usually much clearer to refer to disparate entities by name rather than by an array element.

Example 1.7: Implementing Example 1.6 with Array Elements.

The same result may be accomplished by using a common block, as is shown in the next example.

Example 1.8: Implementing Example 1.6 with Common Blocks.

Of course, any other procedure using the variables in the common block would have to declare it identically. If you do use common, put it and the associated variable declarations in an include file so there is only one place where the declarations need to be modified.

It is possible to define a structure containing any data type. The types int, real, bool, and pointer are guaranteed to be the same length, a single word in memory.
A common method of declaring a structure is to use dynamically allocated memory, referring to the structure elements using the Mem[] syntax (see "Memory Allocation - memio" on page 53). In this case, you need not explicitly specify a different offset for each data type. For types which may differ in size, however, you must be able to refer to the correct offset and size of a particular structure element. This applies to short, long, double, complex, and particularly to char and elements treated as arrays. Note that these should be aligned on long word boundaries. The convention is to declare the variables in the order of longest first to shortest last, with character strings declared last. There are system defined macros for aiding in the conversion of pointers to these data types:

Macro   Converts to Type
-----------------------------
P2X     complex
P2D     double
P2L     long
P2S     short
P2C     char
-----------------------------
Table 1.7: System Macros for Converting Pointers.

The P2T macros permit you to address the next structure element without worrying too much about the word size. These are defined in hlib$iraf.h since they depend on the host architecture.

The following example declares a structure containing several different data types and some constants. The difference between this and the previous example is that the memory containing the structure is allocated dynamically instead of using a statically allocated array. This additionally permits multiple instances of the structure to be defined. This is the way many packages handle internal parameters. For example, each time an image is opened using immap(), a structure is allocated containing parameters pertaining to the image. Multiple images may be opened, each having associated parameters organized using the same structure.

Example 1.9: Structure Elements Defined in myincl.h.

Note that even though the P2T macros take care of the offsets into the Mem[] arrays, you still need to keep in mind the size of each structure element to find the offset to the next one. Thus, DVAL is offset by two from XVAL since a complex is two words. However, adjacent fields have consecutive offsets ($1, $1+1, ...) if they occupy a single word. Note also the use of a second argument in IARRAY to specify the array element, the position within the chunk of the allocated memory.

The above structure definition would be used by first allocating memory for the structure and accessing each field using the returned structure pointer, as shown in Example 1.10.

Example 1.10: Allocating and Using Structures by Pointer.

Another way to define arrays or character strings in a macro structure is to store only a pointer to dynamically allocated memory in a field of the structure. In this case, the memory for the array has to be allocated explicitly in the code in addition to the memory for the structure.

Example 1.11: Defining Arrays in a Structure with Dynamically Allocated Memory.

Macro Functions

Macros with arguments may also be used to define in-line functions. For example, here are a couple of definitions of character classes from the system include lib$ctype.h:

Example 1.12: Macro Definitions.

These are used in the following:

Example 1.13: Using Macro Functions.

Control Flow

SPP provides a full set of the control flow constructs found in most modern languages, such as conditional execution and repetition. Some of these have already appeared in examples.

An SPP control flow construct executes a statement either conditionally or repetitively. The statement to be executed may be a simple one line statement, a compound statement enclosed in curly brackets or braces, or the null statement (; on a line by itself). An assortment of repetitive constructs is provided for convenience.
The simplest constructs are while, which tests at the top of the loop, and repeat...until, which tests at the bottom. The do construct is convenient for simple sequential operations on arrays. The most general repetitive construct is the for statement.

o Conditional constructs
  - if
  - if...else
  - switch
  - case
o Repetitive constructs
  - do
  - for
  - repeat...until
  - while
o Branching
  - break
  - next
  - goto
  - return

Two statements are provided to interrupt the flow of control through one of the repetitive constructs. The break statement causes an immediate exit from the loop, by jumping to the statement following the loop. The next statement shifts control to the next iteration of a loop. If break and next are embedded in a conditional construct which is in turn embedded in a repetitive construct, it is the outer repetitive construct which will determine the point to which control is shifted.

Note that formatting in the form of indentation and white space is not mandatory, but makes the code more readable and therefore easier to maintain.

if...else

The if and if...else constructs are shown below. The expr part may be any boolean expression (see "Expressions" on page 31). The statement part may be a simple statement, a compound statement enclosed in braces, or the null statement. The statement(s) will be executed if the expression resolves to true. Otherwise, control falls through to the next block consisting of an else or else if.

    if (expr)
        statement
    [else if (expr)
        statement]
    [else
        statement]

The control flow constructs may be nested indefinitely. There may be an if clause without an else or else if. There is no end if. A simple example of an if ... else if ... else is:

Example 1.14: Using if..else.

switch...case

The switch...case construct evaluates an integer expression once, then branches to the matching case. Each case must be a unique integer constant. The maximum number of cases is limited only by table space within the compiler.
A case may consist of a single integer constant, or a list of integer constants, separated by commas and terminated by the colon character (:). The special case default, if included, is selected if the switch value does not match any of the other cases. If the switch value does not match any case, and there is no default case, control passes to the statement following the body of the switch statement. After the statements of the selected case have been executed, control passes to the statement following the switch. A break statement is not needed after each case (in contrast to the switch ... case statement in C).

Each case of the switch statement may consist of an arbitrary number of statements, which do not have to be enclosed in braces. The body of the switch statement, however, must be enclosed in braces as shown below.

    switch (expr) {
    case list:
        statements
    [case list:
        statements]
    .
    .
    [default:
        statements]
    }

For example:

Example 1.15: Using switch and case.

The switch construct will execute most efficiently if the cases form a monotonically increasing sequence without large gaps between the cases (i.e., case 1, case 2, case 3, etc.). Ideally, the cases should be defined parameters or character constants, rather than explicit numbers.

while

The while statement repetitively executes a statement or a block of statements as long as the specified condition expression is true. The condition is tested at the beginning of the loop, so it is possible for the statement not to be executed at all.

    while (expr)
        statement

repeat...until

The repeat construct repetitively executes a statement or a block of statements. The simpler form repeats forever, so the statement block would normally include a break statement to terminate the loop. The repeat...until form executes the statement as long as the logical expression in the until statement is false. The condition is tested at the end of the loop, so the statement will always be executed at least once.

    repeat
        statement

    repeat
        statement
    until (expr)

for

The for construct consists of an initialization part, a test part, a loop control part, and a statement to be executed. The initialization part consists of a statement which is executed once before entering the loop. The test part is a boolean expression, which is tested before each iteration of the loop. The loop control statement is executed after the last statement in the body of the for, before branching to the test at the beginning of the loop. When used in a for statement, next causes a branch to the loop control statement.

The for construct is very general, because of the lack of restrictions on the type of initialization and loop control statements chosen. Any or all of the three parts of the for may be omitted, but the semicolon delimiters must be present. Only one statement is permitted for each control section, unlike C.

    for (init; test; control)
        statement

For example:

Example 1.16: Using for.

This for statement searches the string str backwards until the character 'z' is encountered, or until the beginning of the string is reached. Note the use of the null statement (;) in the body of the for, since everything has already been done in the for itself. The strlen procedure is shown in a later example. Note that the above example may result in an error if the string is null, in which case ip = 0 and the test str[ip] != 'z' will try to access a character before the beginning of the string.

do

The do construct is a special case of the for construct. It is ideal for simple array operations, and since it is implemented with the Fortran DO statement, its use should result in particularly efficient code.

    do lcp = initial, final [, step]
        statement

General expressions are permitted as loop control in the do statement but their results must be integers. The loop may run forward or backward, with any step size.
Note that to operate backward, the step must be negative, and the initial value should be larger than the final value. The body of the do will not be executed if the initial value of the loop control parameter satisfies the termination condition. For example:

Example 1.17: Using do.

break

The break statement causes an immediate exit from a loop by jumping to the statement following the loop.

next

The next statement immediately shifts control to the next iteration of a loop.

return

The return statement assigns a value to a function or returns control to the calling procedure. This value is passed back to the calling procedure as the function value. The returned value is an expression which resolves to the declared data type of the function. For example:

Example 1.18: Using the return Statement.

goto

The goto statement unconditionally branches to another point in a procedure. The target statement is specified by a label, which is an integer constant at the beginning of a line, preceding an executable (unnumbered) statement. For example:

Example 1.19: Using the goto Statement.

Alternatively, the label may be assigned a symbolic value using the define statement. This permits more mnemonic labels.

Example 1.20: Using Symbolic Values with goto Statements.

The underscore at the end of the label (termin_ in the example above) is not required, but is a recommended convention to permit the labels to stand out as distinct from other identifiers.

Expressions

An expression may be a numeric constant, a string constant, an array reference, a call to a typed (function) procedure, or any combination of the above elements, in combination with one or more unary or binary operators. Every expression is characterized by a data type and a value. The data type is fixed at compile time, but the value may be either fixed at compile time, or calculated at run time. Parentheses may be used to force the compiler to evaluate the parts of an expression in a certain order.
In the absence of parentheses, the precedence of an operator determines the order of evaluation of an expression. The highest precedence operators are evaluated first. The precedence of the SPP operators is defined by the order in which the operators appear in the table under heading "Data Types" on page 8. Procedure call has the highest precedence.

The argument list in a procedure or array reference consists of a list of general expressions separated by commas. If an expression contains calls to two or more procedures, the order in which the procedures are evaluated is undefined.

Operators

SPP supports the usual arithmetic operators which take operands of any numeric data type. In addition there are the usual comparison operators which take operands of any data type with the data type of the result always boolean. Finally, there are boolean operators taking boolean operands and also resulting in a boolean.

Operator   Operands   Result    Operation
------------------------------------------------------------------------
+          Numeric    Numeric   Add
-          Numeric    Numeric   Subtract, negate
*          Numeric    Numeric   Multiply
/          Numeric    Numeric   Divide
**         Numeric    Numeric   Power
<          Numeric    Boolean   Less than
<=         Numeric    Boolean   Less than or equal to
>          Numeric    Boolean   Greater than
>=         Numeric    Boolean   Greater than or equal to
==         Numeric    Boolean   Equal to
!=         Numeric    Boolean   Not equal to
!          Boolean    Boolean   Not
||         Boolean    Boolean   Or
&&         Boolean    Boolean   And
|                               Reserved operator
&                               Reserved operator
------------------------------------------------------------------------
Table 1.8: Arithmetic and Boolean Operators.

Minus (-) may be a binary operator (two operands) or a unary operator (one operand). As a binary operator it represents subtraction and as a unary operator it represents negation. The boolean not (!) is always a unary operator.

Mixed Mode Expressions

Binary operators combine two expressions into a single expression.
If the two input expressions are of different data types, the expression is said to be a mixed mode expression. The data type of a mixed mode expression is defined by the order in which the types of the two input expressions appear in the table under "Data Types" on page 8. The data types are listed in the table in order of increasing precedence. Thus, the data type which appears furthest down in this table will be the data type of the combined expression. For example, an int plus a real produces a real. Mixed mode expressions involving bool are illegal. While char expressions are permitted, there are no string operators or expressions since there is no fundamental string data type.

Type Coercion

Type coercion refers to the conversion of an object from one data type to another. Such conversions may involve loss of information, and hence are not always reversible. Type coercion occurs automatically in mixed mode expressions, and in assignment statements. Type coercion is not permitted between booleans and the other data types.

Data Type   Contains
----------------------------------------------------------------
aimag       Imaginary part of complex
complex     Complex
double      Double precision floating point
int         Integer
real        Single precision floating point
----------------------------------------------------------------
Table 1.9: Data Type Precedence.

The data type of an expression may be coerced by a call to an intrinsic function. The names of these intrinsic functions are the same as the names of the data types. Thus, int(x), where x is of type real, coerces x to type int, while double(x) produces a double precision result.

The Assignment Statement

The assignment statement assigns the value of the general expression on the right side to the variable or array element given on the left side. Automatic type coercion will occur during the assignment if necessary (and legal). Multiple assignments may not be made in a single assignment statement.
That is, an assignment statement may have only one equal sign. However, a line may contain more than one statement, separated by semicolons (;).

Example 1.21: Assignment Expressions.

Procedures

Procedures are the basic units of SPP programs. They also include functions, which are procedures that return a value. The form of a procedure declaration is shown below.

    [data_type] procedure proc_name ([p1 [, p2 [,... ]]])

    [declarations for procedure arguments]
    [declarations for local variables]
    [declarations for functions]
    [initialization]

    begin
        [executable statements]
    end

The data_type field must be included if the procedure returns a value. The begin keyword separates the declarations section from the executable body of the procedure, and is required. The end keyword must follow the last executable statement. Note that the procedure statement and the declaration statements must begin at the first character of the line.

All parameters, variables, and typed procedures must be declared. The SPP language does not permit implicit typing of parameters, variables, or procedures, unlike Fortran. By convention, declarations of procedure arguments precede local declarations. It is also good practice to use in-line comments to describe the declarations.

If a procedure has formal parameters, they should agree in both number and type in the procedure declaration and when the procedure is called. In particular, beware of short or char parameters in argument lists. An int may be passed as a parameter to a procedure expecting a short integer on some machines, but this usage is not portable, and is not detected by the compiler. The compiler does not verify that a procedure is declared and used consistently.

If a procedure returns a value it is known as a function and the calling program must declare the procedure in a type declaration, and must reference the procedure in an expression.
The function procedure must contain a return which assigns the value to pass back to the caller as the function value. A function procedure may return a numerical value, but may not return an array or string.

If a procedure does not return a value, the calling program may reference the procedure only in a call statement. However, the return statement may be used to end the procedure at any point and return control to the calling procedure.

begin...end

The executable statements in a procedure must be surrounded by begin and end statements. All declarations must be placed between the procedure statement and the begin.

{...}

Braces ({ and }) may be used to explicitly bracket groups of statements intended to be treated as a single statement, for example, in if, for, or while constructs.

Arguments

Formal or dummy arguments and actual arguments must match in number and type. That is, the declarations in the calling and called procedure must be the same for all of the arguments.

entry Statement

Procedures with multiple entry points are permitted in SPP because they provide an alternative to global common when several procedures must access the same data. The multiple entry point mechanism is similar to block structuring. The multiple entry point construct is only useful for small problems. If the problem grows too large, an enormous procedure with many entry points may result, which is difficult to maintain.

The form of a procedure with multiple entry points is shown below. Either all entry points should be untyped, as in the example, or all entry points should return values of the same type. Control should only flow forward. Each entry point should be terminated by a return statement, or by a goto to a common section of code which all entry points share. The shared section of code should be terminated by a single return which all entry points share.

Example 1.22: Using the entry Statement.
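A minimal sketch of an untyped procedure with two entry points is given here, assuming a small integer stack as the shared data. The names push, pop, and LEN_STACK and the error codes are hypothetical, and the sketch relies on the local variables keeping their values between calls.

```spp
define  LEN_STACK   100             # assumed stack depth

# PUSH -- push a value on the stack. The stack and its pointer are local
# variables shared by both entry points, so no global common is needed.

procedure push (datum)

int     datum                       # u: value pushed or popped
int     stack[LEN_STACK]            # the stack
int     sp                          # stack pointer
data    sp /0/

begin
        if (sp >= LEN_STACK)
            call error (1, "stack overflow")
        sp = sp + 1
        stack[sp] = datum
        return

# POP -- pop the top value off the stack.

entry   pop (datum)
        if (sp < 1)
            call error (2, "stack underflow")
        datum = stack[sp]
        sp = sp - 1
        return
end
```

Both entry points are untyped and each ends with its own return, so control only flows forward, as recommended above.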

Intrinsic Functions

Any function written as part of the task must be declared. However, SPP includes several intrinsic functions that need not be declared. The intrinsic functions are generic functions, meaning that the same function name may be used regardless of the data type of the arguments. The arguments to trigonometric functions are assumed to be in radians, as in Fortran.

Function       Description
------------------------------------------------------------------------
abs(a)         Absolute value                          |a|
acos(a)        Arccosine, returns angle in radians     cos^-1 a
asin(a)        Arcsine, returns angle in radians       sin^-1 a
atan(a)        Arctangent, returns angle in radians    tan^-1 a
atan2(a, b)    Arctangent, returns angle in radians    tan^-1 (a/b)
char(a)        Convert to character
complex(a,b)   Complex from real and imaginary parts
conjg(a)       Complex conjugate
cos(a)         Cosine, argument in radians
cosh(a)        Hyperbolic cosine, argument in radians
double(a)      Convert to double precision
exp(a)         Exponential                             e^a
int(a)         Convert to integer, truncate
log(a)         Natural logarithm
log10(a)       Common logarithm
long(a)        Convert to long integer
max(a, b)      Maximum
min(a, b)      Minimum
mod(a, b)      Modulus or remainder                    a - [a/b]*b
nint(a)        Nearest integer
real(a)        Convert to single precision
short(a)       Convert to short integer
sin(a)         Sine, argument in radians
sinh(a)        Hyperbolic sine, argument in radians
sqrt(a)        Square root
tan(a)         Tangent, argument in radians
tanh(a)        Hyperbolic tangent, argument in radians
------------------------------------------------------------------------
Table 1.10: Intrinsic Functions.

Note that the names of the type coercion functions (char, short, int, real, etc.) are the same as the names of the data types in declaration statements. The functions log10, tan, and the hyperbolic functions may not be called with complex arguments.
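As a short illustration of intrinsic functions used without declaration, the following sketch converts Cartesian coordinates to polar form. The procedure name polar and its arguments are hypothetical; only the intrinsic functions sqrt, atan2, and nint are taken from Table 1.10.

```spp
# POLAR -- convert Cartesian coordinates to polar form. The intrinsic
# functions are not declared. Assumes x and y are not both zero.

procedure polar (x, y, r, theta, ir)

real    x, y                        # i: Cartesian coordinates
real    r                           # o: radius
real    theta                       # o: angle in radians
int     ir                          # o: radius rounded to nearest integer

begin
        r = sqrt (x**2 + y**2)
        theta = atan2 (y, x)
        ir = nint (r)
end
```

Because the intrinsic functions are generic, the same calls would be written unchanged if x, y, r, and theta were declared double.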

Calling Fortran Subprograms

Since SPP is preprocessed into Fortran, in most cases it is quite straightforward to call an existing Fortran subroutine from an SPP procedure. The most important caution is in using character strings. SPP strings are not the same as Fortran strings. SPP strings are implemented as arrays of integers. However, there are procedures available to transform between the two: f77pak() converts an SPP string to a Fortran string, and f77upk() converts a Fortran string to an SPP string. Note that you must declare the Fortran string in the SPP procedure with a Fortran statement. This is possible with the % escape as the first character on a line. This indicates to the xc compiler that the following statement should not be processed but copied directly to the Fortran code. See also "Expressions" on page 31 and "Fortran Strings" on page 125.

Program Structure

An SPP source file may contain any number of procedure declarations, zero or one task statements, any number of define or include statements, and any number of help text segments. By convention, global definitions and include file references should appear at the beginning of the file, followed by the task statement, if any, and the procedure declarations.

Example 1.23: Program Structure.

Include Files

Include files permit an external file to be inserted into SPP code. They are referenced at the beginning of a file to include global definitions that must be shared among separately compiled files, and within procedures to reference common block definitions. Two forms allow for system-defined includes or user-defined includes. The include statement is effectively replaced by the contents of the named file. Includes may be nested at least five deep. The most common uses for include files are macro definitions and structure declarations to be shared by several source files comprising a task.
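A minimal sketch of the conventional source file layout follows: include file references and global definitions first, then the task statement, then the procedures. The file name hello.x, the task and procedure names, the constant NGREET, and the local include file myincl.h are assumptions for illustration; the system include is written with angle brackets and the local one with quotation marks.

```spp
# hello.x -- sketch of the conventional layout of an SPP source file.

include <ctype.h>                   # system include (unused in this sketch)
include "myincl.h"                  # local include (hypothetical file)

define  NGREET      3               # global definition (assumed value)

task    hello = t_hello

# T_HELLO -- print a greeting NGREET times.

procedure t_hello ()

int     i

begin
        do i = 1, NGREET
            call printf ("Hello, world\n")
end
```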
The name of the file to be included must be delimited by either angle brackets (<file>) or quotation marks ("file"). The first form is used to reference the IRAF system include files. This includes external packages such as STSDAS if these are installed. The second, more general, form may be used to include any file. The file name may include an absolute or relative directory path. However, the safest and most portable method of accessing include files in SPP source is to have the source and include files in the same directory. You then need only refer to the file itself in the include statement without any absolute or relative directory information.

Example 1.24: Using Include Files.

Help Text

Documentation may be embedded in an SPP source file either by commenting out the lines of text using the # character or by enclosing the lines of text within .help and .endhelp directives. If there are only a few lines of text, it is probably most convenient to comment them out. Large blocks of text should be enclosed by the help directives, making the text easier to edit, and accessible to the on-line documentation and text processing tools.

Figure 1.3: Commenting out Documentation Blocks.

The preprocessor ignores comments, and everything between .help and .endhelp directives. The directives must occur at the beginning of a line to be recognized. In both cases, the preprocessor ignores the remainder of the line. The arguments to .help are used by the help cl utility, but are ignored by SPP. Help text may be typed in as it is to appear on the terminal or printer, or it may contain text processing directives. See the cl lroff documentation for a description of the IRAF text processing directives.

Manual pages (help text) for tasks may be stored either directly in the source file as help text segments, or in separate files. If separate source and help files are used, both files conventionally have the same root name, and the help text file should have the extension .hlp.
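The two ways of embedding documentation might be sketched as follows. The task name greet, the arguments on the .help line (a keyword, a date, and a package label), and the section layout using the .ih heading directive are illustrative assumptions, not a required form.

```spp
# GREET -- print a greeting. A short note like this one is simplest as
# an ordinary comment.

.help greet Jan95 mypkg
.ih
NAME
greet -- print a greeting on the standard output
.ih
USAGE
greet
.ih
DESCRIPTION
Longer documentation goes between the help directives. The preprocessor
skips everything up to the closing directive.
.endhelp

procedure greet ()

begin
        call printf ("Hello, world\n")
end
```

Both directives start at the first character of the line, which is required for the preprocessor to recognize them.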

The task Statement

The task statement is used to make an IRAF task. A file need not contain a task statement, and may not contain more than a single task statement. Files without task statements are separately compiled to produce object modules, which may subsequently be linked together to make a task, or which may be installed in a library. An executable program requires a task statement, although it may be in a file by itself. This is then linked with the other procedures making up the task.

    task ltask1, ltask2, ltask3=proc3

If the task name is identical to the main procedure of the task, then only the task name needs to be in the task statement. The main procedure may have a different name, however. In this case, the procedure name must be specified in the task statement with an assignment.

Example 1.25: The task statement.

Generic Preprocessor

There are many cases in which the same algorithm may need to be implemented for several different data types. The generic preprocessor, used in addition to SPP, converts a generic procedure into a set of procedures specific to particular data types. We mention this briefly here and refer to a more detailed discussion in "Generic Preprocessor" on page 167 and help generic in the IRAF cl, which describe all of the preprocessor directives and the command used to process generic code.

Many useful examples of generic procedures exist in IRAF, particularly in the vops package, a library of generic procedures dealing with vector operations implemented for the SPP data types. See "Vector (Array) Operators - vops" on page 103 for a description of this package. To indicate the flavor of this facility, here is an example of generic code from the vops package:

Example 1.26: Generic Code from vops Package.

The generic preprocessor will replace the $t suffix on the procedure name by the single character initial of the data type (s, i, etc.).
The preprocessor directive PIXEL is replaced by the appropriate data type declaration (short, int, etc.).
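A sketch of generic code in the style of the vops package is given here to make the two substitutions concrete. It is modeled on a simple vector addition and should be read as an illustration rather than as the text of Example 1.26; the argument names are assumptions.

```spp
# AADD -- add two vectors (generic sketch in the style of vops).

procedure aadd$t (a, b, c, npix)

PIXEL   a[ARB], b[ARB]              # i: input vectors
PIXEL   c[ARB]                      # o: output vector
int     npix                        # i: number of elements
int     i

begin
        do i = 1, npix
            c[i] = a[i] + b[i]
end
```

For type int, for example, the $t suffix would become i and PIXEL would become int, giving a procedure aaddi that operates on int arrays.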