The Syntax of A+

Summary

1. Introduction
2. Names and Symbols
   2a. Primitive Functions
   2b. User Names
   2c. System Names
   2d. System Commands
   2e. Comments
3. Infix Notation and Ambivalence
4. Syntactic Classes
   4a. Numeric Constants
   4b. Character Constants
   4c. Symbol Constants
   4d. The Null
   4e. Variables
   4f. Functions
   4g. Operators and Derived Functions
5. Defined Functions
6. Dependencies
7. Bracket Indexing
8. Strands
9. Precedence Rules
10. Right-to-Left Order of Execution
11. Control Statements
   11a. Case Statement
   11b. Do Statement
   11c. If Statement
   11d. If-Else Statement
   11e. While Statement
12. Execution Stack References
13. Well-Formed Expressions

1. Introduction

The purpose of this tutorial is to describe the syntax of A+ through a series of examples, rather than in a formal way. Some commonly understood terms are used without being formally defined. In particular, the phrase A+ expression, or simply expression, is taken to have the same general meaning it does in mathematics, namely, a well-formed sentence that produces a value. A brief discussion of well-formed expressions is presented at the end, after all the rules for the components of expressions have been presented. Not all aspects of A+ syntax are discussed here; see the chapter on syntax in the A+ Reference Manual, and the Assignment tutorial.

Although this tutorial is primarily concerned with syntax, examples require some knowledge of their meaning. Each example will be fully explained, but comprehensive treatments of topics other than syntax are left to the other language tutorials.

The tutorial is made up of textual descriptions and A+ examples. You should set up your Emacs environment to have two visible buffers, one holding the tutorial and the other an A+ session. If you are currently reading this in Emacs, simply press F4.
To bring individual expressions from the tutorial into the A+ session, place the cursor on the expression and press F2; for function definitions place the cursor anywhere in the definition and press F3. It is assumed that the expressions and functions are brought into the A+ session when you first encounter them, unless there are explicit directions to the contrary.

If you need more help on running Emacs and A+, see the Getting Started tutorial. If you want to try your hand at writing your own A+ expressions, see the keyboard layout diagrams in Appendix B of the A+ Language Manual.

2. Names and Symbols

One of the most basic things to know is how things are named. There are no exercises in this section, just information you will need later.

2a. Primitive Functions

A+ uses a mathematical symbol set to denote the functions that are native to the language, which are called primitive functions. This symbol set, which is the APL character set, consists of common mathematical symbols such as + and «, commonly used punctuation symbols, and specialized symbols such as Ù and Õ. In some cases it takes more than one symbol to represent a primitive function, as in +/, but the meaning can be deduced from the individual symbols.

2b. User Names

User names fall into two categories, unqualified and qualified. A valid unqualified name is made up of alphanumeric (alphabetic or numeric) characters and underbars (_). The first character must be alphabetic. For example, a, a1c, and a_1c are valid unqualified names, but 3xy and _xy are not.

A valid qualified user name is either an unqualified user name preceded by a dot (.), or a pair of unqualified user names separated by a dot. In either case there are no intervening blanks. For example, .xw1 and w_2.r2_a are valid qualified user names.

2c. System Names

System names are unqualified names preceded by an underbar, with no intervening spaces.
For example, _argv is a valid system name. The use of system names is reserved by A+.

2d. System Commands

System commands begin with a dollar sign, followed immediately by an unqualified name, which is the name of the command. The name is followed by a space, and then possibly by a sequence of characters whose meaning is specific to the command. For example, $fns is a valid system command.

2e. Comments

Comments can appear on a line by themselves or to the right of any expression. They are indicated by the ã symbol, and everything to the right of this symbol is the comment. For example:

     2«3 ã This is the A+ notation for multiplication.

3. Infix Notation and Ambivalence

A+ is a mathematical notation, and as such uses infix notation for functions with two arguments. That is, the symbol or user name for a function with two arguments appears between them. For example, a+b denotes addition, a-b subtraction, a«b multiplication, and aßb division.

In mathematics, the symbol - can also be used with one argument, as in -b, in which case it denotes negation. This is true in A+ as well. Because the symbol denotes two functions, one with one argument and the other with two, it is called ambivalent. A+ has extended the idea of ambivalence to most of its primitive functions. For example, just as -b denotes the negative of b, ßb denotes the reciprocal of b. User defined functions cannot be ambivalent.

Functions with one argument are called monadic, and functions with two arguments are called dyadic. For a primitive function symbol, one often refers to its monadic use or dyadic use.

Ex 1. Execute each of the following using F2. After each one is executed, you will see the result displayed immediately below.

     5ß2
     ß2

A more interesting example, perhaps, is the primitive function denoted by the down arrow Õ (meta-u on the keyboard). The dyadic form is called Drop because it has the effect of dropping a specified number of elements from a list.
For example, if x is the name of a variable containing the list of five characters a, b, c, d, and e, then 2Õx drops the first two characters from the list, leaving a list of the three characters c, d, and e.

The monadic form of Õ is called Print because its effect is to display its argument in the A+ session log. For example, execute the following:

     5ß(Õ2)

and you will see 2 displayed, followed by the result of the expression. The print primitive, like all primitives, produces a result, and that result is used in further execution. Unlike most primitives, it also has a side effect, which is the display of its argument in the session log.

Ex 2. What do you think the result of Õx is? Describe it in terms of x.

4. Syntactic Classes

4a. Numeric Constants

Individual numbers can be expressed in the usual integer, decimal, and exponential formats, with one exception: negative number constants begin with a "high minus" sign (¢) instead of the more conventional minus sign (-). Negative exponents in the exponential format are denoted by the conventional minus sign.

It is also possible to express a list of numbers as a constant, simply by separating the individual numbers by one or more blank spaces. For example:

     1.23 ¢7 45 3e-5

is a numeric constant with four numbers: 1.23, negative 7, 45, and 0.00003.

Ex 2. Most likely you are familiar with numeric formats, and by the end of this tutorial you should be experimenting with expressions of your own creation, so we will use numeric constants to illustrate how to deal with ill-formed expressions. The high minus sign is not used for exponents. Execute the following to see a parse error message:

     1e¢2

Ex 3. Constants can have more than one element, as illustrated above. As a single number, 1.2.3 is ill-formed, but A+ parses this sequence as if it were a list of numbers. Execute the following and explain what you see:

     1.2.3

Ex 4.
Constants can be put inside parentheses, which does not affect their value, but gives us a way to illustrate syntax errors. Execute the following:

     2.109)

You will see a syntax error message saying that the right parenthesis has no matching left. Now execute

     (2.109

You will now see a *. The display of a * by A+ in circumstances like these indicates suspended execution. The reason that this expression results in suspended execution instead of a syntax error is that it is viewed by the A+ process as incomplete. More characters could have been appended on its right side to form a complete expression, which is not true of the first expression, 2.109).

Select the A+ buffer, and the keyboard cursor should then be positioned to the right of the *. Enter the closing right parenthesis and press the Return key. You will see 2.109 displayed, just as if you had entered the syntactically correct expression (2.109) all on one line. Select the tutorial buffer to continue.

The A+ language processor accepts expressions that occupy more than one line. However, expressions cannot be broken in the middle of names, numeric constants, or primitive functions that require more than one character, and there must be a reason for A+ to expect a continuation, such as open punctuation.

Ex 6. This exercise is a variation of the last one. Execute the expression:

     (2.109

Once again you will see a *. Select the A+ buffer and enter ( instead of ). Press the Return key. You will now see two *'s. There are two points to be made here. First, the number of *'s indicates the level of suspension. It now takes two actions to clear the suspended execution, e.g. two closing parentheses. Second, suppose entering the second ( was a mistake, and you simply want to clean things up and start over. To do this you should enter a right-pointing arrow (meta-] on the keyboard) next to the two *'s, and press the Return key. Do that, and then select the tutorial buffer to continue.

4b.
Character Constants

A character constant is expressed as a list of characters surrounded by a pair of single quote marks or a pair of double quote marks. In order to include the surrounding quote mark in the list of characters, it must be doubled. For example, both 'abc''d' and "abc'd" are constant expressions for the list of characters abc'd.

Ex 5. Execute each of the following to see how ' and " are handled:

     'Aed"ss'
     "Aed'ss"

The following will cause errors:

     'Aed'ss'
     'Aed' 'ss'

Explain the error reports. Clear any suspended executions, and return to the tutorial buffer.

Ex 6. What do you think happens if you break an A+ expression in the middle of a character constant? Execute the expression:

     'abcd

and you will see the suspension indicator. To the right of it enter:

* 2345'

The result will now be displayed. Explain what you see. For that purpose, note that the symbol # applied monadically to a list of characters yields the number of characters in the list. For example:

     #'sdTvw'
 5

Repeat the above example using # as follows:

     #'abcd
* 2345'
 9

Explain the result.

4c. Symbol Constants

A symbol is a backquote (`) followed immediately by a character string made up of alphanumeric characters, underscores (_), and dots (.). A symbol constant can be thought of as a character-based counterpart to numeric constants. Just as 1 2.34 12e3 is a list of three numbers, `a.s `12 `w_3 is a list of three symbols.

4d. The Null

The Null is a special constant formed as follows: (). It is neither numeric nor character, but has a special type reserved for it alone.

4e. Variables

Variables are named data objects. They receive their values through assignment, or specification, which is denoted by the left-pointing arrow (û). For example, the expression abcû1 2 3 assigns the three-element list consisting of 1, 2, and 3 to the variable named abc. Any valid user name can serve as a variable name. For more on assignment, see the Assignment tutorial.

4f.
Functions

Functions take zero or more arguments and return results. A sequence of characters that constitutes a valid reference to a function will be called a function call expression. That is, a function call expression includes a function symbol or name together with all its arguments and all necessary punctuation.

In general, the arguments of a function are data objects, which may appear in function call expressions as variable names, constants, or expressions that require evaluation. In addition, for the various forms of function call expressions using braces, arguments can also be functions.

A function with no parameters - which must be a user defined function - is said to be niladic. The valid function call expression for a niladic function f is f{}.

Functions with one argument can be either primitive or user defined. The valid function call expressions for a function f with one argument a are f a and f{a}. In the form f a, the space is required if, when it is omitted, the result would be a valid name, as in plus 2.3.

Functions with two arguments can also be either primitive or user defined. The valid function call expressions for a function g with two arguments a and b are a g b and g{a;b}. Here a is called the left argument and b is called the right argument. The rule for required spaces in the dyadic form a g b is the same as for the monadic form f a.

Functions with more than two arguments must be user defined. The valid function call expression for a function f of more than two arguments a, b, ..., c is f{a;b;...;c}.

For function call expressions that use braces and contain at least two arguments, any of the positions between neighboring semicolons, or between the left brace and the first semicolon, or between the last semicolon and the right brace, can be left blank. For example, each of the following is a valid function call expression: f{a;}, f{;b}, f{;a;b}, f{;;b}, etc.
However, if f is monadic then f{} is not valid, because f{} is a niladic function call expression. When an argument position is legitimately left blank, A+ assumes that the argument is the Null.

The number of arguments of a function is called its valence. The valence of a user defined function is fixed by the form of its definition.

Ex 7. Use F2 to define the following dyadic function:

     a f b:a-b

and then evaluate the following function call expressions:

     2 f 5
     f{2;5}

(Function definitions are discussed in Defined Functions.) Explain the meaning of -{2;5} and then execute it for verification.

Ex 8. Define the following function:

     g{a;b;c}:(a;b;c)

As will be explained later, the result of this function is a data aggregate with three elements, which are the arguments to the function. For example, execute:

     g{1;2;3}

and you will see displayed three lines, with < 1, < 2, and < 3. The symbol < indicates that the data being displayed is part of an aggregate. Now execute:

     g{;2;3}
     g{1;;3}

and you will see that wherever an argument is omitted, the corresponding output line is < followed by blanks. This indicates that the omitted arguments are taken to be the Null (however, the same display line could represent other things as well, such as a blank list of characters).

4g. Operators and Derived Functions

There are two formal, primitive operators in A+, known as Rank and Each. By a formal operator we mean an operator in the mathematical sense, i.e. a function that takes a function as an operand, or produces a function as a result, or both. The resulting function is called a derived function.

The Each operator is denoted by the dieresis, ¡. For a given function f, the function derived from the Each operator is denoted by f¡. The function f can be either monadic or dyadic, in which case so is f¡.

The Rank operator is denoted by the at symbol, @. Unlike the Each operator, the Rank operator has both a function argument and a data argument.
For a given function f and data value a, the function derived from the Rank operator is denoted by f@a. f can be either monadic or dyadic, in which case so is f@a.

Ex 9. The Rank and Each operators modify their function argument to produce some variant of that function. For example, use F2 to execute:

     (2;3)+4

You should see the error message +: type, which in this case means that + does not apply to data aggregates. Following the message is a line with a *, indicating suspended execution. Clear the suspension and return to the tutorial. Now use F2 to execute:

     (2;3)+¡4

Explain the result you see. What do you think the following expressions produce? Evaluate them to confirm your guesses.

     (2;3)+¡4 5
     (2;3)+¡(4;5)

Reduction, Scan, Outer Product, and Inner Product are not operators, strictly speaking: they do not accept all functions as operands. The ones they do accept are shown in Table 2-2. Because these character sequences look so much like derived functions, however, we will use the term operator to include these four as well as the primitive Each and Rank operators and user defined operators.

Ex 10. Many of these function symbols should be familiar, but some may not be. For example, Ä (meta-d on the keyboard) and Ó (meta-s) denote the Minimum and Maximum functions, respectively, when used dyadically. Execute the following expressions:

     3Ä5
     3Ó5
     Ä/1 2 3 4 5
     Ó/1 2 3 4 5
     +/1 2 3 4 5

Explain how the functions Ó/, Ä/, and +/ are variants of the functions Ó, Ä, and +. Feel free to experiment with other arguments. Remember, if you make an error and execution is suspended, enter the right arrow (meta-]) to get out of it.

5. Defined Functions

A function definition consists of a function header, followed by a colon, followed by either an expression or an expression block, which is a series of expressions separated by semicolons and enclosed in braces and represents a sequence of statements to be executed.
Function headers take the same forms as function call expressions (see Functions above), except that no argument may be omitted. A function header has the monadic form, dyadic form, or general form.

The monadic form is the function name followed by the argument name, with the two names separated by at least one space. For example, if the function name is correlate then

     correlate a:{...}

is a function definition with the monadic form of the header.

The dyadic form of function header is the function name with one argument name on each side, with the names separated by at least one blank. For example:

     a correlate b:{...}

is a function definition with the dyadic form of the header.

The third form of function header is the general form, which is the function name followed by a left brace, followed by a list of argument names separated by semicolons, and terminated with a right brace. For example:

     correlate{a;b;c}:{...}

is a function definition with the general form of the header. In this example the function has three arguments.

A function with one argument can be defined with either the monadic form of function header or the general form, and analogously, functions with two arguments can be defined with either the dyadic form or the general form. Regardless of which way they are defined, they can be called either way.

Ex 8 provides an example of a defined function. The result of that function is the value of the expression (a;b;c).

6. Dependencies

A dependency definition consists of a name (the name of the dependency), followed by a colon, followed by either an A+ expression or an expression block.

7. Bracket Indexing

A+ data objects are arrays, and bracket indexing is a way to select subarrays. Bracket indexing uses special syntax, whose form is

     x[a;b;...;c]

where x represents a variable name and a, b, ..., c denote expressions. The space between the left bracket and the first semicolon, between successive semicolons, and between the last semicolon and the right bracket, can be empty.

Ex 16.
This exercise takes us into the subject matter of the other language tutorials, but it is interesting to see what it means to leave the spaces in the bracket index expression empty. Execute the following:

     É3 4

and you will see a matrix with three rows and four columns, populated by the numbers 0 through 11. Execute each of the following and explain what you see:

     (É3 4)[0;0]
     (É3 4)[2;3]
     (É3 4)[1;1 3]
     (É3 4)[1;3 1]
     (É3 4)[1;]
     (É3 4)[;2]
     (É3 4)[;]

8. Strands

Aggregate data objects can be formed by separating the individual data objects with semicolons and surrounding the collection of data objects and semicolons with a pair of parentheses. For example:

     (a;b;...;c)

where a,b,...,c denote expressions. Any of these expressions can be function expressions. See Ex 8.

9. Precedence Rules

The precedence rules in A+ are simple:

  - all functions have equal precedence, whether primitive, defined, or derived;
  - all operators have equal precedence;
  - operators have higher precedence than functions;
  - the formation of numeric constants has higher precedence than operators.

Ex 11. Execute the following:

     1 2+3 4

The result indicates that the constant with the two numbers 1 and 2, and the constant with the two numbers 3 and 4, are formed before + is applied. Do you see how this is related to the above rules?

10. Right-to-Left Order of Execution

The way to read A+ expressions is from left to right, like English. For the most part we also read mathematical notation from left to right, although not strictly because the notation is two-dimensional. To illustrate reading A+ expressions from left to right, consider the following examples.

     a+b+c     Read as: "a plus the result of b plus c."
     x-ßy      Read as: "x minus the reciprocal of y."

As you can see, reading from left to right in the suggested style implies that execution takes place right to left. In the first example, to say "a plus the result of b plus c" means that b+c must be formed first, and then added to a.
And in the second example, to say "x minus the reciprocal of y" means that ßy must be formed before it is subtracted from x.

Of course, reading from left to right is not necessarily associated with execution from right to left. For example, the expression aßb+c is read left to right in conventional mathematical notation as well as A+, but the order of evaluation is different in the two; in mathematics a divided by b is formed and added to c, while in A+, a is divided by b+c. The order of execution is controlled by the relative precedence of the functions, or operations. In mathematics, divide has higher precedence than plus, which means that in aßb+c, divide is evaluated before plus.

Another way to say that A+ expressions execute from right to left is that A+ has long right scope and short left scope. For example, consider:

     a+b-cße«f

The arguments of the minus function are b on the left (short scope) and cße«f on the right (long scope). The left argument is found by starting at the - symbol and moving to the left until the smallest possible complete subexpression is found. In this example it is simply the name b. If the first non-blank character to the left of the symbol had been a right parenthesis, then the left argument would have included everything to the left of the right parenthesis, up to the matching left parenthesis. For example, the left argument of minus in a+(xßb)-cße«f is xßb.

The right argument is found by starting at the - symbol and moving to the right, all the way to the end of the expression, or until a semicolon is encountered, or until a right parenthesis, brace, or bracket is encountered whose matching left partner is to the left of the symbol. In the above example the right argument of minus is everything to the right. In the case of a+b-(cße)«f, the right argument is also everything to the right. However, for a+(b-cße)«f, the right argument is cße.

11. Control Statements

11a.
Case Statement

The form of a case statement is the word case, followed by an expression in parentheses, followed by one of two special expression sequences. The placement of semicolons must be as illustrated below. The point of the specification in the examples is that A+ control statements are actually compound expressions with results.

     xûcase (a) {0;"The case is 0";
                 1;"The case is 1";
                 "The default case"
                }

     xûcase (a) {0;"The case is 0";
                 1;"The case is 1";
                }

These expression blocks are of the form

     {case-expression0; value-expression0;
      case-expression1; value-expression1;
      . . .
     }

In both of the above instances, the case statement is evaluated by first evaluating the expression in parentheses. The value of that expression is compared to the value of case-expression0. If they match, value-expression0 is evaluated and its value is the result of the case statement. If they do not match, the value of the expression in parentheses is compared to the value of case-expression1. If they match, value-expression1 is evaluated and its value is the result of the case statement. This pattern continues until the case-expression, value-expression pairs are exhausted. At that point the case statement either has one remaining expression (the first example above) or none. If there is one, it is evaluated and its value is the result of the case statement. If there is none, the result of the case statement is the Null.

11b. Do Statement

The monadic form of the do statement is the word do, followed by an expression or expression block.

The dyadic form is like the monadic form, except that a valid left argument expression appears to the left of the word do. There are two special forms recognized for the left argument. For example, evaluate each of the following:

     nû10
     xûn do Õn
     n

The specification of n is simply to get the example going. The point is that when the do statement is evaluated, n already has a value. The do statement prints the value of n each time it is evaluated.
You might have expected to see a series of 10's, but you saw 0 through 9. The rule is that when the left argument is simply a variable name with an integer value, say k, that variable is successively given the values 0, 1, ..., k-1 for the successive evaluations of the expression on the right. Finally, evaluating the last statement in the above sequence shows that n once again has its value (10) from before evaluation of the do statement.

Basically the same behavior occurs when the left side of the do statement is a simple specification. For example:

     xû(nû10) do Õn
     n

No other form of the left argument has this effect. For example:

     nû20
     xû(n-15) do Õn

11c. If Statement

The form of an if statement is the word if, followed by an expression in parentheses, followed by another expression or an expression block.

11d. If-Else Statement

The form of an if-else statement is the word if, followed by an expression in parentheses, followed by another expression or expression block, followed by the word else, followed by another expression or an expression block.

11e. While Statement

The form of a while statement is the word while, followed by an expression in parentheses, followed by another expression or an expression block.

12. Execution Stack References

Execution stack references are &, &0, &1, etc. The symbol & can be used in a function definition to refer to that function. For example, a factorial function can be recursively defined in either of the two following ways:

     fact{n}: if (n>0) n«fact{n-1} else 1
     fact{n}: if (n>0) n«&{n-1} else 1

When execution is suspended the objects on the execution stack can be referenced by &0 (top of stack), &1, etc. See the Dealing with Errors tutorial.

13. Well-Formed Expressions

Basically, a well-formed expression is one that takes one of the forms described above, and in which all of the constituents are well-formed.
The potential for complicated expressions is due to the fact that every one of these basic forms produces a result and can therefore be used as a constituent in other forms. In this regard A+ is very much like mathematical notation.

The concept of the principal subexpression of an expression is useful for analysis. As execution of an expression proceeds in the manner described in Right-to-Left Order of Execution, one can imagine that parts of the expression are executed and replaced with their results, and then some remaining parts are executed using these results, and are replaced with their results, and so on. Ultimately the execution comes to the last expression to be executed, which is called the principal subexpression. Once executed, its value is the value of the expression. If the principal subexpression is a function call expression or operator call expression, the function or derived function is called the principal function.

For example, the principal subexpression of (a+bßc-d)*10«n is x*y, where x is the result of a+bßc-d and y is the result of 10«n. The power function * is the principal function. As a second example, the principal subexpression of (x+y;x-y) is (w;z), where w is the result of x+y and z is the result of x-y. In this case we do not refer to a principal function.

Knowing the principal subexpression often reveals the thrust of a complicated expression. Mathematical notation gives visual clues that usually point the reader directly to the principal subexpression. There are clues in A+ as well, but they are based largely on experience.

Ex 12. In each row of Table 3-1, an expression is given together with its principal function or expression. Make sure you understand each case.
Table 3-1: Well-Formed Expressions
---------------------------------------------------------------
|Expression     |Principal Function or Principal Expression   |
|===============|=============================================|
|a+b-c«d        |+                                            |
|(a+b)-c«d      |-                                            |
|fß«w           |ß                                            |
|(x-y)[a*2]     |w[z]                                         |
|(Ø+/¡w-a)/z    |/                                            |
|Ø+/¡w-a        |Ø                                            |
|+/¡w-a         |+/¡                                          |
|(a+.«b)ÕaÊ.+b  |Õ                                            |
|aÊ.+b          |Ê.+                                          |
|f{a;g«a;x-y*2} |f{a;t;s}                                     |
---------------------------------------------------------------
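
The scope rules from Right-to-Left Order of Execution, and the idea of a principal function, can also be modeled outside A+. What follows is a minimal Python sketch, not A+ code and not how the A+ interpreter is implemented. It assumes single-character names, parentheses, and four ASCII stand-in function symbols (+ - * %, with % standing in for divide), and it ignores operators, brackets, braces, strands, and multi-element constants. The names FUNCTIONS, tokenize, parse, and principal_function are hypothetical and chosen only for illustration.

```python
# Hypothetical ASCII stand-ins for A+ primitive function symbols.
# All of them are given equal precedence, as the precedence rules require.
FUNCTIONS = set("+-*%")


def tokenize(text):
    """Split the text into single-character tokens, dropping blanks."""
    return [ch for ch in text if not ch.isspace()]


def parse(tokens, pos=0):
    """Parse tokens[pos:] and return (tree, next_pos).

    A tree is a name, (fn, arg) for a monadic call,
    or (fn, left, right) for a dyadic call.
    """
    if pos >= len(tokens):
        raise ValueError("incomplete expression")
    tok = tokens[pos]
    if tok in FUNCTIONS:
        # A function symbol where an argument is expected is monadic;
        # its argument is everything to its right (long right scope).
        arg, pos = parse(tokens, pos + 1)
        return (tok, arg), pos
    if tok == "(":
        left, pos = parse(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("left parenthesis has no matching right")
        pos += 1
    elif tok.isalnum():
        # Short left scope: the left argument is the smallest complete
        # subexpression, here a single name or a parenthesized group.
        left, pos = tok, pos + 1
    else:
        raise ValueError("unexpected token " + repr(tok))
    if pos < len(tokens) and tokens[pos] in FUNCTIONS:
        fn = tokens[pos]
        right, pos = parse(tokens, pos + 1)
        return (fn, left, right), pos
    return left, pos


def principal_function(text):
    """Return the symbol of the last function to execute, or None."""
    tokens = tokenize(text)
    tree, pos = parse(tokens)
    if pos != len(tokens):
        raise ValueError("unexpected token " + repr(tokens[pos]))
    return tree[0] if isinstance(tree, tuple) else None


if __name__ == "__main__":
    for expr in ["a+b-c*d", "(a+b)-c*d", "f%*w", "a+(b-c%e)*f"]:
        print(expr, principal_function(expr), parse(tokenize(expr))[0])
```

Under these assumptions the first three expressions correspond to the first three rows of Table 3-1, with * and % written in place of the A+ multiplication and division symbols. The root of each tree is the principal function, and the nesting of the tree shows which subexpressions are formed first.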