annotate categorical/imperative.org @ 2:b4de894a1e2e

initial import
author Robert McIntyre <rlm@mit.edu>
date Fri, 28 Oct 2011 00:03:05 -0700
rlm@2 1 #+TITLE: LAMBDA: The Ultimate Imperative
rlm@2 2 #+AUTHOR: Guy Lewis Steele Jr. and Gerald Jay Sussman
rlm@2 3
rlm@2 4 * Abstract
rlm@2 5
rlm@2 6 We demonstrate how to model the following common programming constructs in
rlm@2 7 terms of an applicative order language similar to LISP:
rlm@2 8 - Simple Recursion
rlm@2 9 - Iteration
rlm@2 10 - Compound Statements and Expressions
rlm@2 11 - GO TO and Assignment
- Continuation-Passing
rlm@2 13 - Escape Expressions
rlm@2 14 - Fluid Variables
rlm@2 15 - Call by Name, Call by Need, and Call by Reference
rlm@2 16 The models require only (possibly self-referent) lambda application,
rlm@2 17 conditionals, and (rarely) assignment. No complex data structures such as
rlm@2 18 stacks are used. The models are transparent, involving only local syntactic
rlm@2 19 transformations.
rlm@2 20 Some of these models, such as those for GO TO and assignment, are already well
rlm@2 21 known, and appear in the work of Landin, Reynolds, and others. The models for
rlm@2 22 escape expressions, fluid variables, and call by need with side effects are
rlm@2 23 new. This paper is partly tutorial in intent, gathering all the models
rlm@2 24 together for purposes of context.
rlm@2 25 This report describes research done at the Artificial Intelligence Laboratory
of the Massachusetts Institute of Technology. Support for the laboratory's
rlm@2 27 artificial intelligence research is provided in part by the Advanced Research
rlm@2 28 Projects Agency of the Department of Defense under Office of Naval Research
contract N00014-75-C-0643.
rlm@2 30
rlm@2 31 * Introduction
rlm@2 32
rlm@2 33 We catalogue a number of common programming constructs. For each
rlm@2 34 construct we examine "typical" usage in well-known programming languages, and
rlm@2 35 then capture the essence of the semantics of the construct in terms of a
rlm@2 36 common meta-language.
rlm@2 37 The lambda calculus {Note Alonzowins} is often used as such a meta-
rlm@2 38 language. Lambda calculus offers clean semantics, but it is clumsy because it
rlm@2 39 was designed to be a minimal language rather than a convenient one. All
rlm@2 40 lambda calculus "functions" must take exactly one "argument"; the only "data
rlm@2 41 type" is lambda expressions; and the only "primitive operation‘ is variable‘
rlm@2 42 substitution. While its utter simplicity makes lambda calculus ideal for
rlm@2 43 logicians, it is too primitive for use by programmers. The meta-language we
use is a programming language called SCHEME {Note Schemepaper} which is based
rlm@2 45 on lambda calculus.
rlm@2 46 SCHEME is a dialect of LISP. [McCarthy 62] It is an expression-
rlm@2 47 oriented, applicative order, interpreter-based language which allows one to
rlm@2 48 manipulate programs as data. It differs from most current dialects of LISP in
rlm@2 49 that it closes all lambda expressions in the environment of their definition
rlm@2 50 or declaration, rather than in the execution environment. {Note Closures}
rlm@2 51 This preserves the substitution semantics of lambda calculus, and has the
rlm@2 52 consequence that all variables are lexically scoped, as in ALGOL. [Naur 63]
rlm@2 53 Another difference is that SCHEME is implemented in such a way that tail-
rlm@2 54 recursions execute without net growth of the interpreter stack. {Note
rlm@2 55 Schemenote} We have chosen to use LISP syntax rather than, say, ALGOL syntax
rlm@2 56 because we want to treat programs as data for the purpose of describing
rlm@2 57 transformations on the code. LISP supplies names for the parts of an
rlm@2 58 executable expression and standard operators for constructing expressions and
rlm@2 59 extracting their components. The use of LISP syntax makes the structure of
rlm@2 60 such expressions manifest. We use ALGOL as an expository language, because it
rlm@2 61 is familiar to many people, but ALGOL is not sufficiently powerful to express
rlm@2 62 the necessary concepts; in particular, it does not allow functions to return
functions as values. We are thus forced to use a dialect of LISP in many
rlm@2 64 cases.
rlm@2 65 We will consider various complex programming language constructs and
rlm@2 66 show how to model them in terms of only a few simple ones. As far as possible
rlm@2 67 we will use only three control constructs from SCHEME: LAMBDA expressions, as
in LISP, which are just functions with lexically scoped free variables;
LABELS, which allows declaration of mutually recursive procedures {Note
Labelsdef}; and IF, a primitive conditional expression. For more complex
modelling we will introduce an assignment primitive (ASET). We will freely
assume the existence of other common primitives, such as arithmetic functions.
rlm@2 73 The constructs we will examine are divided into four broad classes.
The first is Simple Loops; this contains simple recursions and iterations, and
an introduction to the notion of continuations. The second is Imperative
Constructs; this includes compound statements, GO TO, and simple variable
assignments. The third is Continuations, which encompasses the distinction
between statements and expressions, escape operators (such as Landin's
J-operator [Landin 65] and Reynolds' escape expression [Reynolds 72]), and fluid
(dynamically bound) variables. The fourth is Parameter Passing Mechanisms, such
as ALGOL call-by-name and FORTRAN call-by-location.
rlm@2 81 Some of the models presented here are already well-known, particularly
those for GO TO and assignment. [McCarthy 60] [Landin 65] [Reynolds 72]
rlm@2 83 Those for escape operators, fluid variables, and call-by-need with side
rlm@2 84 effects are new.
rlm@2 85 ** Simple Loops
rlm@2 86 By /simple loops/ we mean constructs which enable programs to execute the
rlm@2 87 same piece of code repeatedly in a controlled manner. Variables may be made
rlm@2 88 to take on different values during each repetition, and the number of
rlm@2 89 repetitions may depend on data given to the program.
rlm@2 90 *** Simple Recursion
rlm@2 91 One of the easiest ways to produce a looping control structure is to
rlm@2 92 use a recursive function, one which calls itself to perform a subcomputation.
rlm@2 93 For example, the familiar factorial function may be written recursively in
ALGOL:
rlm@2 95 #+begin_src algol
integer procedure fact(n); value n; integer n;
rlm@2 97 fact := if n=0 then 1 else n*fact(n-1);
rlm@2 98 #+end_src
rlm@2 99
rlm@2 100 The invocation =fact(n)= computes the product of the integers from 1 to n using
the identity $n! = n(n-1)!$ ($n > 0$). If $n$ is zero, 1 is returned; otherwise =fact=
rlm@2 102 calls itself recursively to compute $(n-1)!$ , then multiplies the result by $n$
rlm@2 103 and returns it.
rlm@2 104
rlm@2 105 This same function may be written in Clojure as follows:
rlm@2 106 #+begin_src clojure
rlm@2 107 (ns categorical.imperative)
(defn fact [n]
  (if (= n 0) 1 (* n (fact (dec n)))))
rlm@2 111 #+end_src
rlm@2 112
rlm@2 113 #+results:
rlm@2 114 : #'categorical.imperative/fact
rlm@2 115
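
For instance, once the definition above has been evaluated, it can be applied
directly (a quick check of ours, not part of the original paper):

#+begin_src clojure
(fact 5) ;=> 120
#+end_src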
Clojure does not require an assignment to the ``variable'' =fact= to return a value
as ALGOL does. The IF primitive is the ALGOL if-then-else rendered in LISP
syntax. Note that the arithmetic primitives are prefix operators in Clojure, as in SCHEME.
rlm@2 119
rlm@2 120 *** Iteration
rlm@2 121 There are many other ways to compute factorial. One important way is
rlm@2 122 through the use of /iteration/.
A common iterative construct is the DO loop. The most general form we
rlm@2 124 have seen in any programming language is the MacLISP DO [Moon 74]. It permits
rlm@2 125 the simultaneous initialization of any number of control variables and the
rlm@2 126 simultaneous stepping of these variables by arbitrary functions at each
rlm@2 127 iteration step. The loop is terminated by an arbitrary predicate, and an
rlm@2 128 arbitrary value may be returned. The DO loop may have a body, a series of
rlm@2 129 expressions executed for effect on each iteration. A version of the MacLISP
rlm@2 130 DO construct has been adopted in SCHEME.
rlm@2 131
rlm@2 132 The general form of a SCHEME DO is:
rlm@2 133 #+begin_src nothing
(DO ((<var1> <init1> <step1>)
     (<var2> <init2> <step2>)
     ...
     (<varn> <initn> <stepn>))
    (<pred> <value>)
    <optional body>)
rlm@2 139 #+end_src
The semantics of this are that the variables are bound and initialized to the
values of the <initi> expressions, which must all be evaluated in the
environment outside the DO; then the predicate <pred> is evaluated in the new
environment, and if TRUE, the <value> is evaluated and returned. Otherwise
the <optional body> is evaluated, then each of the steppers <stepi> is
evaluated in the current environment, all the variables made to have the
results as their values, the predicate evaluated again, and so on.
Using DO loops in both ALGOL and SCHEME, we may express FACT by means
of iteration.
rlm@2 149 #+begin_src algol
integer procedure fact(n); value n; integer n;
rlm@2 151 begin
rlm@2 152 integer m, ans;
rlm@2 153 ans := 1;
for m := n step -1 until 1 do ans := m*ans;
rlm@2 155 fact := ans;
rlm@2 156 end;
rlm@2 157 #+end_src
rlm@2 158
rlm@2 159 #+begin_src clojure
(in-ns 'categorical.imperative)
;; iterative factorial via loop/recur, Clojure's analogue of the SCHEME DO (a sketch)
(defn fact-do [n]
  (loop [m n, ans 1]
    (if (zero? m)
      ans
      (recur (dec m) (* m ans)))))
rlm@2 163 #+end_src
rlm@2 164
Note that the SCHEME DO loop in FACT has no body -- the stepping functions do
all the work. The ALGOL DO loop has an assignment in its body; because an
ALGOL DO loop can step only one variable, we need the assignment to step
the variable "manually".
rlm@2 169 In reality the SCHEME DO construct is not a primitive; it is a macro
which expands into a function which performs the iteration by tail-recursion.
rlm@2 171 Consider the following definition of FACT in SCHEME. Although it appears to
rlm@2 172 be recursive, since it "calls itself", it is entirely equivalent to the DO
rlm@2 173 loop given above, for it is the code that the DO macro expands into! It
rlm@2 174 captures the essence of our intuitive notion of iteration, because execution
rlm@2 175 of this program will not produce internal structures (e.g. stacks or variable
rlm@2 176 bindings) which increase in size with the number of iteration steps.
rlm@2 177
rlm@2 178 #+begin_src clojure
rlm@2 179 (in-ns 'categorical.imperative)
(defn fact-do-expand [n]
  (let [fact1 (fn [m ans]
                (if (zero? m)
                  ans
                  (recur (dec m) (* m ans))))]
    (fact1 n 1)))
rlm@2 186 #+end_src
rlm@2 187
rlm@2 188 From this we can infer a general way to express iterations in SCHEME in
a manner isomorphic to the MacLISP DO. The expansion of the general DO loop
rlm@2 190 #+begin_src nothing
(DO ((<var1> <init1> <step1>)
     (<var2> <init2> <step2>)
     ...
     (<varn> <initn> <stepn>))
    (<pred> <value>)
    <body>)
rlm@2 197 #+end_src
rlm@2 198 is this:
rlm@2 199 #+begin_src nothing
(let [doloop
      (fn [dummy <var1> <var2> ... <varn>]
        (if <pred> <value>
            (recur <body> <step1> <step2> ... <stepn>)))]
  (doloop nil <init1> <init2> ... <initn>))
rlm@2 206 #+end_src
rlm@2 207 The identifiers =doloop= and =dummy= are chosen so as not to conflict with any
rlm@2 208 other identifiers in the program.
Note that, unlike most implementations of DO, there are no side effects
in the steppings of the iteration variables. DO loops are usually modelled
rlm@2 211 using assignment statements. For example:
rlm@2 212 #+begin_src nothing
for x := a step b until c do <statement>;
rlm@2 214 #+end_src
rlm@2 215
rlm@2 216 can be modelled as follows: [Naur 63]
rlm@2 217 #+begin_src nothing
rlm@2 218 begin
rlm@2 219 x := a;
L: if (x-c)*sign(b) > 0 then go to Endloop;
rlm@2 221 <statement>;
rlm@2 222 x := x+b;
go to L;
rlm@2 224 Endloop:
rlm@2 225 end;
rlm@2 226 #+end_src
rlm@2 227 Later we will see how such assignment statements can in general be
rlm@2 228 modelled without using side effects.
rlm@2 229
rlm@2 230 * Imperative Programming
rlm@2 231 Lambda calculus (and related languages, such as ``pure LISP'') is often
rlm@2 232 used for modelling the applicative constructs of programming languages.
rlm@2 233 However, they are generally thought of as inappropriate for modelling
rlm@2 234 imperative constructs. In this section we show how imperative constructs may
rlm@2 235 be modelled by applicative SCHEME constructs.
rlm@2 236 ** Compound Statements
rlm@2 237 The simplest kind of imperative construct is the statement sequencer,
rlm@2 238 for example the compound statement in ALGOL:
rlm@2 239 #+begin_src algol
rlm@2 240 begin
rlm@2 241 S1;
rlm@2 242 S2;
rlm@2 243 end
rlm@2 244 #+end_src
rlm@2 245
rlm@2 246 This construct has two interesting properties:
rlm@2 247 - (1) It performs statement S1 before S2, and so may be used for
rlm@2 248 sequencing.
rlm@2 249 - (2) S1 is useful only for its side effects. (In ALGOL, S2 must also
rlm@2 250 be a statement, and so is also useful only for side effects, but
rlm@2 251 other languages have compound expressions containing a statement
rlm@2 252 followed by an expression.)
rlm@2 253
rlm@2 254 The ALGOL compound statement may actually contain any number of statements,
rlm@2 255 but such statements can be expressed as a series of nested two-statement
rlm@2 256 compounds. That is:
rlm@2 257 #+begin_src algol
rlm@2 258 begin
rlm@2 259 S1;
rlm@2 260 S2;
rlm@2 261 ...
rlm@2 262 Sn-1;
rlm@2 263 Sn;
rlm@2 264 end
rlm@2 265 #+end_src
rlm@2 266 is equivalent to:
rlm@2 267
rlm@2 268 #+begin_src algol
rlm@2 269 begin
rlm@2 270 S1;
rlm@2 271 begin
rlm@2 272 S2;
rlm@2 273 begin
rlm@2 274 ...
rlm@2 275
rlm@2 276 begin
rlm@2 277 Sn-1;
rlm@2 278 Sn;
rlm@2 279 end;
rlm@2 280 end;
end;
rlm@2 282 end
rlm@2 283 #+end_src
rlm@2 284
rlm@2 285 It is not immediately apparent that this sequencing can be expressed in a
rlm@2 286 purely applicative language. We can, however, take advantage of the implicit
rlm@2 287 sequencing of applicative order evaluation. Thus, for example, we may write a
rlm@2 288 two-statement sequence as follows:
rlm@2 289
rlm@2 290 #+begin_src clojure
rlm@2 291 ((fn[dummy] S2) S1)
rlm@2 292 #+end_src
rlm@2 293
rlm@2 294 where =dummy= is an identifier not used in S2. From this it is
rlm@2 295 manifest that the value of S1 is ignored, and so is useful only for
rlm@2 296 side effects. (Note that we did not claim that S1 is expressed in a
rlm@2 297 purely applicative language, but only that the sequencing can be so
rlm@2 298 expressed.) From now on we will use the form =(BLOCK S1 S2)= as an
rlm@2 299 abbreviation for this expression, and =(BLOCK S1 S2...Sn-1 Sn)= as an
rlm@2 300 abbreviation for
rlm@2 301
#+begin_src nothing
rlm@2 303 (BLOCK S1 (BLOCK S2 (BLOCK ... (BLOCK Sn-1 Sn)...)))
rlm@2 304 #+end_src
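
Although the text treats =BLOCK= purely as an abbreviation, the same local
transformation can be written down mechanically. The following is a sketch of
such a macro in Clojure (the macro itself and the =dummy#= gensym are ours,
not part of the paper):

#+begin_src clojure
(in-ns 'categorical.imperative)
;; Expands (BLOCK S1 S2 ... Sn) into nested one-argument fn applications,
;; ignoring every value but the last. dummy# is an auto-gensym, so it
;; cannot conflict with identifiers used in the enclosed expressions.
(defmacro BLOCK
  ([s] s)
  ([s & more] `((fn [dummy#] (BLOCK ~@more)) ~s)))

;; (BLOCK (println 3) (println 4)) prints 3, then 4, and returns nil.
#+end_src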
rlm@2 305
** The GO TO Statement
rlm@2 307
rlm@2 308 A more general imperative structure is the compound statement with
labels and GO TOs. Consider the following code fragment due to
rlm@2 310 Jacopini, taken from Knuth: [Knuth 74]
rlm@2 311 #+begin_src algol
rlm@2 312 begin
rlm@2 313 L1: if B1 then go to L2; S1;
rlm@2 314 if B2 then go to L2; S2;
rlm@2 315 go to L1;
L2: S3;
end
rlm@2 317 #+end_src
rlm@2 318
rlm@2 319 It is perhaps surprising that this piece of code can be /syntactically/
rlm@2 320 transformed into a purely applicative style. For example, in SCHEME we
rlm@2 321 could write:
rlm@2 322
rlm@2 323 #+begin_src clojure
rlm@2 324 (in-ns 'categorical.imperative)
;; letfn is used so that L1 and L2 can refer to one another (B1, B2, S1, S2,
;; and S3 stand for the boolean expressions and statements of the fragment)
(letfn [(L1 [] (if B1 (L2)
                   (do S1
                       (if B2 (L2) (do S2 (L1))))))
        (L2 [] S3)]
  (L1))
rlm@2 331 #+end_src
rlm@2 332
rlm@2 333 As with the D0 loop, this transformation depends critically on SCHEME's
rlm@2 334 treatment of tail-recursion and on lexical scoping of variables. The labels
are names of functions of no arguments. In order to ``go to'' the labeled code,
rlm@2 336 we merely call the function named by that label.
rlm@2 337
** Simple Assignment
rlm@2 339 Of course, all this sequencing of statements is useless unless the
rlm@2 340 statements have side effects. An important side effect is assignment. For
rlm@2 341 example, one often uses assignment to place intermediate results in a named
rlm@2 342 location (i.e. a variable) so that they may be used more than once later
rlm@2 343 without recomputing them:
rlm@2 344
rlm@2 345 #+begin_src algol
rlm@2 346 begin
rlm@2 347 real a2, sqrtdisc;
rlm@2 348 a2 := 2*a;
rlm@2 349 sqrtdisc := sqrt(b^2 - 4*a*c);
rlm@2 350 root1 := (- b + sqrtdisc) / a2;
rlm@2 351 root2 := (- b - sqrtdisc) / a2;
rlm@2 352 print(root1);
rlm@2 353 print(root2);
rlm@2 354 print(root1 + root2);
rlm@2 355 end
rlm@2 356 #+end_src
rlm@2 357
rlm@2 358 It is well known that such naming of intermediate results may be accomplished
rlm@2 359 by calling a function and binding the formal parameter variables to the
rlm@2 360 results:
rlm@2 361
rlm@2 362 #+begin_src clojure
;; assumes a, b, and c are bound to the coefficients
((fn [a2 sqrtdisc]
   ((fn [root1 root2]
      (do (println root1)
          (println root2)
          (println (+ root1 root2))))
    (/ (+ (- b) sqrtdisc) a2)
    (/ (- (- b) sqrtdisc) a2)))
 (* 2 a)
 (Math/sqrt (- (* b b) (* 4 a c))))
rlm@2 373 #+end_src
rlm@2 374 This technique can be extended to handle all simple variable assignments which
appear as statements in blocks, even if arbitrary GO TO statements also appear
rlm@2 376 in such blocks. {Note Mccarthywins}
rlm@2 377
For example, here is a program which uses GO TO statements in the form
rlm@2 379 presented before; it determines the parity of a non-negative integer by
rlm@2 380 counting it down until it reaches zero.
rlm@2 381
rlm@2 382 #+begin_src algol
rlm@2 383 begin
rlm@2 384 L1: if a = 0 then begin parity := 0; go to L2; end;
rlm@2 385 a := a - 1;
rlm@2 386 if a = 0 then begin parity := 1; go to L2; end;
rlm@2 387 a := a - 1;
rlm@2 388 go to L1;
L2: print(parity);
end
rlm@2 390 #+end_src
rlm@2 391
rlm@2 392 This can be expressed in SCHEME:
rlm@2 393
rlm@2 394 #+begin_src clojure
;; letfn provides the mutually recursive label functions; `a` is assumed to be
;; bound to the input integer (parity needs no initial value, so nil is passed)
(letfn [(L1 [a parity]
          (if (zero? a) (L2 a 0)
              (L3 (dec a) parity)))
        (L3 [a parity]
          (if (zero? a) (L2 a 1)
              (L1 (dec a) parity)))
        (L2 [a parity]
          (println parity))]
  (L1 a nil))
rlm@2 405 #+end_src
rlm@2 406
rlm@2 407 The trick is to pass the set of variables which may be altered as arguments to
rlm@2 408 the label functions. {Note Flowgraph} It may be necessary to introduce new
rlm@2 409 labels (such as L3) so that an assignment may be transformed into the binding
rlm@2 410 for a function call. At worst, one may need as many labels as there are
rlm@2 411 statements (not counting the eliminated assignment and GO TO statements).
rlm@2 412
** Compound Expressions
rlm@2 414 At this point we are almost in a position to model the most general
form of compound statement. In LISP, this is called the ``PROG feature''. In
addition to statement sequencing and GO TO statements, one can return a /value/
rlm@2 417 from a PROG by using the RETURN statement.
rlm@2 418
rlm@2 419 Let us first consider the simplest compound statement, which in SCHEME
rlm@2 420 we call BLOCK. Recall that
=(BLOCK S1 S2)= is defined to be =((lambda (dummy) S2) S1)=
rlm@2 422
Notice that this not only performs S1 before S2, but also returns the value of
rlm@2 424 S2. Furthermore, we defined =(BLOCK S1 S2 ... Sn)= so that it returns the value
rlm@2 425 of =Sn=. Thus BLOCK may be used as a compound expression, and models a LISP
PROGN, which is a PROG with no GO TO statements and an implicit RETURN of the
rlm@2 427 last ``statement'' (really an expression).
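
As a concrete check of this claim (our example, with arbitrary expressions
standing in for S1 and S2):

#+begin_src clojure
;; (BLOCK S1 S2) with S1 = (println "one") and S2 = (* 6 7):
;; S1 is evaluated first, purely for effect; the value of S2, 42, is returned.
((fn [dummy] (* 6 7)) (println "one"))
#+end_src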
rlm@2 428
Most LISP compilers compile DO expressions by macro-expansion. We have
already seen one way to do this in SCHEME using only variable binding. A more
common technique is to expand the DO into a PROG, using variable assignments
rlm@2 432 instead of bindings. Thus the iterative factorial program:
rlm@2 433
#+begin_src nothing
(DEFINE FACT
  (LAMBDA (N)
    (DO ((M N (- M 1))
         (ANS 1 (* M ANS)))
        ((= M 0) ANS))))
rlm@2 440 #+end_src
rlm@2 441
rlm@2 442 would expand into:
rlm@2 443
#+begin_src nothing
(DEFINE FACT
  (LAMBDA (N)
    (PROG (M ANS)
      (SSETQ M N
             ANS 1)
   LP (IF (= M 0) (RETURN ANS))
      (SSETQ M (- M 1)
             ANS (* M ANS))
      (GO LP))))
rlm@2 454 #+end_src
rlm@2 455
rlm@2 456 where SSETQ is a simultaneous multiple assignment operator. (SSETQ is not a
rlm@2 457 SCHEME (or LISP) primitive. It can be defined in terms of a single assignment
rlm@2 458 operator, but we are more interested here in RETURN than in simultaneous
rlm@2 459 assignment. The SSETQ's will all be removed anyway and modelled by lambda
binding.) We can apply the same technique we used before to eliminate GO TO
rlm@2 461 statements and assignments from compound statements:
rlm@2 462
#+begin_src nothing
(DEFINE FACT
  (LAMBDA (N)
    (LABELS ((L1 (LAMBDA (M ANS)
                   (LP N 1)))
             (LP (LAMBDA (M ANS)
                   (IF (= M 0) (RETURN ANS)
                       (L2 M ANS))))
             (L2 (LAMBDA (M ANS)
                   (LP (- M 1) (* M ANS)))))
      (L1 NIL NIL))))
#+end_src
rlm@2 475
rlm@2 476 We still haven't done anything about RETURN. Let's see...
- ==> the value of (FACT 0) is the value of (L1 NIL NIL)
rlm@2 478 - ==> which is the value of (LP 0 1)
rlm@2 479 - ==> which is the value of (IF (= 0 0) (RETURN 1) (L2 0 1))
rlm@2 480 - ==> which is the value of (RETURN 1)
rlm@2 481
rlm@2 482 Notice that if RETURN were the /identity
rlm@2 483 function/ (LAMBDA (X) X), we would get the right answer. This is in fact a
general truth: if we just replace a call to RETURN with its argument, then
rlm@2 485 our old transformation on compound statements extends to general compound
rlm@2 486 expressions, i.e. PROG.
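
To make the trace above concrete, here is a sketch (ours, not the paper's) of
that LABELS expansion in Clojure, with =letfn= providing the mutually recursive
labels and RETURN replaced by =identity= as just described:

#+begin_src clojure
(in-ns 'categorical.imperative)
(defn fact-prog [n]
  (letfn [(L1 [m ans] (LP n 1))
          (LP [m ans] (if (= m 0)
                        (identity ans)   ; RETURN modelled as the identity function
                        (L2 m ans)))
          (L2 [m ans] (LP (- m 1) (* m ans)))]
    (L1 nil nil)))

;; (fact-prog 5) ;=> 120
#+end_src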
rlm@2 487
rlm@2 488 * Continuations
rlm@2 489 Up to now we have thought of SCHEME's LAMBDA expressions as functions,
rlm@2 490 and of a combination such as =(G (F X Y))= as meaning ``apply the function F to
rlm@2 491 the values of X and Y, and return a value so that the function G can be
rlm@2 492 applied and return a value ...'' But notice that we have seldom used LAMBDA
rlm@2 493 expressions as functions. Rather, we have used them as control structures and
rlm@2 494 environment modifiers. For example, consider the expression:
rlm@2 495 =(BLOCK (PRINT 3) (PRINT 4))=
rlm@2 496
rlm@2 497 This is defined to be an abbreviation for:
rlm@2 498 =((LAMBDA (DUMMY) (PRINT 4)) (PRINT 3))=
rlm@2 499
rlm@2 500 We do not care about the value of this BLOCK expression; it follows that we
rlm@2 501 do not care about the value of the =(LAMBDA (DUMMY) ...)=. We are not using
rlm@2 502 LAMBDA as a function at all.
rlm@2 503
It is possible to write useful programs in terms of LAMBDA expressions
rlm@2 505 in which we never care about the value of /any/ lambda expression. We have
rlm@2 506 already demonstrated how one could represent any "FORTRAN" program in these
rlm@2 507 terms: all one needs is PROG (with G0 and SETQ), and PRINT to get the answers
out. The ultimate generalization of this imperative programming style is
/continuation-passing/. {Note Churchwins}
rlm@2 510
rlm@2 511 ** Continuation-Passing Recursion
rlm@2 512 Consider the following alternative definition of FACT. It has an extra
rlm@2 513 argument, the continuation, which is a function to call with the answer, when
we have it, rather than returning a value.
rlm@2 515
#+begin_src algol
procedure fact(n, c); value n, c;
    integer n; procedure c(integer value);
if n=0 then c(1) else
begin
    procedure temp(a); value a; integer a;
        c(n*a);
    fact(n-1, temp);
end;
rlm@2 525 #+end_src
rlm@2 526
rlm@2 527 #+begin_src clojure
rlm@2 528 (in-ns 'categorical.imperative)
rlm@2 529 (defn fact-cps[n k]
rlm@2 530 (if (zero? n) (k 1)
rlm@2 531 (recur (dec n) (fn[a] (k (* n a))))))
#+end_src
It is fairly clumsy to use this version of FACT in ALGOL; it is necessary to
rlm@2 534 do something like this:
rlm@2 535
rlm@2 536 #+begin_src algol
rlm@2 537 begin
    integer ans;
    procedure temp(x); value x; integer x;
        ans := x;
    fact(3, temp);
    comment Now the variable "ans" has 6;
rlm@2 543 end;
rlm@2 544 #+end_src
rlm@2 545
rlm@2 546 Procedure =fact= does not return a value, nor does =temp=; we must use a side
rlm@2 547 effect to get the answer out.
rlm@2 548
rlm@2 549 =FACT= is somewhat easier to use in SCHEME. We can call it like an
rlm@2 550 ordinary function in SCHEME if we supply an identity function as the second
rlm@2 551 argument. For example, =(FACT 3 (LAMBDA (X) X))= returns 6. Alternatively, we
could write =(FACT 3 (LAMBDA (X) (PRINT X)))=; we do not care about the value
rlm@2 553 of this, but about what gets printed. A third way to use the value is to
rlm@2 554 write
rlm@2 555 =(FACT 3 (LAMBDA (x) (SQRT x)))=
rlm@2 556 instead of
rlm@2 557 =(SQRT (FACT 3 (LAMBDA (x) x)))=
rlm@2 558
rlm@2 559 In either of these cases we care about the value of the continuation given to
rlm@2 560 FACT. Thus we care about the value of FACT if and only if we care about the
rlm@2 561 value of its continuation!
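
In Clojure, the three uses just described look like this (a sketch of ours;
=identity= plays the role of =(LAMBDA (X) X)=, and =Math/sqrt= stands in for SQRT):

#+begin_src clojure
(fact-cps 3 identity)                ;=> 6
(fact-cps 3 println)                 ; prints 6; we care only about the printing
(fact-cps 3 (fn [x] (Math/sqrt x)))  ;=> the square root of 6
#+end_src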
rlm@2 562
rlm@2 563 We can redefine other functions to take continuations in the same way.
rlm@2 564 For example, suppose we had arithmetic primitives which took continuations; to
prevent confusion, call the version of =+= which takes a continuation =+&=,
and so on. Instead of writing
=(- (* B B) (* 4 A C))=
rlm@2 568 we can write
rlm@2 569 #+begin_src clojure
rlm@2 570 (in-ns 'categorical.imperative)
(defn enable-continuation
  "Makes f take a continuation as an additional (final) argument."
  [f]
  (fn [& args] ((last args) (apply f (butlast args)))))
rlm@2 573 (def +& (enable-continuation +))
rlm@2 574 (def *& (enable-continuation *))
rlm@2 575 (def -& (enable-continuation -))
rlm@2 576
rlm@2 577
rlm@2 578 (defn quadratic[a b c k]
rlm@2 579 (*& b b
rlm@2 580 (fn[x] (*& 4 a c
rlm@2 581 (fn[y]
rlm@2 582 (-& x y k))))))
rlm@2 583 #+end_src
rlm@2 584
rlm@2 585 where =k= is the continuation for the entire expression.
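
For instance, with the definitions above, computing the discriminant of
$x^2 + 2x + 3$ and handing the result to =println= as the continuation:

#+begin_src clojure
(quadratic 1 2 3 println)   ; prints -8, the value of b^2 - 4ac
#+end_src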
rlm@2 586
rlm@2 587 This is an obscure way to write an algebraic expression, and we would
rlm@2 588 not advise writing code this way in general, but continuation-passing brings
rlm@2 589 out certain important features of the computation:
rlm@2 590 - [1] The operations to be performed appear in the order in which they are
rlm@2 591 performed. In fact, they /must/ be performed in this
  order. Continuation-passing removes the need for the rule about
  left-to-right argument evaluation. {Note Evalorder}
rlm@2 594 - [2] In the usual applicative expression there are two implicit
rlm@2 595 temporary values: those of =(* B B)= and =(* 4 A C)=. The first of
rlm@2 596 these values must be preserved over the computation of the second,
rlm@2 597 whereas the second is used as soon as it is produced. These facts
  are /manifest/ in the appearance and use of the variables X and Y in
rlm@2 599 the continuation-passing version.
rlm@2 600
rlm@2 601 In short, the continuation-passing version specifies /exactly/ and
rlm@2 602 explicitly what steps are necessary to compute the value of the
rlm@2 603 expression. One can think of conventional functional application
rlm@2 604 for value as being an abbreviation for the more explicit
rlm@2 605 continuation-passing style. Alternatively, one can think of the
rlm@2 606 interpreter as supplying to each function an implicit default
rlm@2 607 continuation of one argument. This continuation will receive the
rlm@2 608 value of the function as its argument, and then carry on the
rlm@2 609 computation. In an interpreter this implicit continuation is
rlm@2 610 represented by the control stack mechanism for function returns.
rlm@2 611 Now consider what computational steps are implied by:
rlm@2 612
=(LAMBDA (A B C ...) (F X Y Z ...))=. When we ``apply'' the LAMBDA
expression, we have some values to apply it to; we let the names A,
rlm@2 615 B, C ... refer to these values. We then determine the values of X,
rlm@2 616 Y, Z ... and pass these values (along with "the buck",
rlm@2 617 i.e. control!) to the lambda expression F (F is either a lambda
rlm@2 618 expression or a name for one). Passing control to F is an
unconditional transfer. {Note Jrsthack} {Note Hewitthack} Note that
rlm@2 620 we want values from X, Y, Z, ... If these are simple expressions,
rlm@2 621 such as variables, constants, or LAMBDA expressions, the evaluation
rlm@2 622 process is trivial, in that no temporary storage is required. In
rlm@2 623 pure continuation-passing style, all evaluations are trivial: no
combination is nested within another, and therefore no ``hidden
temporaries'' are required. But if X is a combination, say (G P Q),
rlm@2 626 then we want to think of G as a function, because we want a value
rlm@2 627 from it, and we will need an implicit temporary place to keep the
rlm@2 628 result while evaluating Y and Z. (An interpreter usually keeps these
rlm@2 629 temporary places in the control stack!) On the other hand, we do not
rlm@2 630 necessarily need a value from F. This is what we mean by tail-
rlm@2 631 recursion: F is called tail-recursively, whereas G is not. A better
rlm@2 632 name for tail-recursion would be "tail-transfer", since no real
rlm@2 633 recursion is implied. This is why we have made such a fuss about
tail-recursion: it can be used for transfer of control without
making any commitment about whether the expression is expected to
return a value. Thus it can be used to model statement-like control
structures. Put another way, tail-recursion does not require a
rlm@2 638 control stack as nested recursion does. In our models of iteration
rlm@2 639 and imperative style all the LAMBDA expressions used for control (to
rlm@2 640 simulate GO statements, for example) are called in tail-recursive
rlm@2 641 fashion.
rlm@2 642
rlm@2 643 ** Escape Expressions
rlm@2 644 Reynolds [Reynolds 72] defines the construction
rlm@2 645 = escape x in r =
rlm@2 646
rlm@2 647 to evaluate the expression $r$ in an environment such that the variable $x$ is
rlm@2 648 bound to an escape function. If the escape function is never applied, then
rlm@2 649 the value of the escape expression is the value of $r$. If the escape function
rlm@2 650 is applied to an argument $a$, however, then evaluation of $r$ is aborted and the
rlm@2 651 escape expression returns $a$. {Note J-operator} (Reynolds points out that
rlm@2 652 this definition is not quite accurate, since the escape function may be called
rlm@2 653 even after the escape expression has returned a value; if this happens, it
``returns again''!)
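
Clojure has no such escape operator built in. Purely to illustrate the
semantics (this is our sketch, not one of the paper's models, and unlike a true
escape function it only works while $r$ is still being evaluated), a single-use
escape can be approximated with an exception:

#+begin_src clojure
(in-ns 'categorical.imperative)
;; (escape-in (fn [x] r)) evaluates r with x bound to an escape function.
(defn escape-in [body-fn]
  (let [tag (Object.)]
    (try
      (body-fn (fn [v] (throw (ex-info "escape" {:tag tag :value v}))))
      (catch clojure.lang.ExceptionInfo e
        (if (identical? tag (:tag (ex-data e)))
          (:value (ex-data e))
          (throw e))))))

;; (escape-in (fn [x] (+ 1 2)))     ;=> 3, x is never applied
;; (escape-in (fn [x] (+ 1 (x 0)))) ;=> 0, evaluation of the body is aborted
#+end_src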
rlm@2 655
rlm@2 656 As an example of the use of an escape expression, consider this
rlm@2 657 procedure to compute the harmonic mean of an array of numbers. If any of the
numbers is zero, we want the answer to be zero. We have a function =harmsum=
rlm@2 659 which will sum the reciprocals of numbers in an array, or call an escape
rlm@2 660 function with zero if any of the numbers is zero. (The implementation shown
rlm@2 661 here is awkward because ALGOL requires that a function return its value by
rlm@2 662 assignment.)
rlm@2 663 #+begin_src algol
rlm@2 664 begin
    real procedure harmsum(a, n, escfun);
        real array a; integer n; real procedure escfun(real);
    begin
        real sum;
        sum := 0;
        for i := 0 until n-1 do
        begin
            if a[i]=0 then escfun(0);
            sum := sum + 1/a[i];
        end;
        harmsum := sum;
    end;
    real array b[0:99];
    print(escape x in 100/harmsum(b, 100, x));
end
rlm@2 680 #+end_src
rlm@2 681
rlm@2 682 If harmsum exits normally, the number of elements is divided by the sum and
rlm@2 683 printed. Otherwise, zero is returned from the escape expression and printed
rlm@2 684 without the division ever occurring.
rlm@2 685 This program can be written in SCHEME using the built-in escape
rlm@2 686 operator =CATCH=:
rlm@2 687
rlm@2 688 #+begin_src clojure
rlm@2 689 (in-ns 'categorical.imperative)
(defn harmonic-sum [coll escape]
  ((fn [i sum]
     (cond (= i (count coll)) sum
           (zero? (nth coll i)) (escape 0)
           :else (recur (inc i) (+ sum (/ 1 (nth coll i))))))
   0 0))
rlm@2 696
rlm@2 697 #+end_src
rlm@2 698
rlm@2 699 This actually works, but elucidates very little about the nature of ESCAPE.
rlm@2 700 We can eliminate the use of CATCH by using continuation-passing. Let us do
rlm@2 701 for HARMSUM what we did earlier for FACT. Let it take an extra argument C,
rlm@2 702 which is called as a function on the result.
rlm@2 703
rlm@2 704 #+begin_src clojure
rlm@2 705 (in-ns 'categorical.imperative)
(defn harmonic-sum-escape [coll escape k]
  ((fn [i sum]
     (cond (= i (count coll)) (k sum)
           (zero? (nth coll i)) (escape 0)
           :else (recur (inc i) (+ sum (/ 1 (nth coll i))))))
   0 0))
rlm@2 712
rlm@2 713 (let [B (range 0 100)
rlm@2 714 after-the-catch println]
rlm@2 715 (harmonic-sum-escape B after-the-catch (fn[y] (after-the-catch (/ 100 y)))))
rlm@2 716
rlm@2 717 #+end_src
rlm@2 718
rlm@2 719 Notice that if we use ESCFUN, then C does not get called. In this way the
rlm@2 720 division is avoided. This example shows how ESCFUN may be considered to be an
rlm@2 721 "alternate continuation".
rlm@2 722
rlm@2 723 ** Dynamic Variable Scoping
rlm@2 724 In this section we will consider the problem of dynamically scoped, or
rlm@2 725 "fluid", variables. These do not exist in ALGOL, but are typical of many LISP
implementations, ECL, and APL. We will see that fluid variables may be
rlm@2 727 modelled in more than one way, and that one of these is closely related to
continuation-passing.
rlm@2 729
rlm@2 730 *** Free (Global) Variables
rlm@2 731 Consider the following program to compute square roots:
rlm@2 732
#+begin_src nothing
(DEFINE SQRT
  (LAMBDA (X EPSILON)
    (PROG (ANS)
        (SETQ ANS 1.0)
     A  (COND ((< (ABS (-$ X (*$ ANS ANS))) EPSILON)
               (RETURN ANS)))
        (SETQ ANS (//$ (+$ ANS (//$ X ANS)) 2.0))
        (GO A))))
#+end_src

This function takes two arguments: the radicand and the numerical tolerance
for the approximation. Now suppose we want to write a program QUAD to compute
solutions to a quadratic equation:

#+begin_src nothing
(DEFINE QUAD
  (LAMBDA (A B C)
    ((LAMBDA (A2 SQRTDISC)
       (LIST (/ (+ (- B) SQRTDISC) A2)
             (/ (- (- B) SQRTDISC) A2)))
     (* 2 A)
     (SQRT (- (↑ B 2) (* 4 A C)) <tolerance>))))
#+end_src
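
A hypothetical Clojure rendering of SQRT (ours; the name =sqrt-eps= and the use
of =loop/recur= are not in the paper), with the tolerance passed explicitly:

#+begin_src clojure
(in-ns 'categorical.imperative)
(defn sqrt-eps [x epsilon]
  (loop [ans 1.0]
    (if (< (Math/abs (- x (* ans ans))) epsilon)
      ans
      (recur (/ (+ ans (/ x ans)) 2.0)))))

;; (sqrt-eps 2.0 1e-6) ;=> approximately 1.4142136
#+end_src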
It is not clear what to write for <tolerance>. One alternative is to pick
some tolerance for use by QUAD and write it as a constant in the code. This
is undesirable because it makes QUAD inflexible and hard to change. Another
is to make QUAD take an extra argument and pass it to SQRT:
#+begin_src nothing
(DEFINE QUAD
  (LAMBDA (A B C EPSILON)
    ... (SQRT ... EPSILON) ...))
#+end_src
This is undesirable because EPSILON is not really part of the problem QUAD is
supposed to solve, and we don't want the user to have to provide it.
Furthermore, if QUAD were built into some larger function, and that into
another, all these functions would have to pass EPSILON down as an extra
argument. A third possibility would be to pass the SQRT function as an
argument to QUAD (don't laugh!), the theory being to bind EPSILON at the
appropriate level like this:
#+begin_src nothing
(QUAD 3 4 5 (LAMBDA (X) (SQRT X <tolerance>)))
#+end_src
where we define QUAD as:
#+begin_src nothing
(DEFINE QUAD
  (LAMBDA (A B C SQRT) ...))
#+end_src