12_language.md
Lines changed: 23 additions & 29 deletions
@@ -14,7 +14,7 @@ quote}}
Building your own ((programming language)) is surprisingly easy (as long as you do not aim too high) and very enlightening.
-
The main thing I want to show in this chapter is that there is no ((magic)) involved in building a programming language. I've often felt that some human inventions were so immensely clever and complicated that I'd never be able to understand them. But with a little reading and experimenting, they often turn out to be quite mundane.
+
The main thing I want to show in this chapter is that there's no ((magic)) involved in building a programming language. I've often felt that some human inventions were so immensely clever and complicated that I'd never be able to understand them. But with a little reading and experimenting, they often turn out to be quite mundane.
{{index "Egg language", [abstraction, "in Egg"]}}
@@ -49,7 +49,7 @@ do(define(x, 10),
{{index block, [syntax, "of Egg"]}}
-
The ((uniformity)) of the ((Egg language)) means that things that are ((operator))s in JavaScript (such as `>`) are normal bindings in this language, applied just like other ((function))s. And since the syntax has no concept of a block, we need a `do` construct to represent doing multiple things in sequence.
+
The ((uniformity)) of the ((Egg language)) means that things that are ((operator))s in JavaScript (such as `>`) are normal bindings in this language, applied just like other ((function))s. Since the syntax has no concept of a block, we need a `do` construct to represent doing multiple things in sequence.
-
Such a data structure is called a _((syntax tree))_. If you imagine the objects as dots and the links between them as lines between those dots, it has a ((tree))like shape. The fact that expressions contain other expressions, which in turn might contain more expressions, is similar to the way tree branches split and split again.
+
Such a data structure is called a _((syntax tree))_. If you imagine the objects as dots and the links between them as lines between those dots, as shown in the following diagram, the structure has a ((tree))like shape. The fact that expressions contain other expressions, which in turn might contain more expressions, is similar to the way tree branches split and split again.
{{figure {url: "img/syntax_tree.svg", alt: "A diagram showing the structure of the syntax tree for the example program. The root is labeled 'do' and has two children, one labeled 'define' and one labeled 'if'. Those in turn have more children, describing their content.", width: "5cm"}}}
@@ -92,7 +92,7 @@ Fortunately, this problem can be solved very well by writing a parser function t
-
We define a function `parseExpression`, which takes a string as input and returns an object containing the data structure for the expression at the start of the string, along with the part of the string left after parsing this expression. When parsing subexpressions (the argument to an application, for example), this function can be called again, yielding the argument expression as well as the text that remains. This text may in turn contain more arguments or may be the closing parenthesis that ends the list of arguments.
+
We define a function `parseExpression` that takes a string as input. It returns an object containing the data structure for the expression at the start of the string, along with the part of the string left after parsing this expression. When parsing subexpressions (the argument to an application, for example), this function can be called again, yielding the argument expression as well as the text that remains. This text may in turn contain more arguments or may be the closing parenthesis that ends the list of arguments.
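The "expression plus remaining text" return shape described above can be illustrated with a minimal sketch. The helper `parseWordSketch` is hypothetical and handles only words, and the property names `expr` and `rest` are assumptions made for this example rather than something this diff shows.

```javascript
// Hypothetical helper: parses a single word at the start of `program`.
// The property names `expr` and `rest` are assumptions for this sketch.
function parseWordSketch(program) {
  // A word is any run of characters that are not whitespace or ( ) , # "
  const match = /^[^\s(),#"]+/.exec(program);
  if (!match) throw new SyntaxError("Unexpected syntax: " + program);
  return {
    expr: {type: "word", name: match[0]},
    rest: program.slice(match[0].length)
  };
}

const first = parseWordSketch("x, 10)");
console.log(first.expr); // the word expression for "x"
console.log(first.rest); // the text that still has to be parsed
```

Returning the leftover text lets the caller decide what comes next, whether that is another argument or the closing parenthesis.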
This is the first part of the parser:
@@ -122,11 +122,11 @@ function skipSpace(string) {
-
Because Egg, like JavaScript, allows any amount of whitespace between its elements, we have to repeatedly cut the whitespace off the start of the program string. That is what the `skipSpace` function helps with.
+
Because Egg, like JavaScript, allows any amount of whitespace between its elements, we have to repeatedly cut the whitespace off the start of the program string. The `skipSpace` function helps with this.
-
After skipping any leading space, `parseExpression` uses three ((regular expression))s to spot the three atomic elements that Egg supports: strings, numbers, and words. The parser constructs a different kind of data structure depending on which one matches. If the input does not match one of these three forms, it is not a valid expression, and the parser throws an error. We use the `SyntaxError` constructor here. This is an exception class defined by the standard, like `Error`, but more specific.
+
After skipping any leading space, `parseExpression` uses three ((regular expression))s to spot the three atomic elements that Egg supports: strings, numbers, and words. The parser constructs a different kind of data structure depending on which expression matches. If the input does not match one of these three forms, it is not a valid expression, and the parser throws an error. We use the `SyntaxError` constructor here. This is an exception class defined by the standard, like `Error`, but more specific.
{{index "parseApply function"}}
@@ -155,13 +155,9 @@ function parseApply(expr, program) {
}
```
-
{{index parsing}}
-
-
If the next character in the program is not an opening parenthesis, this is not an application, and `parseApply` returns the expression it was given.
+
{{index parsing, recursion}}
-
{{index recursion}}
-
-
Otherwise, it skips the opening parenthesis and creates the ((syntax tree)) object for this application expression. It then recursively calls `parseExpression` to parse each argument until a closing parenthesis is found. The recursion is indirect, through `parseApply` and `parseExpression` calling each other.
+
If the next character in the program is not an opening parenthesis, this is not an application, and `parseApply` returns the expression it was given. Otherwise, it skips the opening parenthesis and creates the ((syntax tree)) object for this application expression. It then recursively calls `parseExpression` to parse each argument until a closing parenthesis is found. The recursion is indirect, through `parseApply` and `parseExpression` calling each other.
Because an application expression can itself be applied (such as in `multiplier(2)(1)`), `parseApply` must, after it has parsed an application, call itself again to check whether another pair of parentheses follows.
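As an illustration of why the extra check is needed, here is a hand-written sketch of the tree that `multiplier(2)(1)` could produce. The node shapes (`type`, `operator`, `args`, `name`, `value`) are assumptions for this example; the point is only that the operator of the outer application is itself an application.

```javascript
// Assumed node shapes: {type: "value"}, {type: "word"}, and {type: "apply"}.
// Hand-built tree for the Egg expression multiplier(2)(1).
const nested = {
  type: "apply",
  // The thing being applied is the result of multiplier(2)...
  operator: {
    type: "apply",
    operator: {type: "word", name: "multiplier"},
    args: [{type: "value", value: 2}]
  },
  // ...and it is applied to the argument 1.
  args: [{type: "value", value: 1}]
};

console.log(nested.operator.type);          // "apply"
console.log(nested.operator.operator.name); // "multiplier"
```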
@@ -227,21 +223,21 @@ function evaluate(expr, scope) {
{{index "literal expression", scope}}
-
The evaluator has code for each of the ((expression)) types. A literal value expression produces its value. (For example, the expression `100`just evaluates to the number 100.) For a binding, we must check whether it is actually defined in the scope and, if it is, fetch the binding's value.
+
The evaluator has code for each of the ((expression)) types. A literal value expression produces its value. (For example, the expression `100` evaluates to the number 100.) For a binding, we must check whether it is actually defined in the scope and, if it is, fetch the binding's value.
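The two simple cases can be sketched in isolation. The function `evaluateAtom` and the node shapes below are assumptions for illustration; this is not the chapter's `evaluate`, and it leaves applications out entirely.

```javascript
// Hypothetical sketch covering only literal values and bindings.
function evaluateAtom(expr, scope) {
  if (expr.type == "value") {
    // A literal simply produces its value.
    return expr.value;
  } else if (expr.type == "word") {
    // A binding must exist in the scope before its value can be fetched.
    if (expr.name in scope) return scope[expr.name];
    throw new ReferenceError(`Undefined binding: ${expr.name}`);
  }
  throw new TypeError(`Not an atomic expression: ${expr.type}`);
}

// A prototype-less object, so only explicitly defined bindings are found.
const demoScope = Object.create(null);
demoScope.x = 10;

console.log(evaluateAtom({type: "value", value: 100}, demoScope)); // 100
console.log(evaluateAtom({type: "word", name: "x"}, demoScope));   // 10
```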
{{index [function, application]}}
-
Applications are more involved. If they are a ((special form)), like `if`, we do not evaluate anything and pass the argument expressions, along with the scope, to the function that handles this form. If it is a normal call, we evaluate the operator, verify that it is a function, and call it with the evaluated arguments.
+
Applications are more involved. If they are a ((special form)), like `if`, we do not evaluate anything; we just pass the argument expressions, along with the scope, to the function that handles this form. If it is a normal call, we evaluate the operator, verify that it is a function, and call it with the evaluated arguments.
-
We use plain JavaScript function values to represent Egg's function values. We will come back to this [later](language#egg_fun), when the special form called `fun` is defined.
+
We use plain JavaScript function values to represent Egg's function values. We will come back to this [later](language#egg_fun), when the special form `fun` is defined.
-
The recursive structure of `evaluate` resembles the similar structure of the parser, and both mirror the structure of the language itself. It would also be possible to combine the parser and the evaluator into one function, and evaluate during parsing. But splitting them up this way makes the program clearer and more flexible.
+
The recursive structure of `evaluate` resembles the structure of the parser, and both mirror the structure of the language itself. It would also be possible to combine the parser and the evaluator into one function and evaluate during parsing, but splitting them up this way makes the program clearer and more flexible.
{{index "Egg language", interpretation}}
-
This is really all that is needed to interpret Egg. It is that simple. But without defining a few special forms and adding some useful values to the ((environment)), you can't do much with this language yet.
+
This is really all that's needed to interpret Egg. It's that simple. But without defining a few special forms and adding some useful values to the ((environment)), you can't do much with this language yet.
## Special forms
@@ -267,11 +263,11 @@ Egg's `if` construct expects exactly three arguments. It will evaluate the first
{{index Boolean}}
-
Egg also differs from JavaScript in how it handles the condition value to `if`. It will not treat things like zero or the empty string as false, only the precise value `false`.
+
Egg also differs from JavaScript in how it handles the condition value to `if`. It will treat only the value `false` as false, not things like zero or the empty string.
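To make the difference concrete, the rule can be written as a one-line predicate and compared with JavaScript's own truthiness. The name `eggCondition` is hypothetical and exists only for this sketch.

```javascript
// Hypothetical predicate: in Egg, only the exact value false counts as false.
function eggCondition(value) {
  return value !== false;
}

// Left: how Egg would treat the value. Right: how JavaScript treats it.
console.log(eggCondition(0), Boolean(0));   // true false
console.log(eggCondition(""), Boolean("")); // true false
console.log(eggCondition(false));           // false
```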
{{index "short-circuit evaluation"}}
-
The reason we need to represent `if` as a special form, rather than a regular function, is that all arguments to functions are evaluated before the function is called, whereas `if` should evaluate only _either_ its second or its third argument, depending on the value of the first.
+
The reason we need to represent `if` as a special form rather than a regular function is that all arguments to functions are evaluated before the function is called, whereas `if` should evaluate only _either_ its second or its third argument, depending on the value of the first.
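The effect is easy to observe in plain JavaScript. In this sketch, `ifAsFunction` and `ifAsSpecialForm` are hypothetical names, and zero-argument functions stand in for the unevaluated argument expressions that a special form would receive.

```javascript
// Records which branches actually run.
const eagerLog = [];
const lazyLog = [];

// As an ordinary function: both branch arguments are evaluated before the call.
function ifAsFunction(condition, whenTrue, whenFalse) {
  return condition !== false ? whenTrue : whenFalse;
}
const record = (label, value) => { eagerLog.push(label); return value; };
ifAsFunction(true, record("then", 1), record("else", 2));
console.log(eagerLog); // ["then", "else"]

// Special-form style: the branches arrive unevaluated and only one is run.
function ifAsSpecialForm(condition, whenTrue, whenFalse) {
  return condition() !== false ? whenTrue() : whenFalse();
}
ifAsSpecialForm(
  () => true,
  () => { lazyLog.push("then"); return 1; },
  () => { lazyLog.push("else"); return 2; }
);
console.log(lazyLog); // ["then"]
```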
-
To supply basic ((arithmetic)) and ((comparison)) ((operator))s, we will also add some function values to the ((scope)). In the interest of keeping the code short, we'll use `Function` to synthesize a bunch of operator functions in a loop, instead of defining them individually.
+
To supply basic ((arithmetic)) and ((comparison)) ((operator))s, we will also add some function values to the ((scope)). In the interest of keeping the code short, we'll use `Function` to synthesize a bunch of operator functions in a loop instead of defining them individually.
```{includeCode: true}
for (let op of ["+", "-", "*", "/", "==", "<", ">"]) {
topScope[op] = Function("a, b", `return a ${op} b;`);
}
```
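As a rough illustration of what the loop produces, the following sketch builds the same kind of functions in a separate object (`sketchScope`, a name made up here so the example stands alone) and calls a few of them directly.

```javascript
// Stand-alone copy of the technique, using a hypothetical scope object.
const sketchScope = Object.create(null);
for (const op of ["+", "-", "*", "/", "==", "<", ">"]) {
  // Function("a, b", "return a + b;") behaves like (a, b) => a + b.
  sketchScope[op] = Function("a, b", `return a ${op} b;`);
}

console.log(sketchScope["+"](1, 2)); // 3
console.log(sketchScope[">"](5, 3)); // true
console.log(sketchScope["/"](9, 3)); // 3
```

Because the operator text is spliced into source code, this approach is only reasonable when the list of operators is fixed and trusted, as it is here.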
-
A way to ((output)) values is also useful, so we'll wrap `console.log` in a function and call it `print`.
+
It is also useful to have a way to ((output)) values, so we'll wrap `console.log` in a function and call it `print`.
```{includeCode: true}
topScope.print = value => {
@@ -387,17 +383,15 @@ do(define(total, 0),
{{index "summing example", "Egg language"}}
-
This is the program we've seen several times before, which computes the sum of the numbers 1 to 10, expressed in Egg. It is clearly uglier than the equivalent JavaScript program—but not bad for a language implemented in less than 150 ((lines of code)).
+
This is the program we've seen several times before that computes the sum of the numbers 1 to 10, expressed in Egg. It is clearly uglier than the equivalent JavaScript program—but not bad for a language implemented in less than 150 ((lines of code)).
{{id egg_fun}}
## Functions
{{index function, "Egg language"}}
-
A programming language without functions is a poor programming language indeed.
-
-
Fortunately, it isn't hard to add a `fun` construct, which treats its last argument as the function's body and uses all arguments before that as the names of the function's parameters.
+
A programming language without functions is a poor programming language indeed. Fortunately, it isn't hard to add a `fun` construct, which treats its last argument as the function's body and uses all arguments before that as the names of the function's parameters.
```{includeCode: true}
specialForms.fun = (args, scope) => {
@@ -460,15 +454,15 @@ Traditionally, ((compilation)) involves converting the program to ((machine code
-
It would be possible to write an alternative ((evaluation)) strategy for Egg, one that first converts the program to a JavaScript program, uses `Function` to invoke the JavaScript compiler on it, and then runs the result. When done right, this would make Egg run very fast while still being quite simple to implement.
+
It would be possible to write an alternative ((evaluation)) strategy for Egg, one that first converts the program to a JavaScript program, uses `Function` to invoke the JavaScript compiler on it, and runs the result. When done right, this would make Egg run very fast while still being quite simple to implement.
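A very small sketch can hint at what such a strategy might look like. Everything here is an assumption made for illustration: the function name `compileToJS`, the node shapes, and the restriction to literals, words, and two-argument applications of a fixed set of operators. It is nowhere near a full compiler for Egg, and words that are not valid JavaScript identifiers would need renaming.

```javascript
// Hypothetical, deliberately tiny compiler from a syntax tree to JavaScript text.
const binaryOps = new Set(["+", "-", "*", "/", "<", ">", "=="]);

function compileToJS(expr) {
  if (expr.type == "value") return JSON.stringify(expr.value);
  if (expr.type == "word") return expr.name;
  if (expr.type == "apply" && expr.operator.type == "word" &&
      binaryOps.has(expr.operator.name) && expr.args.length == 2) {
    const [left, right] = expr.args.map(compileToJS);
    return `(${left} ${expr.operator.name} ${right})`;
  }
  throw new SyntaxError("Unsupported expression in this sketch");
}

// Hand-built tree for the Egg expression +(x, *(2, 3)).
const tree = {
  type: "apply",
  operator: {type: "word", name: "+"},
  args: [
    {type: "word", name: "x"},
    {
      type: "apply",
      operator: {type: "word", name: "*"},
      args: [{type: "value", value: 2}, {type: "value", value: 3}]
    }
  ]
};

const source = compileToJS(tree);
console.log(source); // (x + (2 * 3))

// Hand the generated text to the JavaScript compiler, then run the result.
const compiled = Function("x", `return ${source};`);
console.log(compiled(4)); // 10
```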
If you are interested in this topic and willing to spend some time on it, I encourage you to try to implement such a compiler as an exercise.
## Cheating
{{index "Egg language"}}
-
When we defined `if` and `while`, you probably noticed that they were more or less trivial wrappers around JavaScript's own `if` and `while`. Similarly, the values in Egg are just regular old JavaScript values. Bridging the gap to a more primitive system, such as the machine code the processor understands, is more effort—but the way it works resembles what we are doing here.
+
When we defined `if` and `while`, you probably noticed that they were more or less trivial wrappers around JavaScript's own `if` and `while`. Similarly, the values in Egg are just regular old JavaScript values. Bridging the gap to a more primitive system, such as the machine code the processor understands, takes more effort—but the way it works resembles what we are doing here.
Though the toy language in this chapter doesn't do anything that couldn't be done better in JavaScript, there _are_ situations where writing small languages helps get real work done.
-
This is what is usually called a _((domain-specific language))_, a language tailored to express a narrow domain of knowledge. Such a language can be more expressive than a general-purpose language because it is designed to describe exactly the things that need to be described in its domain, and nothing else.
+
This is what is usually called a _((domain-specific language))_, a language tailored to express a narrow domain of knowledge. Such a language can be more expressive than a general-purpose language because it is designed to describe exactly the things that need to be described in its domain and nothing else.
## Exercises
### Arrays
{{index "Egg language", "arrays in egg (exercise)", [array, "in Egg"]}}
-
Add support for arrays to Egg by adding the following three functions to the top scope: `array(...values)` to construct an array containing the argument values, `length(array)` to get an array's length, and `element(array, n)` to fetch the n^th^ element from an array.
+
Add support for arrays to Egg by adding the following three functions to the top scope: `array(...values)` to construct an array containing the argument values, `length(array)` to get an array's length, and `element(array, n)` to fetch the *n*th element from an array.