Commit cca55dd

Committed Sep 2, 2022
Port codebase/ from typescript-compiler-notes
Doesn't include the pngs from codebase/screenshots/ for the services pages. That still needs to be added and the links updated.
1 parent 5e0ae35 commit cca55dd

13 files changed, +1545 -0 lines changed
 
@@ -0,0 +1,220 @@
1+
# Binder
2+
3+
The binder walks the tree visiting each declaration in the tree.
4+
For each declaration that it finds, it creates a `Symbol` that records its location and kind of declaration.
5+
Then it stores that symbol in the `SymbolTable` of the containing node that forms the current scope, such as a function, block or module file.
6+
`Symbol`s let the checker look up names and then check their declarations to determine types.
7+
Each symbol also contains a small summary of what kind of declaration it is -- mainly whether it is a value, a type, or a namespace.
8+
9+
Since the binder is the first tree walk before checking, it also does some other tasks: setting up the control flow graph,
10+
as well as annotating parts of the tree that will need to be downlevelled for old ES targets.
11+
12+
Here's an example:
13+
14+
```ts
15+
// @Filename: main.ts
16+
var x = 1
17+
console.log(x)
18+
```
19+
20+
The only declaration in this program is `var x`, which is contained in the SourceFile node for `main.ts`.
21+
Functions and classes introduce new scopes, so they are containers -- at the same time as being declarations themselves. So in:
22+
23+
```ts
24+
function f(n: number) {
25+
const m = n + 1
26+
return m + n
27+
}
28+
```
29+
30+
The binder ends up with a symbol table for `f` that contains two entries: `n` and `m`.
31+
The binder finds `n` while walking the function's parameter list, and it finds `m` while walking the block that makes up `f`'s body.
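
As a rough illustration of the result, the table for `f` can be modelled with a plain `Map`. This is only a sketch: `BoundSymbol`, `Declaration`, `Meaning` and `record` are hypothetical names invented for this example, and the compiler's own `Symbol` and `SymbolTable` carry far more information.

```ts
// Hypothetical, simplified shapes; the compiler's own `Symbol` carries many more fields.
type Meaning = "value" | "type" | "namespace"

interface Declaration {
    kind: string   // e.g. "Parameter" or "VariableDeclaration"
    name: string
}

interface BoundSymbol {
    name: string
    meanings: Set<Meaning>
    declarations: Declaration[]
}

type SymbolTable = Map<string, BoundSymbol>

// Add a declaration to a scope's table, creating the symbol on first sight.
function record(table: SymbolTable, decl: Declaration, meaning: Meaning): void {
    let symbol = table.get(decl.name)
    if (symbol === undefined) {
        symbol = { name: decl.name, meanings: new Set<Meaning>(), declarations: [] }
        table.set(decl.name, symbol)
    }
    symbol.meanings.add(meaning)
    symbol.declarations.push(decl)
}

// The table for `f`: `n` comes from the parameter list, `m` from the body.
const fLocals: SymbolTable = new Map()
record(fLocals, { kind: "Parameter", name: "n" }, "value")
record(fLocals, { kind: "VariableDeclaration", name: "m" }, "value")
```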
32+
33+
Both `n` and `m` are marked as values.
34+
However, there's no problem with adding another declaration for `n`:
35+
36+
```ts
37+
function f(n: number) {
38+
type n = string
39+
const m = n + 1
40+
return m + n
41+
}
42+
```
43+
44+
Now `n` has two declarations, one type and one value.
45+
The binder disallows more than one declaration of the same kind for symbols with *block-scoped* declarations.
46+
Examples are `type`, `function`, `class`, `let`, `const` and parameters; *function-scoped* declarations include `var` and `interface`.
47+
But as long as the declarations are of different kinds, they're fine.
48+
49+
## Walkthrough
50+
51+
```ts
52+
function f(m: number) {
53+
type n = string
54+
const n = m + 1
55+
return m + n
56+
}
57+
```
58+
59+
The binder's basic tree walk starts in `bind`.
60+
There, it first encounters `f` and calls `bindFunctionDeclaration` and then `bindBlockScopedDeclaration` with `SymbolFlags.Function`.
61+
This function has special cases for files and modules, but the default case calls `declareSymbol` to add a symbol in the current container.
62+
There is a lot of special-case code in `declareSymbol`, but the important path is to check whether the symbol table already contains a symbol with the name of the declaration -- `f` in this case.
63+
If not, a new symbol is created.
64+
If so, the old symbol's exclude flags are checked against the new symbol's flags.
65+
If they conflict, the binder issues an error.
66+
67+
Finally, the new symbol's `flags` are added to the old symbol's `flags` (if any), and the new declaration is added to the symbol's `declarations` array.
68+
In addition, if the new declaration is for a value, it is set as the symbol's `valueDeclaration`.
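
The important path can be sketched roughly as follows. This is a simplified model that follows the description above, not the compiler's code: the `Sym` and `Decl` shapes, the `errors` array, the error text and the two bit values are all assumptions made for illustration.

```ts
// Hypothetical, simplified model of the path described above; not the compiler's code.
interface Decl { name: string; isValue: boolean }

interface Sym {
    name: string
    flags: number      // union of the flags of every declaration so far
    excludes: number   // union of the exclude flags of every declaration so far
    declarations: Decl[]
    valueDeclaration?: Decl
}

const errors: string[] = []

function declareSymbol(table: Map<string, Sym>, decl: Decl, flags: number, excludes: number): Sym {
    let symbol = table.get(decl.name)
    if (symbol === undefined) {
        // No symbol with this name yet: create one.
        symbol = { name: decl.name, flags: 0, excludes: 0, declarations: [] }
        table.set(decl.name, symbol)
    } else if ((symbol.excludes & flags) !== 0) {
        // Earlier declarations exclude this kind of declaration.
        errors.push(`Duplicate identifier '${decl.name}'.`)
    }
    symbol.flags |= flags
    symbol.excludes |= excludes
    symbol.declarations.push(decl)
    if (decl.isValue) symbol.valueDeclaration = decl
    return symbol
}

// Illustrative bits only; the real flags are far more fine-grained.
const VALUE = 1 << 0
const TYPE = 1 << 1

const locals = new Map<string, Sym>()
declareSymbol(locals, { name: "n", isValue: false }, TYPE, TYPE)    // type n = string
declareSymbol(locals, { name: "n", isValue: true }, VALUE, VALUE)   // const n = m + 1
const nSymbol = declareSymbol(locals, { name: "n", isValue: true }, VALUE, VALUE) // a second const n
```

In this model the first two declarations of `n` merge without complaint, and only the second value declaration records an error.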
69+
70+
## Containers
71+
72+
After `declareSymbol` is done, `bind` visits the children of `f`; `f` is a container, so it calls `bindContainer` before `bindChildren`.
73+
The binder is recursive, so it pushes `f` as the new container by copying it to a local variable before walking its children.
74+
It pops `f` by copying the stored local back into `container`.
75+
76+
The binder tracks the current lexical container as a pair of variables `container` and `blockScopedContainer` (and `thisParentContainer` if you OOP by mistake).
77+
They're implemented as global variables managed by the binder walk, which pushes and pops containers as needed.
78+
The container's symbol table is initialised lazily, by `bindBlockScopedDeclaration`, for example.
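
The push and pop pattern looks roughly like the following miniature walk. It is a sketch under assumptions: `TreeNode`, `visited` and this tiny `bind` are invented for the example, and it tracks a single `container` rather than the pair of variables the binder uses.

```ts
// Hypothetical miniature of the walk; node shapes and names are assumptions.
interface TreeNode {
    name: string
    isContainer: boolean
    children: TreeNode[]
    locals?: Map<string, TreeNode>   // created lazily, only when something is declared here
}

let container: TreeNode | undefined  // binder-wide state shared by the whole walk
const visited: string[] = []         // "<container> > <node>", recorded for illustration

function bind(node: TreeNode): void {
    const parent = container
    if (parent !== undefined) {
        // Declare the node in the current container, creating its table on first use.
        const table = parent.locals ?? (parent.locals = new Map<string, TreeNode>())
        table.set(node.name, node)
        visited.push(`${parent.name} > ${node.name}`)
    }
    if (node.isContainer) {
        const saveContainer = container   // "push": remember the enclosing container
        container = node
        node.children.forEach(child => bind(child))
        container = saveContainer         // "pop": restore it once the children are done
    } else {
        node.children.forEach(child => bind(child))
    }
}

const leaf = (name: string): TreeNode => ({ name, isContainer: false, children: [] })
const f: TreeNode = { name: "f", isContainer: true, children: [leaf("n"), leaf("m")] }
const file: TreeNode = { name: "main.ts", isContainer: true, children: [f, leaf("x")] }
bind(file)
```

Because the saved container lives in a local variable, the call stack itself acts as the container stack, so no explicit stack data structure is needed.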
79+
80+
## Flags
81+
82+
The table for which symbols may merge with each other is complicated, but it's implemented in a surprisingly small space using the bitflag enum `SymbolFlags`.
83+
The downside is that the bitflag system is very confusing.
84+
85+
The basic rule is that a new declaration's *flags* may not conflict with the *excludes flags* of any previous declarations.
86+
Each kind of declaration has its own exclude flags; each one is a list of declaration kinds that cannot merge with that declaration.
87+
88+
In the example above, `type n` is a type alias, which has flags = `SymbolFlags.TypeAlias` and excludeFlags = `SymbolFlags.TypeAliasExcludes`.
89+
The latter is an alias of `SymbolFlags.Type`, meaning generally that type aliases can't merge with anything that declares a type:
90+
91+
```ts
92+
Type = Class | Interface | Enum | EnumMember | TypeLiteral | TypeParameter | TypeAlias
93+
```
94+
Notice that this list includes `TypeAlias` itself, and declarations like classes and enums that also declare values.
95+
`Value` includes `Class` and `Enum` as well.
96+
97+
Next, when the binder reaches `const n`, it uses the flag `BlockScopedVariable` and excludeFlags `BlockScopedVariableExcludes`.
98+
`BlockScopedVariableExcludes = Value`, which is a list of every kind of value declaration.
99+
100+
```ts
101+
Value = Variable | Property | EnumMember | ObjectLiteral | Function | Class | Enum | ValueModule | Method | GetAccessor | SetAccessor
102+
```
103+
104+
`declareSymbol` looks up the existing excludeFlags for `n` and makes sure that `BlockScopedVariable` doesn't conflict; `(BlockScopedVariable & Type) === 0` so it doesn't.
105+
Then it *or*s the new and old flags and the new and old excludeFlags.
106+
In this example, that will prevent more value declarations because `(BlockScopedVariable & (Value | Type)) !== 0`.
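
The arithmetic for `n` can be worked through with a cut-down set of flags. The bit positions below are illustrative assumptions, not the values the real `SymbolFlags` enum assigns, and `Enum` is treated as a single bit for simplicity; the `Type` and `Value` unions follow the lists above.

```ts
// Illustrative bit positions only; the real `SymbolFlags` enum assigns its own values.
const F = {
    FunctionScopedVariable: 1 << 0,
    BlockScopedVariable: 1 << 1,
    Property: 1 << 2,
    EnumMember: 1 << 3,
    Function: 1 << 4,
    Class: 1 << 5,
    Interface: 1 << 6,
    Enum: 1 << 7,
    ValueModule: 1 << 8,
    TypeLiteral: 1 << 9,
    ObjectLiteral: 1 << 10,
    Method: 1 << 11,
    GetAccessor: 1 << 12,
    SetAccessor: 1 << 13,
    TypeParameter: 1 << 14,
    TypeAlias: 1 << 15,
}

const Variable = F.FunctionScopedVariable | F.BlockScopedVariable
const Value = Variable | F.Property | F.EnumMember | F.ObjectLiteral | F.Function | F.Class
    | F.Enum | F.ValueModule | F.Method | F.GetAccessor | F.SetAccessor
const Type = F.Class | F.Interface | F.Enum | F.EnumMember | F.TypeLiteral | F.TypeParameter | F.TypeAlias

// After `type n = string`: flags = TypeAlias, excludes = TypeAliasExcludes = Type.
let flags = F.TypeAlias
let excludes = Type

// `const n` is allowed because its flag is not among the excludes so far.
const constAllowed = (F.BlockScopedVariable & excludes) === 0

// Merge: or the flags together and or the excludes together.
flags |= F.BlockScopedVariable
excludes |= Value   // BlockScopedVariableExcludes = Value

// Any further value or type declaration of `n` now conflicts.
const secondConstAllowed = (F.BlockScopedVariable & excludes) === 0
const interfaceAllowed = (F.Interface & excludes) === 0
```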
107+
108+
Here's some half-baked example code which shows off what you'd write if SymbolFlags used string enums and sets instead of bitflags.
109+
110+
```ts
111+
const existing = symbolTable.get(name)
112+
const flags = SymbolFlags[declaration.kind] // eg "Function"
113+
if (existing.excludeFlags.has(flags)) {
114+
error("Cannot redeclare", name)
115+
}
116+
existing.flags.add(flags)
117+
for (const ex of ExcludeFlags[declaration.kind]) {
118+
existing.excludeFlags.add(ex)
119+
}
120+
```
121+
122+
## Cross-file global merges
123+
124+
Because the binder only binds one file at a time, the above system for merges only works with single files.
125+
For global (aka script) files, declarations can merge across files.
126+
This happens in the checker in `initializeTypeChecker`, using `mergeSymbolTable`.
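
Conceptually, the merge folds each file's table into one shared table. The sketch below only conveys that idea: the signature, the `Sym` shape and the flag values are assumptions, and it omits the conflict checks and error reporting that a real merge needs.

```ts
// Hypothetical sketch; the checker's real `mergeSymbolTable` handles many more cases.
interface Sym { flags: number; declarations: string[] }
type SymbolTable = Map<string, Sym>

function mergeSymbolTable(target: SymbolTable, source: SymbolTable): void {
    source.forEach((sourceSymbol, id) => {
        const targetSymbol = target.get(id)
        if (targetSymbol === undefined) {
            // First file to declare this name: copy the symbol so the source stays untouched.
            target.set(id, { flags: sourceSymbol.flags, declarations: sourceSymbol.declarations.slice() })
        } else {
            // Same name seen in an earlier file: combine flags and declarations.
            targetSymbol.flags |= sourceSymbol.flags
            targetSymbol.declarations.push(...sourceSymbol.declarations)
        }
    })
}

// Illustrative bits only.
const INTERFACE = 1 << 0
const FUNCTION = 1 << 1

const fileA: SymbolTable = new Map()
fileA.set("Options", { flags: INTERFACE, declarations: ["a.ts: interface Options"] })

const fileB: SymbolTable = new Map()
fileB.set("Options", { flags: INTERFACE, declarations: ["b.ts: interface Options"] })
fileB.set("start", { flags: FUNCTION, declarations: ["b.ts: function start"] })

const globals: SymbolTable = new Map()
mergeSymbolTable(globals, fileA)
mergeSymbolTable(globals, fileB)
```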
127+
128+
## Special names
129+
130+
In `declareSymbol`, `getDeclarationName` translates certain nodes into internal names.
131+
`export=`, for example, gets translated to `InternalSymbolName.ExportEquals`.
132+
133+
Elsewhere in the binder, function expressions without names get `"__function"`.
134+
Computed property names that aren't literals get `"__computed"`, manually.
135+
136+
TODO: Finish this
137+
138+
## Control Flow
139+
140+
TODO: Missing completely
141+
142+
## Emit flags
143+
144+
TODO: Missing completely
145+
146+
## Exports
147+
148+
TODO: Missing completely
149+
150+
## Javascript and CommonJS
151+
152+
Javascript has additional types of declarations that it recognises, which fall into 3 main categories:
153+
154+
1. Constructor functions and pre-class-field classes: assignments to `this.x` properties.
155+
2. CommonJS: assignments to `module.exports`.
156+
3. Global browser code: assignments to namespace-like object literals.
157+
4. JSDoc declarations: tags like `@type` and `@callback`.
158+
159+
Four! Four main categories!
160+
161+
The first three categories really aren't much different from Typescript declarations.
162+
The main complication is that not all assignments are declarations, so there's quite a bit of code that decides which assignments should be treated as declarations.
163+
The checker is fairly resilient to non-declaration assignments being included, so it's OK if the code isn't perfect.
164+
165+
In `bindWorker`'s `BinaryExpression` case, `getAssignmentDeclarationKind` is used to decide whether an assignment matches the syntactic requirements for declarations.
166+
Then each kind of assignment dispatches to a different binding function.
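
To give a feel for the classification step, here is a toy classifier that looks only at the dotted name on the left of an assignment. It is a loose illustration, not how `getAssignmentDeclarationKind` works: the real function inspects syntax nodes (including the right-hand side), and `AssignmentKind` and `classifyAssignmentTarget` are hypothetical names.

```ts
// Toy classification by left-hand side text; a hypothetical stand-in for the real syntactic checks.
type AssignmentKind = "None" | "ThisProperty" | "ModuleExports" | "ExportsProperty" | "Property"

function classifyAssignmentTarget(target: string): AssignmentKind {
    const parts = target.split(".")
    if (parts.length < 2) return "None"                 // plain `x = ...` is not a declaration here
    if (parts[0] === "this") return "ThisProperty"      // this.x = ...
    if (target === "module.exports") return "ModuleExports"
    if (parts[0] === "exports" || (parts[0] === "module" && parts[1] === "exports")) {
        return "ExportsProperty"                        // exports.foo = ... or module.exports.foo = ...
    }
    return "Property"                                   // e.g. Namespace.Mod1 = {}
}

const kinds = ["this.x", "module.exports", "module.exports.foo", "Namespace.Mod1", "x"]
    .map(classifyAssignmentTarget)
```

Each resulting kind would then be handed to its own binding function, as described above.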
167+
168+
### Global namespace creation code
169+
170+
In addition to CommonJS, JS also supports creating global namespaces by assignments of object literals, functions and classes to global variables.
171+
This code is very complicated and is *probably* only ever used by Closure code bases, so it might be possible to remove it someday.
172+
173+
```js
174+
var Namespace = {}
175+
Namespace.Mod1 = {}
176+
Namespace.Mod2 = function () {
177+
// callable module!
178+
}
179+
Namespace.Mod2.Sub1 = {
180+
// actual contents
181+
}
182+
```
183+
184+
TODO: This is unfinished.
185+
186+
### JSDoc declarations
187+
188+
TODO: This is unfinished.
189+
190+
### Conflicting object literal export assignments
191+
192+
One particularly complex case of CommonJS binding occurs when there is an object literal export assignment in the same module as `module.exports` assignments:
193+
194+
```js
195+
module.exports = {
196+
foo: function() { return 1 },
197+
bar: function() { return 'bar' },
198+
baz: 12,
199+
}
200+
if (isWindows) {
201+
// override 'foo' with Windows-specific version
202+
module.exports.foo = function () { return 11 }
203+
}
204+
```
205+
206+
In this case, the desired exports of the file are `foo, bar, baz`.
207+
Even though `foo` is declared twice, it should have one export with two declarations.
208+
The type should be `() => number`, though that's the responsibility of the checker.
209+
210+
In fact, this structure is too complicated to build in the binder, so the checker produces it through merges, using the same merge infrastructure it uses for cross-file global merges.
211+
The binder treats this pretty straightforwardly; it calls `bindModuleExportsAssignment` for `module.exports = {...`, which creates a single `export=` export.
212+
Then it calls `bindExportsPropertyAssignment` for `module.exports.foo = ...`, which creates a `foo` export.
213+
214+
Having `export=` with other exports is impossible with ES module syntax, so the checker detects it and copies all the top-level exports into the `export=`.
215+
In the checker, `resolveExternalModuleSymbol` returns either an entire module's exports, or all the exports in an `export=`.
216+
In the combined CommonJS case we're discussing, `getCommonJsExportEquals` also checks whether a module has exports *and* `export=`.
217+
If it does, it copies each of the top-level exports into the `export=`.
218+
If a property with the same name already exists in the `export=`, the two are merged with `mergeSymbol`.
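
The copy-and-merge step might be pictured like this. It is a sketch under assumptions: `ExportSymbol`, `mergeIntoExportEquals` and this one-line `mergeSymbol` are invented for the example and stand in for much richer checker logic.

```ts
// Hypothetical sketch of the merge described above; names and shapes are assumptions.
interface ExportSymbol {
    name: string
    declarations: string[]
    members?: Map<string, ExportSymbol>
}

// Stand-in for the checker's merge: here it only combines the declaration lists.
function mergeSymbol(target: ExportSymbol, source: ExportSymbol): void {
    target.declarations.push(...source.declarations)
}

function mergeIntoExportEquals(moduleExports: Map<string, ExportSymbol>): ExportSymbol | undefined {
    const exportEquals = moduleExports.get("export=")
    if (exportEquals === undefined) return undefined
    const members = exportEquals.members ?? (exportEquals.members = new Map<string, ExportSymbol>())
    moduleExports.forEach((symbol, key) => {
        if (key === "export=") return
        const existing = members.get(key)
        if (existing !== undefined) mergeSymbol(existing, symbol)   // same name on both sides
        else members.set(key, symbol)                               // only a top-level export
    })
    return exportEquals
}

// Members of the object literal in `module.exports = { foo, bar, baz }`.
const literalMembers = new Map<string, ExportSymbol>()
for (const key of ["foo", "bar", "baz"]) {
    literalMembers.set(key, { name: key, declarations: [`${key} in object literal`] })
}

const fileExports = new Map<string, ExportSymbol>()
fileExports.set("export=", { name: "export=", declarations: ["module.exports = {...}"], members: literalMembers })
fileExports.set("foo", { name: "foo", declarations: ["module.exports.foo = ..."] })

const merged = mergeIntoExportEquals(fileExports)
```

After the merge, `foo` is a single member of the `export=` with two declarations, which matches the desired shape for the example file.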
219+
220+
Subsequent code in the checker that doesn't use `resolveExternalModuleSymbol` (is there any?) has to ignore the `export=`, since its contents are now just part of the module.
