| 1 | +# Binder |
| 2 | + |
| 3 | +The binder walks the tree and visits each declaration. |
| 4 | +For each declaration it finds, it creates a `Symbol` that records the declaration's location and kind. |
| 5 | +Then it stores that symbol in a `SymbolTable` on the containing node (a function, block or module file) that forms the current scope. |
| 6 | +`Symbol`s let the checker look up names and then check their declarations to determine types. |
| 7 | +Each symbol also contains a small summary of what kind of declaration it is -- mainly whether it is a value, a type, or a namespace. |
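
To make the relationship between declarations, symbols and symbol tables concrete, here is a minimal sketch. The names `SimpleSymbol`, `SimpleSymbolTable` and `addDeclaration` are invented for illustration, and the shapes are far simpler than the compiler's real `Symbol` and `SymbolTable`.

```ts
// Hypothetical, simplified shapes; the real compiler types carry many more fields.
type DeclarationKind = "value" | "type" | "namespace"

interface Declaration {
  name: string
  kind: DeclarationKind
  pos: number // offset of the declaration in the source text
}

interface SimpleSymbol {
  name: string
  kinds: Set<DeclarationKind> // summary of what the symbol declares
  declarations: Declaration[]
}

type SimpleSymbolTable = Map<string, SimpleSymbol>

// Record a declaration in the symbol table of the current scope.
function addDeclaration(table: SimpleSymbolTable, decl: Declaration): SimpleSymbol {
  let symbol = table.get(decl.name)
  if (!symbol) {
    symbol = { name: decl.name, kinds: new Set(), declarations: [] }
    table.set(decl.name, symbol)
  }
  symbol.kinds.add(decl.kind)
  symbol.declarations.push(decl)
  return symbol
}

// The scope of a source file that contains only `var x = 1`.
const fileLocals: SimpleSymbolTable = new Map()
addDeclaration(fileLocals, { name: "x", kind: "value", pos: 0 })
```

A later lookup such as `fileLocals.get("x")` then plays the role of the checker resolving a name to its declarations.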
| 8 | + |
| 9 | +Since the binder is the first tree walk before checking, it also does some other tasks: setting up the control flow graph, |
| 10 | +as well as annotating parts of the tree that will need to be downlevelled for old ES targets. |
| 11 | + |
| 12 | +Here's an example: |
| 13 | + |
| 14 | +```ts |
| 15 | +// @Filename: main.ts |
| 16 | +var x = 1 |
| 17 | +console.log(x) |
| 18 | +``` |
| 19 | + |
| 20 | +The only declaration in this program is `var x`, which is contained in the SourceFile node for `main.ts`. |
| 21 | +Functions and classes introduce new scopes, so they are containers -- at the same time as being declarations themselves. So in: |
| 22 | + |
| 23 | +```ts |
| 24 | +function f(n: number) { |
| 25 | + const m = n + 1 |
| 26 | + return m + n |
| 27 | +} |
| 28 | +``` |
| 29 | + |
| 30 | +The binder ends up with a symbol table for `f` that contains two entries: `n` and `m`. |
| 31 | +The binder finds `n` while walking the function's parameter list, and it finds `m` while walking the block that makes up `f`'s body. |
| 32 | + |
| 33 | +Both `n` and `m` are marked as values. |
| 34 | +However, there's no problem with adding another declaration for `n`: |
| 35 | + |
| 36 | +```ts |
| 37 | +function f(n: number) { |
| 38 | + type n = string |
| 39 | + const m = n + 1 |
| 40 | + return m + n |
| 41 | +} |
| 42 | +``` |
| 43 | + |
| 44 | +Now `n` has two declarations, one type and one value. |
| 45 | +For symbols with *block-scoped* declarations, the binder disallows more than one declaration of the same kind. |
| 46 | +Block-scoped declarations include `type`, `function`, `class`, `let`, `const` and parameters; *function-scoped* declarations include `var` and `interface`. |
| 47 | +But as long as the declarations are of different kinds, they're fine. |
| 48 | + |
| 49 | +## Walkthrough |
| 50 | + |
| 51 | +```ts |
| 52 | +function f(m: number) { |
| 53 | + type n = string |
| 54 | + const n = m + 1 |
| 55 | + return m + n |
| 56 | +} |
| 57 | +``` |
| 58 | + |
| 59 | +The binder's basic tree walk starts in `bind`. |
| 60 | +There, it first encounters `f` and calls `bindFunctionDeclaration` and then `bindBlockScopedDeclaration` with `SymbolFlags.Function`. |
| 61 | +`bindBlockScopedDeclaration` has special cases for files and modules, but the default case calls `declareSymbol` to add a symbol in the current container. |
| 62 | +There is a lot of special-case code in `declareSymbol`, but the important path is to check whether the symbol table already contains a symbol with the name of the declaration -- `f` in this case. |
| 63 | +If not, a new symbol is created. |
| 64 | +If so, the old symbol's exclude flags are checked against the new symbol's flags. |
| 65 | +If they conflict, the binder issues an error. |
| 66 | + |
| 67 | +Finally, the new symbol's `flags` are added to the old symbol's `flags` (if any), and the new declaration is added to the symbol's `declarations` array. |
| 68 | +In addition, if the new declaration is for a value, it is set as the symbol's `valueDeclaration`. |
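
The path through `declareSymbol` described above can be sketched roughly as follows. This follows the prose here rather than the compiler source: the tiny `Flags` enum, the `Sym` shape with its stored `excludes` field, and the `errors` array are all assumptions made for the example, and error recovery is not modelled.

```ts
// Illustrative flag values only; the real `SymbolFlags` enum is much larger.
enum Flags {
  None = 0,
  BlockScopedVariable = 1 << 0,
  Function = 1 << 1,
  TypeAlias = 1 << 2,
  Value = BlockScopedVariable | Function,
  FunctionExcludes = Value & ~Function, // simplified: functions may merge with functions
}

interface Decl { name: string; isValue: boolean }

interface Sym {
  flags: Flags
  excludes: Flags // hypothetical field: accumulated exclude flags
  declarations: Decl[]
  valueDeclaration?: Decl
}

const errors: string[] = []

function declareSymbol(table: Map<string, Sym>, decl: Decl, includes: Flags, excludes: Flags): Sym {
  let symbol = table.get(decl.name)
  if (!symbol) {
    // No symbol with this name yet: create one.
    symbol = { flags: Flags.None, excludes: Flags.None, declarations: [] }
    table.set(decl.name, symbol)
  } else if (symbol.excludes & includes) {
    // The old symbol's exclude flags conflict with the new flags.
    errors.push(`Duplicate identifier '${decl.name}'`)
  }
  symbol.flags |= includes
  symbol.excludes |= excludes
  symbol.declarations.push(decl)
  if (decl.isValue) symbol.valueDeclaration = decl
  return symbol
}

const locals = new Map<string, Sym>()
const first = { name: "f", isValue: true }
const overload = { name: "f", isValue: true }
const clash = { name: "f", isValue: true }
declareSymbol(locals, first, Flags.Function, Flags.FunctionExcludes)
declareSymbol(locals, overload, Flags.Function, Flags.FunctionExcludes) // merges
declareSymbol(locals, clash, Flags.BlockScopedVariable, Flags.Value) // reports an error
```

In this sketch the second `function f` merges silently, while a `const f` in the same scope is reported, because `BlockScopedVariable` is in the exclude flags recorded for the function.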
| 69 | + |
| 70 | +## Containers |
| 71 | + |
| 72 | +After `declareSymbol` is done, `bind` visits the children of `f`; `f` is a container, so it calls `bindContainer` before `bindChildren`. |
| 73 | +The binder is recursive, so it pushes `f` as the new container by copying it to a local variable before walking its children. |
| 74 | +It pops `f` by copying the stored local back into `container`. |
| 75 | + |
| 76 | +The binder tracks the current lexical container as a pair of variables, `container` and `blockScopedContainer` (and `thisParentContainer` if you OOP by mistake). |
| 77 | +These are implemented as global variables managed by the binder walk, which pushes and pops containers as needed. |
| 78 | +The container's symbol table is initialised lazily, by `bindBlockScopedDeclaration`, for example. |
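
The push and pop by way of a local variable can be sketched like this. The `TreeNode` shape and its `isContainer` field are invented for the example; the real binder decides what counts as a container from the node kind and tracks more state than a single variable.

```ts
// Hypothetical miniature tree; `isContainer` stands in for the real container checks.
interface TreeNode {
  name: string
  isContainer: boolean
  children: TreeNode[]
}

let container: TreeNode | undefined // current container, shared by the whole walk
const visited: string[] = []

function bind(node: TreeNode): void {
  visited.push(`${node.name} in ${container ? container.name : "<none>"}`)
  if (node.isContainer) {
    const saveContainer = container // "push": remember the enclosing container
    container = node
    node.children.forEach(child => bind(child))
    container = saveContainer // "pop": restore it once the children are done
  } else {
    node.children.forEach(child => bind(child))
  }
}

const leaf = (name: string): TreeNode => ({ name, isContainer: false, children: [] })
const tree: TreeNode = {
  name: "main.ts",
  isContainer: true,
  children: [
    { name: "f", isContainer: true, children: [leaf("m"), leaf("n")] },
    leaf("x"),
  ],
}
bind(tree)
// visited: "main.ts in <none>", "f in main.ts", "m in f", "n in f", "x in main.ts"
```

Because the saved value lives on the call stack, the recursion itself acts as the container stack and no explicit stack data structure is needed.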
| 79 | + |
| 80 | +## Flags |
| 81 | + |
| 82 | +The table for which symbols may merge with each other is complicated, but it's implemented in a surprisingly small space using the bitflag enum `SymbolFlags`. |
| 83 | +The downside is that the bitflag system is very confusing. |
| 84 | + |
| 85 | +The basic rule is that a new declaration's *flags* may not conflict with the *excludes flags* of any previous declarations. |
| 86 | +Each kind of declaration has its own exclude flags; each one is a list of declaration kinds that cannot merge with that declaration. |
| 87 | + |
| 88 | +In the example above, `type n` is a type alias, which has flags = `SymbolFlags.TypeAlias` and excludeFlags = `SymbolFlags.TypeAliasExcludes`. |
| 89 | +The latter is an alias of `SymbolFlags.Type`, meaning generally that type aliases can't merge with anything that declares a type: |
| 90 | + |
| 91 | +```ts |
| 92 | +Type = Class | Interface | Enum | EnumMember | TypeLiteral | TypeParameter | TypeAlias |
| 93 | +``` |
| 94 | +Notice that this list includes `TypeAlias` itself, and declarations like classes and enums that also declare values. |
| 95 | +`Value` includes `Class` and `Enum` as well. |
| 96 | + |
| 97 | +Next, when the binder reaches `const n`, it uses the flag `BlockScopedVariable` and excludeFlags `BlockScopedVariableExcludes`. |
| 98 | +`BlockScopedVariableExcludes = Value`, which is a list of every kind of value declaration. |
| 99 | + |
| 100 | +```ts |
| 101 | +Value = Variable | Property | EnumMember | ObjectLiteral | Function | Class | Enum | ValueModule | Method | GetAccessor | SetAccessor |
| 102 | +``` |
| 103 | + |
| 104 | +`declareSymbol` looks up the existing excludeFlags for `n` and makes sure that `BlockScopedVariable` doesn't conflict; `(BlockScopedVariable & Type) === 0` so it doesn't. |
| 105 | +Then it *or*s the new and old flags and the new and old excludeFlags. |
| 106 | +In this example, that will prevent more value declarations because `(BlockScopedVariable & (Value | Type)) !== 0`. |
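
The arithmetic for `n` can be worked through with a few stand-in constants. The bit values below are made up and `Type` and `Value` are abbreviated to the members needed here, so treat this as a sketch of the calculation and not as the compiler's actual numbers.

```ts
// Illustrative bit values, not the real ones in the compiler's `SymbolFlags` enum.
const TypeAlias = 1 << 0
const BlockScopedVariable = 1 << 1
const Class = 1 << 2
const Type = TypeAlias | Class // abbreviated list
const Value = BlockScopedVariable | Class // abbreviated list

// State of `n` after binding `type n = string`.
let flags = TypeAlias
let excludeFlags = Type

// Binding `const n`: allowed, because no excluded bit overlaps.
const constAllowed = (excludeFlags & BlockScopedVariable) === 0

// Merge the new declaration's flags and exclude flags into the symbol.
flags |= BlockScopedVariable
excludeFlags |= Value

// A second `const n` would now overlap with the accumulated exclude flags.
const secondConstAllowed = (excludeFlags & BlockScopedVariable) === 0
```

Here `constAllowed` comes out `true` and `secondConstAllowed` comes out `false`. Note the parentheses around each `&`: in JavaScript and TypeScript `===` binds more tightly than `&`.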
| 107 | + |
| 108 | +Here's some half-baked example code which shows off what you'd write if SymbolFlags used string enums and sets instead of bitflags. |
| 109 | + |
| 110 | +```ts |
| 111 | +const existing = symbolTable.get(name) |
| 112 | +const flags = SymbolFlags[declaration.kind] // eg "Function" |
| 113 | +if (existing.excludeFlags.has(flags)) { |
| 114 | + error("Cannot redeclare", name) |
| 115 | +} |
| 116 | +existing.flags.add(flags) |
| 117 | +for (const ex of ExcludeFlags[declaration.kind]) { |
| 118 | + existing.excludeFlags.add(ex) |
| 119 | +} |
| 120 | +``` |
| 121 | + |
| 122 | +## Cross-file global merges |
| 123 | + |
| 124 | +Because the binder only binds one file at a time, the above system for merges only works with single files. |
| 125 | +For global (aka script) files, declarations can merge across files. |
| 126 | +This happens in the checker in `initializeTypeChecker`, using `mergeSymbolTable`. |
| 127 | + |
| 128 | +## Special names |
| 129 | + |
| 130 | +In `declareSymbol`, `getDeclarationName` translates certain nodes into internal names. |
| 131 | +`export=`, for example, gets translated to `InternalSymbolName.ExportEquals`. |
| 132 | + |
| 133 | +Elsewhere in the binder, function expressions without names get `"__function"`. |
| 134 | +Computed property names that aren't literals get `"__computed"`, manually. |
| 135 | + |
| 136 | +TODO: Finish this |
| 137 | + |
| 138 | +## Control Flow |
| 139 | + |
| 140 | +TODO: Missing completely |
| 141 | + |
| 142 | +## Emit flags |
| 143 | + |
| 144 | +TODO: Missing completely |
| 145 | + |
| 146 | +## Exports |
| 147 | + |
| 148 | +TODO: Missing completely |
| 149 | + |
| 150 | +## JavaScript and CommonJS |
| 151 | + |
| 152 | +JavaScript has additional types of declarations that it recognises, which fall into 3 main categories: |
| 153 | + |
| 154 | +1. Constructor functions and pre-class-field classes: assignments to `this.x` properties. |
| 155 | +2. CommonJS: assignments to `module.exports`. |
| 156 | +3. Global browser code: assignments to namespace-like object literals. |
| 157 | +4. JSDoc declarations: tags like `@type` and `@callback`. |
| 158 | + |
| 159 | +Four! Four main categories! |
| 160 | + |
| 161 | +The first three categories really aren't much different from TypeScript declarations. |
| 162 | +The main complication is that not all assignments are declarations, so there's quite a bit of code that decides which assignments should be treated as declarations. |
| 163 | +The checker is fairly resilient to non-declaration assignments being included, so it's OK if the code isn't perfect. |
| 164 | + |
| 165 | +In `bindWorker`'s `BinaryExpression` case, `getAssignmentDeclarationKind` is used to decide whether an assignment matches the syntactic requirements for declarations. |
| 166 | +Then each kind of assignment dispatches to a different binding function. |
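
As a rough illustration of "classify, then dispatch", here is a toy classifier that works on the left-hand side written as a dotted string. Everything in it is an assumption made for the example: the real `getAssignmentDeclarationKind` inspects syntax tree nodes, recognises more kinds than these, and applies more conditions.

```ts
// Hypothetical kinds; the compiler's own enum has more members than this.
type AssignmentKind = "ModuleExports" | "ExportsProperty" | "ThisProperty" | "None"

// Classify the left-hand side of `lhs = ...`, given as a dotted name.
function classifyAssignment(lhs: string): AssignmentKind {
  const parts = lhs.split(".")
  if (lhs === "module.exports") return "ModuleExports"
  if (parts.length === 3 && parts[0] === "module" && parts[1] === "exports") return "ExportsProperty"
  if (parts.length === 2 && parts[0] === "exports") return "ExportsProperty"
  if (parts.length === 2 && parts[0] === "this") return "ThisProperty"
  return "None" // an ordinary assignment, not a declaration
}

// Each kind dispatches to its own binding routine; here they only log.
const bindLog: string[] = []
function bindAssignment(lhs: string): void {
  switch (classifyAssignment(lhs)) {
    case "ModuleExports":
      bindLog.push(`export= from ${lhs}`)
      break
    case "ExportsProperty":
      bindLog.push(`export property from ${lhs}`)
      break
    case "ThisProperty":
      bindLog.push(`instance property from ${lhs}`)
      break
    case "None":
      break // nothing to declare
  }
}

const assignments = ["module.exports", "module.exports.foo", "this.x", "a.b"]
assignments.forEach(lhs => bindAssignment(lhs))
```

The point of the sketch is the shape of the code: a purely syntactic test decides the kind, and only then does kind-specific binding run.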
| 167 | + |
| 168 | +### Global namespace creation code |
| 169 | + |
| 170 | +In addition to CommonJS, JS also supports creating global namespaces by assignments of object literals, functions and classes to global variables. |
| 171 | +This code is very complicated and is *probably* only ever used by Closure code bases, so it might be possible to remove it someday. |
| 172 | + |
| 173 | +```js |
| 174 | +var Namespace = {} |
| 175 | +Namespace.Mod1 = {} |
| 176 | +Namespace.Mod2 = function () { |
| 177 | + // callable module! |
| 178 | +} |
| 179 | +Namespace.Mod2.Sub1 = { |
| 180 | + // actual contents |
| 181 | +} |
| 182 | +``` |
| 183 | + |
| 184 | +TODO: This is unfinished. |
| 185 | + |
| 186 | +### JSDoc declarations |
| 187 | + |
| 188 | +TODO: This is unfinished. |
| 189 | + |
| 190 | +### Conflicting object literal export assignments |
| 191 | + |
| 192 | +One particularly complex case of CommonJS binding occurs when there is an object literal export assignment in the same module as `module.exports` assignments: |
| 193 | + |
| 194 | +```js |
| 195 | +module.exports = { |
| 196 | + foo: function() { return 1 }, |
| 197 | + bar: function() { return 'bar' }, |
| 198 | + baz: 12, |
| 199 | +} |
| 200 | +if (isWindows) { |
| 201 | + // override 'foo' with Windows-specific version |
| 202 | + module.exports.foo = function () { return 11 } |
| 203 | +} |
| 204 | +``` |
| 205 | + |
| 206 | +In this case, the desired exports of the file are `foo, bar, baz`. |
| 207 | +Even though `foo` is declared twice, it should have one export with two declarations. |
| 208 | +The type should be `() => number`, though that's the responsibility of the checker. |
| 209 | + |
| 210 | +In fact, this structure is too complicated to build in the binder, so the checker produces it through merges, using the same merge infrastructure it uses for cross-file global merges. |
| 211 | +The binder treats this pretty straightforwardly; it calls `bindModuleExportsAssignment` for `module.exports = {...}`, which creates a single `export=` export. |
| 212 | +Then it calls `bindExportsPropertyAssignment` for `module.exports.foo = ...`, which creates a `foo` export. |
| 213 | + |
| 214 | +Having `export=` with other exports is impossible with ES module syntax, so the checker detects it and copies all the top-level exports into the `export=`. |
| 215 | +In the checker, `resolveExternalModuleSymbol` returns either an entire module's exports, or all the exports in an `export=`. |
| 216 | +In the combined CommonJS case we're discussing, `getCommonJsExportEquals` also checks whether a module has exports *and* `export=`. |
| 217 | +If it does, it copies each of the top-level exports into the `export=`. |
| 218 | +If a property with the same name already exists in the `export=`, the two are merged with `mergeSymbol`. |
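
The copy-and-merge step might look something like the sketch below. `ExportSymbol`, `mergeInto` and `combineExports` are invented names with deliberately simple behaviour; the checker's `mergeSymbol` does considerably more than append declarations.

```ts
// Hypothetical, simplified symbol: a name, its declarations, and its members.
interface ExportSymbol {
  name: string
  declarations: string[]
  members: Map<string, ExportSymbol> // properties of an object-literal `export=`
}

function makeSymbol(name: string, declaration: string): ExportSymbol {
  return { name, declarations: [declaration], members: new Map() }
}

// Stand-in for `mergeSymbol`: append the source's declarations to the target.
function mergeInto(target: ExportSymbol, source: ExportSymbol): void {
  target.declarations.push(...source.declarations)
}

// Copy every top-level export other than `export=` into the `export=` symbol.
function combineExports(moduleExports: Map<string, ExportSymbol>): Map<string, ExportSymbol> {
  const exportEquals = moduleExports.get("export=")
  if (!exportEquals) return moduleExports // plain module: nothing to combine
  const members = exportEquals.members
  moduleExports.forEach((symbol, exportName) => {
    if (exportName === "export=") return
    const existing = members.get(exportName)
    if (existing) mergeInto(existing, symbol)
    else members.set(exportName, symbol)
  })
  return members
}

// What the binder produced for the example module above.
const exportEquals = makeSymbol("export=", "module.exports = {...}")
exportEquals.members.set("foo", makeSymbol("foo", "foo: function() { return 1 }"))
exportEquals.members.set("bar", makeSymbol("bar", "bar: function() { return 'bar' }"))
exportEquals.members.set("baz", makeSymbol("baz", "baz: 12"))

const fileExports = new Map<string, ExportSymbol>()
fileExports.set("export=", exportEquals)
fileExports.set("foo", makeSymbol("foo", "module.exports.foo = function () { return 11 }"))

const combined = combineExports(fileExports)
```

After the call, `combined` has the three entries `foo`, `bar` and `baz`, and `foo` carries two declarations, matching the "one export with two declarations" outcome described earlier.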
| 219 | + |
| 220 | +Subsequent code in the checker that doesn't use `resolveExternalModuleSymbol` (is there any?) has to ignore the `export=`, since its contents are now just part of the module. |