Status of runtime representation #6077

Closed
cristianoc opened this issue Mar 16, 2023 · 5 comments
@cristianoc
Collaborator

cristianoc commented Mar 16, 2023

This issue describes the current status of the runtime representation of ReScript constructs in JS.
Most constructs map cleanly to JS with some exceptions:

  1. variants e.g. D(1,2) are represented as {TAG: /* D */1, _0: 1, _1: 2}.
  2. curried functions e.g. a type (int, int)=>int could be a function with 2 args, or a function with 1 arg returning a function with 1 arg.
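To make the two exceptions concrete, here is a rough TypeScript sketch of what a JS consumer might see. Only the `{TAG, _0, _1}` shape comes from the description above; the names `VariantD`, `Uncurried`, `Curried` and `callAdd` are hypothetical, and the arity check is just one way a caller could cope with the ambiguity.

```typescript
// Hypothetical names throughout; only the {TAG, _0, _1} shape is taken from the issue.

// 1. D(1, 2) as it appears at runtime today.
type VariantD = { TAG: 1; _0: number; _1: number };
const d: VariantD = { TAG: 1, _0: 1, _1: 2 };

// 2. A ReScript (int, int) => int may show up in either of these shapes.
type Uncurried = (a: number, b: number) => number;
type Curried = (a: number) => (b: number) => number;

const addUncurried: Uncurried = (a, b) => a + b;
const addCurried: Curried = (a) => (b) => a + b;

// A cautious JS caller has to handle both shapes; here arity is used as a heuristic.
function callAdd(f: Uncurried | Curried, a: number, b: number): number {
  if (f.length >= 2) {
    return (f as Uncurried)(a, b);
  }
  return (f as Curried)(a)(b);
}
```

Both `callAdd(addUncurried, 1, 2)` and `callAdd(addCurried, 1, 2)` give the same result, but the caller needs the extra branch only because the representation is ambiguous.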

One implication is that these representations can be cumbersome to use from JS.
Another implication concerns the type correspondence with genType: to get a clean mapping, a runtime conversion needs to be performed. Sometimes that conversion is missing even with genType, e.g. in recursive types with variants, where an appropriate recursive conversion function would need to be generated (see the issues about genType and linked lists).

If one could have clean representations, then FFI would be simpler, and typed FFI with genType would require no runtime whatsoever (see #6099). A further implication is that one could try to move towards just generating .d.ts files, which fit much more naturally in a TS project.

The obstacles to cleaning up the remaining cases are the following:

switch x {
  | A => 10
  | B | C | D => 0
  | E => 10
  }

generates

if (x > 3 || x < 1) {
    return 10;
  } else {
    return 0;
  }

This relies on the fact that tags are numeric values, and it is not obvious to the user how the output maps to the source program.
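As a sketch of why this matters, the following TypeScript contrasts the interval test with what a string-tagged version would need. The numbering A = 0 through E = 4 is an assumption (declaration order) that is consistent with the generated code above; `matchNumeric` and `matchString` are hypothetical names.

```typescript
// Assumed numbering: A = 0, B = 1, C = 2, D = 3, E = 4 (declaration order).
type NumTag = 0 | 1 | 2 | 3 | 4;

// Mirrors the generated code: one interval test separates B | C | D from A and E.
function matchNumeric(x: NumTag): number {
  if (x > 3 || x < 1) {
    return 10;
  }
  return 0;
}

// With string tags the interval trick is unavailable; cases are compared explicitly.
type StrTag = "A" | "B" | "C" | "D" | "E";

function matchString(x: StrTag): number {
  switch (x) {
    case "A":
    case "E":
      return 10;
    default:
      return 0;
  }
}
```

The two functions agree on every case; the string version is easier to relate to the source, at the cost of losing the range comparison.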

cristianoc added a commit that referenced this issue Mar 19, 2023
There are several ways in which the compilation of variants relies on tags being integers:
- Sometimes it uses truthiness checks: `if x` stands for `x != 0`, i.e. not the first variant (tag 0)
- Sometimes it uses intervals, e.g. `if (x > 3 || x < 1)`

Care is required not to change the compilation of variants with special compilation:
- true and false
- lists have constructors "[]" and "::"
- options have a specific definition for "Some" and "None"

See #6077
@glennsl
Contributor

glennsl commented Mar 19, 2023

It's unclear to me what a clean representation of variants would be, and what tags could be if not ints. Could you expand a bit on this @cristianoc?

It also seems that an implication of this might be that pattern matching would be less optimized?

@zth
Collaborator

zth commented Mar 21, 2023

One of the ideas here is that this:

type stuff = User({name: string, age: int}) | Pet({age: int})

let user = User({name: "Hello", age: 35})

...which now compiles to this:

var user = {
  TAG: /* User */0,
  name: "Hello",
  age: 35
};

...could instead compile to this:

var user = {
  kind: "User",
  name: "Hello",
  age: 35
};

We call it "kind" here, but it could easily be called something else (maybe still TAG by default), and more notably it could be configurable. This would allow for:

  1. Zero cost bindings to tagged unions in JS/TS. This would make a lot of APIs much more accessible. Any AST-like structure represented in JS comes to mind for example, where you now need to do (potentially deep) runtime conversion to use it idiomatically from ReScript. Tagged unions are getting more and more popular in TS, and having a zero cost way to map to those would be very useful.
  2. Debugging. As of now, the tag name is in a comment in the compiled source, and that's good. But that comment isn't visible when you log the actual value, so you're still confronted with a TAG: 1 and have to go looking in your compiled sources to understand which constructor it is.
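For reference, this is roughly what the TS side of such a zero-cost binding could look like. It is a sketch only: the `Stuff` type and `describeStuff` function are hypothetical, and `kind` is the illustrative field name used above, not a decided one.

```typescript
// Hypothetical TS-side type matching the proposed output for `stuff`.
type Stuff =
  | { kind: "User"; name: string; age: number }
  | { kind: "Pet"; age: number };

// TS narrows on the string discriminant, so no conversion layer is needed.
function describeStuff(s: Stuff): string {
  switch (s.kind) {
    case "User":
      return `${s.name} (${s.age})`;
    case "Pet":
      return `pet aged ${s.age}`;
  }
}

// Same object literal as the proposed compiled output above.
const user: Stuff = { kind: "User", name: "Hello", age: 35 };
console.log(describeStuff(user)); // logs: Hello (35)
```

Logging `user` directly would also show `kind: "User"` rather than a numeric tag, which is the debugging benefit from point 2.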

This would likely also open up being able to configure the runtime representation of payload-less variants, like:

type fetchPolicy = | @as("store-or-network") StoreOrNetwork | @as("store-only") Store

This is also quite important, because right now we're implicitly pushing people towards polyvariants whenever they need control over the runtime representation for zero cost bindings. But a polyvariant is typically the wrong choice (structural, can't add documentation, worse error messages, etc.), and most users would be better off with regular variants, which they can't or won't use because they'd need to do runtime conversion.
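A minimal sketch of the kind of JS/TS API this would bind to directly, assuming the `@as` strings become the runtime values. `FetchPolicy` and `runQuery` are hypothetical names, not part of any real library.

```typescript
// Hypothetical JS API that accepts a string-literal union.
type FetchPolicy = "store-or-network" | "store-only";

function runQuery(policy: FetchPolicy): string {
  return policy === "store-only"
    ? "read from store"
    : "read from store, then network";
}

// Under the proposal, StoreOrNetwork would simply be this string at runtime,
// so it could be passed to runQuery without any conversion.
const storeOrNetwork: FetchPolicy = "store-or-network";
console.log(runQuery(storeOrNetwork));
```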

@glennsl
Contributor

glennsl commented Mar 21, 2023

Thanks for the explanation @zth! I think this makes a lot of sense for all the stated reasons. I do still worry a little bit about the performance implications though. Even if it's rarely needed, it would be nice to be able to opt in to the old representation and the optimizing pattern match compiler.

@cristianoc cristianoc added this to the v11.0 milestone Apr 2, 2023
@Mepy

Mepy commented Apr 3, 2023

For variants, D(1, 2) being represented as [1/* D */, 1, 2] might be shorter for serialization, e.g. JSON.

[1, 1, 2]
{TAG: 1, _0: 1, _1: 2}
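The size difference can be checked with plain `JSON.stringify`; this is only an illustration of the two shapes above, not a benchmark.

```typescript
// Serialize both candidate representations of D(1, 2).
const asArray = JSON.stringify([1, 1, 2]);
const asObject = JSON.stringify({ TAG: 1, _0: 1, _1: 2 });

console.log(asArray, asArray.length);   // [1,1,2] 7
console.log(asObject, asObject.length); // {"TAG":1,"_0":1,"_1":2} 23
```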

Weighing readability against efficiency, I agree with making this configurable, much like the @as decorator, but extended to variants:

type node = {@as("0") width : int , @as("1") height : int}
// extension for variants:
// type node = @variant-as("array") | C(int) | D(int, int)

As for genType, in my opinion, generating .d.ts might be better. Indeed, I usually use @genType.opaque to generate only types in .gen.ts and import the actual implementation code from .bs.js.
Note the genType output for variant tags:

// .res
type node = | C(int) | D(int, int)
// gen.ts
export type node = 
    { tag: "C"; value: number }
  | { tag: "D"; value: [number, number] };

Instead, we could use a TS enum for the tag:

// .gen.ts
enum node_TAG {
  C = 0,
  D = 1,
}
type encode = 0 | 1
type node = [encode, number] | [encode, number, number]
// TypeScript-side elimination for variants (`n` is a value of type `node`)
const encode = n.shift()
switch (encode) {
  case node_TAG.C: {
    const val = n[0]
    // ...
    break
  }
  case node_TAG.D: {
    const val0 = n[0]
    const val1 = n[1]
    // ...
    break
  }
}

@cristianoc
Collaborator Author

Keep in mind the goal here is to interface with TypeScript and existing APIs. So even though an array might be more efficient, the APIs exist already, and the best one can do is to make it easy to consume them.

That said, the untagged variants proposal adds one more way to control the runtime representation.
E.g. D((1,2)) unboxed is exactly [1,2].
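As a rough illustration of what consuming such a value from TS might look like: the source type in the comment, the `UntaggedNode` type and the `sum` function are all assumptions made for this sketch, not taken from the untagged variants proposal itself.

```typescript
// Assumed source type (hypothetical): @unboxed type node = C(int) | D((int, int))
// With no tag field, C(5) would be just 5 and D((1, 2)) would be just [1, 2].
type UntaggedNode = number | [number, number];

function sum(n: UntaggedNode): number {
  // The case is recovered from the value's runtime shape instead of a TAG field.
  return Array.isArray(n) ? n[0] + n[1] : n;
}

console.log(sum([1, 2]));
console.log(sum(5));
```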
