Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the legacy and current proposals mutually exclusive intra-module #344

Open
syg opened this issue Mar 20, 2025 · 17 comments
Open

Make the legacy and current proposals mutually exclusive intra-module #344

syg opened this issue Mar 20, 2025 · 17 comments

Comments

@syg
Copy link

syg commented Mar 20, 2025

Since the legacy proposal is deprecated, we should minimize likelihood of prolonging its life in the ecosystem. Within the same wasm module, use of the new proposal should be mutually exclusive with use of the old proposal (i.e. should not validate).

For users concatenating together multiple Wasm modules, they'd need to be mindful to transpile legacy EH stuff to exnref. That's a good thing, so this would be a stick to incentivize that.

@kmiller68
Copy link

I'm for this change too. If I understand correctly, binaryen already has a feature to do the wasm -> wasm transform to convert legacy exceptions to exnref (https://github.com/WebAssembly/binaryen/blob/main/src/passes/TranslateEH.cpp). It seems only mildly inconvenient to use that as part of the build/bundling process for most apps today.

Given the history of the web, there's a real possibility that allowing mixed use makes it shockingly likely that legacy exceptions will remain forever. There's a huge list of "deprecated" features that end up surviving because they were too permissively allowed.

@titzer
Copy link
Contributor

titzer commented Mar 21, 2025

I like this idea as well. We definitely need to make sure the deprecation plan succeeds.

@sbc100
Copy link
Member

sbc100 commented Mar 21, 2025

Sounds reasonable, but note that "concatenating together multiple Wasm modules" is not really a thing that I'm aware that folks do today. At least the common web-facing toolchain like emscripten and rust don't do this and I don't think bundlers do it either.

@tlively
Copy link
Member

tlively commented Mar 21, 2025

Given the history of the web, there's a real possibility that allowing mixed use makes it shockingly likely that legacy exceptions will remain forever.

@kmiller68, can you explain why mixed use seems worse than unmixed use? I also want to +1 @sbc100's point that mixed use (in the same module) is unlikely to occur without new developments from toolchains.

Disallowing use in the same module also does not mean that legacy and standard EH could never interact. One module could throw, catch, and rethrow a legacy exception that could be caught in another module using new EH or vice versa. Testing and fuzzing that the interaction is correct and safe is much easier if the throwing and catching can be in the same module.

My overall view is that this restriction would not meaningfully move deprecation forward, so it would just be a self-imposed annoyance with no benefit.

@kmiller68
Copy link

@kmiller68, can you explain why mixed use seems worse than unmixed use? I also want to +1 @sbc100's point that mixed use (in the same module) is unlikely to occur without new developments from toolchains.

Disallowing use in the same module also does not mean that legacy and standard EH could never interact. One module could throw, catch, and rethrow a legacy exception that could be caught in another module using new EH or vice versa. Testing and fuzzing that the interaction is correct and safe is much easier if the throwing and catching can be in the same module.

I think it's mostly that there is almost zero cost on the browser side (at least for us) to verify you don't do both. I agree this doesn't solve all interactions between legacy and new EH but I disagree it makes testing any harder. Browsers should be testing/fuzzing inter-module interactions just as much as intra-module, bugs can certainly exist inter-module that don't exist within a single instance (I've certainly written them). Given, as you said, there's no apps/toolchains using both today, is this even a burden to them?

IMO, it seems like the cost benefit trade off is something like:

  • VM short term
    • Cost:
      • Slightly harder to test interactions
      • Add some extra logic to do !(newEH && oldEH) (~10-20 lines of code?)
    • Benefit: None
  • VM long term
    • Cost: None
    • Benefit: Much harder for some bespoke tool to decide they like mixing EH versions preventing deprecation. *
  • Toolchains short term
    • Cost: Need to use Binaryen to remove legacy EH (or reimplement the pass)
    • Benefit: None
  • Toolchains long term
    • Cost: None
    • Benefit: None

I'm happy on the VM side to eat those costs for the long term potential benefit of removing legacy EH support. Even if the probability of some toolchain using both EH proposals is low, the maintenance cost of supporting both forever is worth it (especially as wasm's complexity matrix gets bigger). On the other hand, as you said, no toolchain is mixing today so the cost to them seems like it's effectively zero.

  • e.g for code size reasons. I don't know if you'd actually save code by using old EH vs new EH in different contexts but maybe there's cases?

@tlively
Copy link
Member

tlively commented Mar 21, 2025

Ok, but even if you consider the costs to be negligible, I still think the benefit is also negligible. And in the unfortunate but possible future where we give up on trying to unship legacy EH, this mutual exclusion behavior would be pure downside for future bundlers that are able to merge Wasm modules.

@fgmccabe
Copy link

@tlively Even if the current generation 'gives up' on trying to deprecate the old EH, a follow-on generation might want to 'try again'.

@syg
Copy link
Author

syg commented Mar 21, 2025

The benefit isn't negligible! The potential upside is it might move the needle on unshipping legacy EH, which is non-neglible. Of course it isn't guaranteed, but the expected value seems very positive to me.

And in the unfortunate but possible future where we give up on trying to unship legacy EH, this mutual exclusion behavior would be pure downside for future bundlers that are able to merge Wasm modules.

It just doesn't follow that if you can't unship a feature, you then also undeprecate it and embrace its use. Historically there are many, many features on the web platform we can't unship and grudgingly live with. It doesn't mean that we try to make it easy for them to be used. The opposite, in fact! We try to build new carrots to tempt people away from their use, add warnings, etc.

@kmiller68
Copy link

Ok, but even if you consider the costs to be negligible, I still think the benefit is also negligible.

I think we disagree on this part. And I agree with @syg that it's non-negligible.

And in the unfortunate but possible future where we give up on trying to unship legacy EH, this mutual exclusion behavior would be pure downside for future bundlers that are able to merge Wasm modules.

If we give up on deprecating legacy EH we could always reverse this choice and allow both. That would almost certainly be web compatible (unless we repurpose the legacy ops for something else but that seems very unlikely before full deprecation).

@eqrion
Copy link
Contributor

eqrion commented Mar 24, 2025

I'm sympathetic to doing whatever we can to discourage legacy EH usage. I'm not convinced this would move the needle much, but it's possible it could help.

Implementing this in SpiderMonkey would be fairly easy, it'd re-use most of the stuff we have for use counters. The one difficulty would be if we needed to have precise error offsets when the usage comes from different functions that are compiled in parallel. In that case we'd probably need to report the error offset as pointing at the beginning of a function or something like that.

@tlively
Copy link
Member

tlively commented Mar 24, 2025

After thinking about this more, my objection to this change is purely that it will not be effective in helping deprecation of legacy EH. I still believe that there is almost no chance that this proposed error would ever be thrown because no toolchain will produce modules mixing legacy and standard EH. To do so, they would have to either produce mixed usage from the start, which is implausible because toolchain authors know legacy EH is deprecated, or they would have to merge modules that use both EH systems. To my knowledge, no toolchain merges modules like this today, and if they did, they would likely use Binaryen's wasm-merge to do it. We are in control of Binaryen and can make wasm-merge automatically translate legacy to standard EH, so these toolchains would have to use a non-Binaryen merge tool to encounter this error. And they would have to do this in the period of time before the deprecation succeeds or fails for it to be helpful. The chances of this all coming together to make this change useful aren't zero, but it seems like an awfully big stretch. I don't think it's sufficient motivation to change the status quo.

Instead, I would propose that we consider a much larger change that actually has a chance of moving the needle on deprecation. Specifically, we can make it so that a WebAssembly store (i.e. a JS agent in the JS embedding) exclusively supports legacy EH or any new feature we add to WebAssembly from now on, including standard EH. Once a legacy EH instruction is validated in the store, no module using any new Wasm features will pass validation. In the other direction, once any new Wasm features are validated, no module using legacy EH would validate.

The carrot is much sweeter here, and gets sweeter over time; the payoff for ditching old EH is the ability to use all new WebAssembly features we have shipped since EH. The stick is also more imposing, since it's not limited to intra-module mixing. The immediate workaround for developers that encounter the errors is the same: remove the use of old EH either by recompiling or by translation. The difference is that it is actually plausible that developers might encounter the error. If that seems concerning, remember that if nobody is going to see the error, then the error is not helping us move deprecation along.

@rossberg
Copy link
Member

@tlively, I like that.

@eqrion
Copy link
Contributor

eqrion commented Mar 25, 2025

Instead, I would propose that we consider a much larger change that actually has a chance of moving the needle on deprecation. Specifically, we can make it so that a WebAssembly store (i.e. a JS agent in the JS embedding) exclusively supports legacy EH or any new feature we add to WebAssembly from now on, including standard EH. Once a legacy EH instruction is validated in the store, no module using any new Wasm features will pass validation. In the other direction, once any new Wasm features are validated, no module using legacy EH would validate.

@tlively I forget how a JS agent relates to HTML concepts. If you were to embed an iframe (same-site or cross-origin) on a page and that iframe uses legacy EH, would that conflict with the top-level document?

I agree that for this to be effective, developers actually need to run into this in practice, and therefore be motivated to upgrade. But if you're a developer independently using two unrelated scripts that independently upgrade to use conflicting features, you're not in a place where you can actually fix it. The issue is happening upstream and you only have very indirect control over that. Even more so if the problematic code is running in an iframe that you don't entirely control.

Instead of doing this 'feature freeze' on the whole store, could we instead limit it to within a single module?

@dschuff
Copy link
Member

dschuff commented Mar 25, 2025

+1 to @eqrion, I like the idea of expanding this exclusivity generally, but we'd have to make sure the developers this affects have control over the situation. Annoying cross-origin issues have already resulted in a lot of headaches (and kept some users from using threads), and I'd hate to see a repeat of that.

@tlively
Copy link
Member

tlively commented Mar 25, 2025

@tlively I forget how a JS agent relates to HTML concepts.

TBH, I don't know the details, but the JS spec starts with "Each agent has an associated store." The linked ECMAScript section doesn't really clear things up, but it does have this note:

An agent is a specification mechanism and need not correspond to any particular artefact of an ECMAScript implementation.

So maybe making the boundaries of an agent semantically visible via these errors is not such a great idea. Disallowing legacy EH and new features mixing only in the same module would avoid this problem, so that's another reason to prefer that approach. I'd be fine with this, although in the short term it will be equivalent to the original proposed change and therefore I don't believe it will be much help.

@tlively
Copy link
Member

tlively commented Mar 25, 2025

+1 to @eqrion, I like the idea of expanding this exclusivity generally, but we'd have to make sure the developers this affects have control over the situation. Annoying cross-origin issues have already resulted in a lot of headaches (and kept some users from using threads), and I'd hate to see a repeat of that.

In principle the developer could run wasm-opt to translate away the legacy EH, even for upstream Wasm modules. Perhaps the error message could even point to this solution. In practice I can see several reasons why this wouldn't be sufficient. The Wasm module might be buried deep in some upstream dependency that the developer doesn't have visibility into, or it might be a hard-coded UInt8 array in a JS file.

@syg
Copy link
Author

syg commented Mar 25, 2025

Yeah, I'd probably avoid tying the error to agent boundaries if we can. IMO developers should only care about agents and agent clusters via shared memory access, not other features.

I forget how a JS agent relates to HTML concepts.

An agent is basically an idealized thread. Same-site, same-origin iframes must get the same agent. I think same-site, cross-origin iframes get a different agent per spec, but this doesn't manifest as an actual separate hw thread running in parallel in some browsers. Add Origin-Agent-Cluster, and then same-site, cross-origin iframes actually get a different process (= agent cluster).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants