Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WebAssemblyLowerEmscriptenEHSjLj] Avoid setting import_name where possible #128564

Merged
merged 1 commit into from
Feb 26, 2025

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Feb 24, 2025

This change effectively reverts 296ccef (https://reviews.llvm.org/D77192)

Most of these symbols are just normal C symbols that get imported from wither libcompiler-rt or from emscripten's JS library code. In most cases it should not be necessary to give them explicit import names.

The advantage of doing this is that we can wasm-ld can/will fail with a useful error message when these symbols are missing. As opposed to today where it will simply import them and defer errors until later (when they are less specific).

@llvmbot
Copy link
Member

llvmbot commented Feb 24, 2025

@llvm/pr-subscribers-backend-webassembly

Author: Sam Clegg (sbc100)

Changes

Most of these symbols are just normal C symbols that get imported from wither libcompiler-rt or from emscripten's JS library code. In most cases it should not be necessary to give them explicit import names.

The advantage of doing this is that we can wasm-ld can/will fail with a useful error message when these symbols are missing. As opposed to today where it will simply import them and defer errors until later (when they are less specific).


Full diff: https://github.com/llvm/llvm-project/pull/128564.diff

1 Files Affected:

  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (+17-12)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
index c60cf69c30104..6eb61bd4b3159 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
@@ -440,9 +440,14 @@ static std::string getSignature(FunctionType *FTy) {
   return Sig;
 }
 
+static Function *getFunction(FunctionType *Ty, const Twine &Name,
+                                       Module *M) {
+  return Function::Create(Ty, GlobalValue::ExternalLinkage, Name, M);
+}
+
 static Function *getEmscriptenFunction(FunctionType *Ty, const Twine &Name,
                                        Module *M) {
-  Function* F = Function::Create(Ty, GlobalValue::ExternalLinkage, Name, M);
+  Function* F = getFunction(Ty, Name, M);
   // Tell the linker that this function is expected to be imported from the
   // 'env' module.
   if (!F->hasFnAttribute("wasm-import-module")) {
@@ -927,11 +932,11 @@ bool WebAssemblyLowerEmscriptenEHSjLj::runOnModule(Module &M) {
   // exception handling and setjmp/longjmp handling
   ThrewGV = getGlobalVariable(M, getAddrIntType(&M), TM, "__THREW__");
   ThrewValueGV = getGlobalVariable(M, IRB.getInt32Ty(), TM, "__threwValue");
-  GetTempRet0F = getEmscriptenFunction(
-      FunctionType::get(IRB.getInt32Ty(), false), "getTempRet0", &M);
-  SetTempRet0F = getEmscriptenFunction(
-      FunctionType::get(IRB.getVoidTy(), IRB.getInt32Ty(), false),
-      "setTempRet0", &M);
+  GetTempRet0F = getFunction(FunctionType::get(IRB.getInt32Ty(), false),
+                             "getTempRet0", &M);
+  SetTempRet0F =
+      getFunction(FunctionType::get(IRB.getVoidTy(), IRB.getInt32Ty(), false),
+                  "setTempRet0", &M);
   GetTempRet0F->setDoesNotThrow();
   SetTempRet0F->setDoesNotThrow();
 
@@ -942,13 +947,13 @@ bool WebAssemblyLowerEmscriptenEHSjLj::runOnModule(Module &M) {
     // Register __resumeException function
     FunctionType *ResumeFTy =
         FunctionType::get(IRB.getVoidTy(), IRB.getPtrTy(), false);
-    ResumeF = getEmscriptenFunction(ResumeFTy, "__resumeException", &M);
+    ResumeF = getFunction(ResumeFTy, "__resumeException", &M);
     ResumeF->addFnAttr(Attribute::NoReturn);
 
     // Register llvm_eh_typeid_for function
     FunctionType *EHTypeIDTy =
         FunctionType::get(IRB.getInt32Ty(), IRB.getPtrTy(), false);
-    EHTypeIDF = getEmscriptenFunction(EHTypeIDTy, "llvm_eh_typeid_for", &M);
+    EHTypeIDF = getFunction(EHTypeIDTy, "llvm_eh_typeid_for", &M);
   }
 
   // Functions that contains calls to setjmp but don't have other longjmpable
@@ -988,14 +993,14 @@ bool WebAssemblyLowerEmscriptenEHSjLj::runOnModule(Module &M) {
       // Register emscripten_longjmp function
       FunctionType *FTy = FunctionType::get(
           IRB.getVoidTy(), {getAddrIntType(&M), IRB.getInt32Ty()}, false);
-      EmLongjmpF = getEmscriptenFunction(FTy, "emscripten_longjmp", &M);
+      EmLongjmpF = getFunction(FTy, "emscripten_longjmp", &M);
       EmLongjmpF->addFnAttr(Attribute::NoReturn);
     } else { // EnableWasmSjLj
       Type *Int8PtrTy = IRB.getPtrTy();
       // Register __wasm_longjmp function, which calls __builtin_wasm_longjmp.
       FunctionType *FTy = FunctionType::get(
           IRB.getVoidTy(), {Int8PtrTy, IRB.getInt32Ty()}, false);
-      WasmLongjmpF = getEmscriptenFunction(FTy, "__wasm_longjmp", &M);
+      WasmLongjmpF = getFunction(FTy, "__wasm_longjmp", &M);
       WasmLongjmpF->addFnAttr(Attribute::NoReturn);
     }
 
@@ -1009,11 +1014,11 @@ bool WebAssemblyLowerEmscriptenEHSjLj::runOnModule(Module &M) {
       FunctionType *FTy = FunctionType::get(
           IRB.getVoidTy(), {SetjmpFTy->getParamType(0), Int32Ty, Int32PtrTy},
           false);
-      WasmSetjmpF = getEmscriptenFunction(FTy, "__wasm_setjmp", &M);
+      WasmSetjmpF = getFunction(FTy, "__wasm_setjmp", &M);
 
       // Register __wasm_setjmp_test function
       FTy = FunctionType::get(Int32Ty, {Int32PtrTy, Int32PtrTy}, false);
-      WasmSetjmpTestF = getEmscriptenFunction(FTy, "__wasm_setjmp_test", &M);
+      WasmSetjmpTestF = getFunction(FTy, "__wasm_setjmp_test", &M);
 
       // wasm.catch() will be lowered down to wasm 'catch' instruction in
       // instruction selection.

@sbc100 sbc100 requested a review from aheejin February 24, 2025 19:54
Copy link

github-actions bot commented Feb 24, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@aheejin
Copy link
Member

aheejin commented Feb 25, 2025

Does the rationale in 296ccef not apply anymore? I guess I can understand why it's unnecessary in case of C symbol + static linking but

  • Is it fine in case of C symbols with dynamic linking?
  • Is it fine in case of JS symbols with static/dynamic linking?

Now we have only two functions that are labeled with wasm-import-module: invokes and __cxa_matching_catchs. Are they different from other JS functions?

@sbc100
Copy link
Collaborator Author

sbc100 commented Feb 25, 2025

Does the rationale in 296ccef not apply anymore? I guess I can understand why it's unnecessary in case of C symbol + static linking but

  • Is it fine in case of C symbols with dynamic linking?
  • Is it fine in case of JS symbols with static/dynamic linking?

Now we have only two functions that are labeled with wasm-import-module: invokes and __cxa_matching_catchs. Are they different from other JS functions?

Yes the invoke_xxx and __cxa_matching_catch_xxx function are still explicitly marked with import-name and import-module.

This is because their names are dynamic and we have no way for the static linker to declare their existence in a JS library (at least not without enumerating all of them somehow, which seems hard/impossible).

One way of thinking about this is that explicitly marking these symbols are imported is something we should avoid, if we can. But for dynamic symbol names I don't know of any way to avoid it.

@sbc100
Copy link
Collaborator Author

sbc100 commented Feb 25, 2025

Thanks for pointing out 296ccef. I think I will use the to make this PR more explict.

…ssible

This change effectively reverts 296ccef (https://reviews.llvm.org/D77192)

Most of these symbols are just normal C symbols that get imported from
wither libcompiler-rt or from emscripten's JS library code.  In most
cases it should not be necessary to give them explicit import names.

The advantage of doing this is that we can wasm-ld can/will fail with
a useful error message when these symbols are missing.  As opposed to
today where it will simply import them and defer errors until later
(when they are less specific).
@sbc100
Copy link
Collaborator Author

sbc100 commented Feb 25, 2025

Updated to reflect that this is basically a revert of 296ccef.

@sbc100
Copy link
Collaborator Author

sbc100 commented Feb 25, 2025

Gentle ping (I've got an emscripten-side change waiting on this). BTW I ran the whole other and core tests suites in emscripten with this patch applied.

Module *M) {
Function* F = Function::Create(Ty, GlobalValue::ExternalLinkage, Name, M);
static Function *getFunction(FunctionType *Ty, const Twine &Name, Module *M) {
return Function::Create(Ty, GlobalValue::ExternalLinkage, Name, M);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function really an improvement now over just calling Function::Create directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to switch to that if you prefer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it avoids typing GlobalValue::ExternalLinkage a lot?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I guess I could go either way on that. LGTM for whatever you want to do.

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.

Chatted with @sbc100 offline, and answering my own question: The reason that we needed this years ago but not anymore is because we generate libemscripten_js_symbols.so file that supplies all JS symbol names so wasm-ld knows they should be imported: https://github.com/emscripten-core/emscripten/blob/71e31f4cbbfd922d16fe814769590be4342d1e17/tools/building.py#L126-L136

@sbc100 sbc100 merged commit 147d9d6 into llvm:main Feb 26, 2025
11 checks passed
@sbc100 sbc100 deleted the EmscriptenEHSjLj branch February 26, 2025 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants