-
Notifications
You must be signed in to change notification settings - Fork 69
Modify AsmJS SizeType to match Wasm #10
Modify AsmJS SizeType to match Wasm #10
Conversation
AsmJS uses unsigned ints for size_t, which causes some libcxx functions to have different symbols in AsmJS than with Wasm, which uses unsigned longs. In principle we want AsmJS and Wasm to have the same ABI, so this serves as a first step toward ensuring that they match.
What is the benefit for making the change this way versus the reverse? |
The two main reasons I know of are 1) TargetInfo defaults to UnsignedLong, so it seems more consistent with more targets, although I don't know how to easily tell how many targets do what. 2) For 64-bit wasm, we'll need to use long (or long long), so keeping it long makes that consistent across asmjs, wasm32, and wasm64, which means they can all have the same symbols. I'm not fully convinced either of those are particularly compelling. What are some benefits to the reverse? |
Reasonable points, yeah. A downside is that it means current emscripten users need to rebuild their projects. We can (and must) bump the version number which would force a system lib rebuild, but don't have a solution for user project code. So this could cause confusion and problems. |
Aren't we already making them recompile because of the other ABI changes for emscripten-core/emscripten#4340 ? This is no different from those. |
Well, it depends how we resolve that issue :) Last I heard @sunfishcode was going to try out a long double trick. |
@juj, curious what you think the risk is for existing users here. |
It's not just long doubles though, is it? The ABIs have to be the same top to bottom if we are going to have compatible bitcode for e.g. the unity use case. |
Basically it means that you won't be able to link bitcode object files compiled with 2 different versions of the toolchain. This isn't a common use case in the absence of dynamic linking, people generally would do a clean build if they change their toolchain anyway (and is why the fact that we haven't promised a stable ABI so far is not huge deal). There could be cases where someone is shipping a proprietary bitcode library to users who include it in their builds, where they will have to ensure that the versions match. But those users would also get a big benefit from the asm.js and wasm ABIs matching, so I think this would be good for them on balance anyway. |
Yeah, the risk is people with a bitcode blob of library code, or that happen to have a project where some files haven't been touched in a while so they weren't rebuilt (and their build system doesn't rebuild on a new compiler version). I wish we had a way to warn people about this (like if we stored the ABI hash in bitcode files), but currently don't. |
Maybe the first question was the one asked upthread, which is "which one to use?" Using unsigned long makes us the same as ARM (and much of our other ABI surface is like ARM already) and the same as a hypothetical wasm64 (for whatever that's worth; maybe not much). On the other hand unsigned int makes us the same as most other ILP32 targets (i386, X32, NaCl, Mips32, PPC32), so it's not 100% clear. |
One thing I don't get - what would be the benefit for wasm32 to be the same as wasm64 in this respect? Surely they can't be linked anyhow. |
Yeah, they can't be linked. But libraries that export mangled names (e.g. libcxx) would export the same set of symbols, which... might be nice? |
For what it's worth, I ran into this in the first place because wasm and asm use different symbols, so a test case that only uses |
BTW that discrepancy is actually the proximate cause that motivated this change in the first place. Currently asm.js and wasm have different signatures (and therefore different mangled names) for |
I see. Then another option might be to keep asm.js as is, and have different |
Right. Or even get rid of precomputed symbols files (although getting the symbols with |
Note that we use the precomputed symbol files for another purpose, to know which system libs need to be built. When a user installs emscripten, there are no system libs, and if they compile a file that needs libc++, we build libc++ based on the symbol file info. I agree having the same ABI makes sense, but how to do it and when are open questions. Two symbol files might be a temporary solution to unblock us for now. (e.g. maybe we can do that until we find a way to warn people about bitcode incompatibilities). For my side, I'm mostly waiting to hear from @juj (who is in another time zone), he has more experience than me with game devs using emscripten, and I want to hear his estimate of the risk of an ABI change like this. |
Right, makes sense. I think we should move away from building target libraries on demand, and just ship a sysroot, but that's a discussion for another bug :) So I pinged Roland for advice about int vs long (turns out I was wrong about EABI ARM, it's unsigned int too), and while there's (as we thought) not too many reasons to use one or the other on ILP32, he suggested unsigned int. So maybe we can just switch wasm to match all the other ILP32 targets in this case. |
Which winds up being a smaller change overall because we won't need to do anything with I'm ok with closing this then. |
cc @sunfishcode |
Per our discussion about the broader merging of the asm.js and wasm ABI, triple, etc: we eventually want to merge the ABIs. Also it might be nice to have the same symbols for wasm32, wasm64 and asm.js, which would mean unsigned long for size_t. But there are a few steps between there and here, and we don't want a bunch of incremental breakage of the asm.js ABI in the meantime. So I think the plan is for now to switch the wasm frontend to unsigned long to fix the breakage for now, and then move everything at once later. |
Switching to wasm frontend using unsigned ints here: https://reviews.llvm.org/D24134 |
Summary: The test being added in this patch used to cause an assertion failure: /build/./bin/clang -cc1 -internal-isystem /build/lib/clang/5.0.0/include -nostdsysteminc -verify -fsyntax-only -std=c++11 -Wshadow-all /src/tools/clang/test/SemaCXX/warn-shadow.cpp -- Exit Code: 134 Command Output (stderr): -- clang: /src/tools/clang/lib/AST/ASTDiagnostic.cpp:424: void clang::FormatASTNodeDiagnosticArgument(DiagnosticsEngine::ArgumentKind, intptr_t, llvm::StringRef, llvm::StringRef, ArrayRef<DiagnosticsEngine::ArgumentValue>, SmallVectorImpl<char> &, void *, ArrayRef<intptr_t>): Assertion `isa<NamedDecl>(DC) && "Expected a NamedDecl"' failed. #0 0x0000000001c7a1b4 PrintStackTraceSignalHandler(void*) (/build/./bin/clang+0x1c7a1b4) #1 0x0000000001c7a4e6 SignalHandler(int) (/build/./bin/clang+0x1c7a4e6) #2 0x00007f30880078d0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0xf8d0) #3 0x00007f3087054067 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x35067) #4 0x00007f3087055448 abort (/lib/x86_64-linux-gnu/libc.so.6+0x36448) #5 0x00007f308704d266 (/lib/x86_64-linux-gnu/libc.so.6+0x2e266) #6 0x00007f308704d312 (/lib/x86_64-linux-gnu/libc.so.6+0x2e312) #7 0x00000000035b7f22 clang::FormatASTNodeDiagnosticArgument(clang::DiagnosticsEngine::ArgumentKind, long, llvm::StringRef, llvm::StringRef, llvm::ArrayRef<std::pair<clang::DiagnosticsEngine::ArgumentKind, long> >, llvm::SmallVectorImpl<char>&, void*, llvm::ArrayRef<long>) (/build/ ./bin/clang+0x35b7f22) #8 0x0000000001ddbae4 clang::Diagnostic::FormatDiagnostic(char const*, char const*, llvm::SmallVectorImpl<char>&) const (/build/./bin/clang+0x1ddbae4) #9 0x0000000001ddb323 clang::Diagnostic::FormatDiagnostic(char const*, char const*, llvm::SmallVectorImpl<char>&) const (/build/./bin/clang+0x1ddb323) #10 0x00000000022878a4 clang::TextDiagnosticBuffer::HandleDiagnostic(clang::DiagnosticsEngine::Level, clang::Diagnostic const&) (/build/./bin/clang+0x22878a4) #11 0x0000000001ddf387 clang::DiagnosticIDs::ProcessDiag(clang::DiagnosticsEngine&) const (/build/./bin/clang+0x1ddf387) #12 0x0000000001dd9dea clang::DiagnosticsEngine::EmitCurrentDiagnostic(bool) (/build/./bin/clang+0x1dd9dea) #13 0x0000000002cad00c clang::Sema::EmitCurrentDiagnostic(unsigned int) (/build/./bin/clang+0x2cad00c) #14 0x0000000002d91cd2 clang::Sema::CheckShadow(clang::NamedDecl*, clang::NamedDecl*, clang::LookupResult const&) (/build/./bin/clang+0x2d91cd2) Stack dump: 0. Program arguments: /build/./bin/clang -cc1 -internal-isystem /build/lib/clang/5.0.0/include -nostdsysteminc -verify -fsyntax-only -std=c++11 -Wshadow-all /src/tools/clang/test/SemaCXX/warn-shadow.cpp 1. /src/tools/clang/test/SemaCXX/warn-shadow.cpp:214:23: current parser token ';' 2. /src/tools/clang/test/SemaCXX/warn-shadow.cpp:213:26: parsing function body 'handleLinkageSpec' 3. /src/tools/clang/test/SemaCXX/warn-shadow.cpp:213:26: in compound statement ('{}') /build/tools/clang/test/SemaCXX/Output/warn-shadow.cpp.script: line 1: 15595 Aborted (core dumped) /build/./bin/clang -cc1 -internal-isystem /build/lib/clang/5.0.0/include -nostdsysteminc -verify -fsyntax-only -std=c++11 -Wshadow-all /src/tools/clang/test/SemaCXX/warn-shadow.cpp Reviewers: rsmith Reviewed By: rsmith Subscribers: krytarowski, cfe-commits Differential Revision: https://reviews.llvm.org/D33207 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@303325 91177308-0d34-0410-b5e6-96231b3b80d8
AsmJS uses unsigned ints for size_t, which causes some libcxx functions to have different symbols in AsmJS than with Wasm, which uses unsigned longs.
In principle, long-term, we want AsmJS and Wasm to have the same ABI, so this serves as a first step toward ensuring that they match.