-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial support for dynamically linked crates #134767
base: master
Are you sure you want to change the base?
Conversation
cc @m-ou-se |
This comment has been minimized.
This comment has been minimized.
I think that is indicative of a fundamental issue with your implementation. Adding new private impls should not change the ABI. The whole point of the RFC is to allow changing the implementation of the dylib without changing it's ABI. We already have support for dylibs where the ABI depends on the implementation. |
Also the interface file is missing either the Replacing all function bodies in an interface file with |
let interface_dir = interface.parent().unwrap(); | ||
let crate_name = crate_name.unwrap(); | ||
|
||
let res = std::process::Command::new(compiler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be done either within the current compiler session or as a separate compiler session in the same process. As implemented this will break rustdoc and any other custom driver that has a different cli from rustc itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a blocking issue? I don't see an easy way to do this. (I have left a FIXME
comment at the moment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not, but at the very least make sure to put the produced rmeta file in a temporary directory. As is you could have multiple rustc instances race trying to write an rmeta file and then read a partially written rmeta file.
Maybe an alternative to invoking rustc from within rustc would be to require the end user to compile interface files to rmeta files themself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an alternative to invoking rustc from within rustc would be to require the end user to compile interface files to rmeta files themself?
My concern is that in Rust, files generally do not have a special meaning, unlike C++. A translation unit i.e. a crate is not a single file, it consists of modules. Modules, in turn, can be declared either in one file or divided into several. That's why the "interface file" isn't a very coherent concept in Rust. I would like to avoid adding an additional level of complexity for users until it is proven necessary. Therefore, the initial plan was to make the interfaces completely invisible to users. I also planned to put them in the dylib, but this has not been done yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A translation unit i.e. a crate is not a single file, it consists of modules.
There is a single interface file for the entire crate, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but at the very least make sure to put the produced rmeta file in a temporary directory
Done
A stable ABI across rustc versions is a lot more involved than just mangling all symbols the same. You have to make sure multiple copies of the standard library seamlessly interoperate including using the same Edit: To put it another way, I did strongly prefer if stable ABI across rustc versions and stable ABI across crate versions with a single rustc version are treated as entirely separate problems implemented entirely separately. For stable ABI across crate versions you don't need to generate interface files. You can just reuse the existing rmeta file format, but only encode the subset of the items corresponding to the stable ABI. |
I can suggest the following solution for the above problem with impl's: Split indices (disambiguators) into 2 sets:
One of the main objectives of this proposal is to allow building the library and the application with different compiler versions. This requires either a metadata format compatible across rustc versions or some form of source code. Stable metadata format is not a part of the RFC.
"stable ABI" is a bad wording here. Neither I nor the RFC offers a stable ABI. And all these issues are outside the scope of the proposal. |
These two statements are conflicting with each other. Being able to build a library and application with a different compiler version requires both to share an ABI, in other words it requires having a stable ABI. |
Only |
This comment was marked as resolved.
This comment was marked as resolved.
This comment has been minimized.
This comment has been minimized.
This comment was marked as resolved.
This comment was marked as resolved.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment was marked as resolved.
This comment was marked as resolved.
Some changes occurred in compiler/rustc_codegen_gcc Some changes occurred in compiler/rustc_passes/src/check_attr.rs Some changes occurred in src/tools/rustfmt cc @rust-lang/rustfmt This PR modifies cc @jieyouxu These commits modify the If this was unintentional then you should revert the changes before this PR is merged. |
r? @bjorn3 (since you already did a bunch fo reviewers, or reroll) |
I can split it into several PR's/commits if necessary. |
@@ -1002,7 +1002,9 @@ impl<'a, 'tcx> Builder<'a, 'tcx> { | |||
self.source_scope = source_scope; | |||
} | |||
|
|||
if self.tcx.intrinsic(self.def_id).is_some_and(|i| i.must_be_overridden) { | |||
if self.tcx.intrinsic(self.def_id).is_some_and(|i| i.must_be_overridden) | |||
|| self.tcx.is_interface_build() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should just skip building MIR entirely rather than adding a ad-hoc hack like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how to do this correctly. Many analysis passes depend on mir_built
. Adding checks around these passes seems like a bad idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe skip it in encode_mir
in metadata encoding?
Since we are in --emit=metadata
most MIR won't (?) be built unless it's needed for encoding.
let interface_dir = interface.parent().unwrap(); | ||
let crate_name = crate_name.unwrap(); | ||
|
||
let res = std::process::Command::new(compiler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not, but at the very least make sure to put the produced rmeta file in a temporary directory. As is you could have multiple rustc instances race trying to write an rmeta file and then read a partially written rmeta file.
Maybe an alternative to invoking rustc from within rustc would be to require the end user to compile interface files to rmeta files themself?
@rustbot ready |
☔ The latest upstream changes (presumably #137927) made this pull request unmergeable. Please resolve the merge conflicts. |
Ping @bjorn3 , since it has been 2 weeks. |
@@ -82,6 +82,8 @@ struct AstValidator<'a> { | |||
/// Used to ban explicit safety on foreign items when the extern block is not marked as unsafe. | |||
extern_mod_safety: Option<Safety>, | |||
|
|||
is_interface: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_interface: bool, | |
is_sdylib_interface: bool, |
In other places (pretty-printing, etc) too.
This will be more useful to any unsuspecting reader looking at the code.
s.s.eof() | ||
} | ||
|
||
pub fn print_crate_as_interface( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok to add a new flag to fn print_crate
instead of introducing a new function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a new function. For the interface dump, PpAnn
must always be fixed and comments must always be skipped.
@@ -2872,7 +2876,11 @@ fn add_upstream_rust_crates( | |||
} | |||
Linkage::Dynamic => { | |||
let src = &codegen_results.crate_info.used_crate_source[&cnum]; | |||
add_dynamic_crate(cmd, sess, &src.dylib.as_ref().unwrap().0); | |||
let lib = match &src.dylib.as_ref() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let lib = match &src.dylib.as_ref() { | |
let lib = match &src.dylib.as_ref().unwrap_or_else(...) |
@@ -44,7 +44,7 @@ pub fn crates_export_threshold(crate_types: &[CrateType]) -> SymbolExportLevel { | |||
} | |||
|
|||
fn reachable_non_generics_provider(tcx: TyCtxt<'_>, _: LocalCrate) -> DefIdMap<SymbolExportInfo> { | |||
if !tcx.sess.opts.output_types.should_codegen() { | |||
if !tcx.sess.opts.output_types.should_codegen() && !tcx.is_interface_build() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we even get to this logic in --emit=metadata
builds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see encode_crate_root
.
@@ -172,7 +172,7 @@ fn exported_symbols_provider_local( | |||
tcx: TyCtxt<'_>, | |||
_: LocalCrate, | |||
) -> &[(ExportedSymbol<'_>, SymbolExportInfo)] { | |||
if !tcx.sess.opts.output_types.should_codegen() { | |||
if !tcx.sess.opts.output_types.should_codegen() && !tcx.is_interface_build() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as for fn reachable_non_generics_provider
.
) | ||
})?; | ||
|
||
let tmp_path = match tempfile::tempdir() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other code uses TempFileBuilder
for this.
let tmp_path = match tempfile::tempdir() { | ||
Ok(tmp_path) => tmp_path, | ||
Err(error) => { | ||
return Err(MetadataError::LoadFailure(format!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rustc_metadata
has the FailedCreateTempdir
error type for this.
} | ||
}; | ||
|
||
let crate_name = dyn_crate.unwrap_or(kw::Empty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let crate_name = dyn_crate.unwrap_or(kw::Empty); | |
let crate_name = dyn_crate.unwrap(); |
See the comment about assert above.
@@ -1746,7 +1746,9 @@ impl<'tcx> TyCtxt<'tcx> { | |||
MetadataKind::None | |||
} | |||
CrateType::Rlib => MetadataKind::Uncompressed, | |||
CrateType::Dylib | CrateType::ProcMacro => MetadataKind::Compressed, | |||
CrateType::Dylib | CrateType::ProcMacro | CrateType::Sdylib => { | |||
MetadataKind::Compressed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sdylib
s aren't supposed to have metadata (well, in the usual unstable format at least).
Unless the interface libraries are also considered CrateType::Sdylib
s during the build.
CrateType::Sdylib
with and without is_interface_build
behave quite differently.
Maybe they should be split into two crate types, or CrateType::Sdylib
should be reset to CrateType::Rlib
or something during the interface build? Not sure.
Or maybe crate_type
and is_interface_build
should just be passed together to all places where the distinction is necessary.
.unwrap_or(self.target.dll_suffix.as_ref()); | ||
let dylib = interface.0.with_extension(suffix); | ||
if !dylib.exists() { | ||
return Err(CrateError::ExternLocationNotExist(self.crate_name, dylib)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, all these errors should be covered by tests (unless they occur due to some external system failures that are hard to reproduce).
Same applies to the new errors in compiler/rustc_passes/src/errors.rs
.
tcx: TyCtxt<'tcx>, | ||
exportable_items: FxIndexSet<DefId>, | ||
in_exportable_mod: bool, | ||
catch_exportable_in_mod: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catch_exportable_in_mod: bool, | |
seen_exportable_in_mod: bool, |
if self.tcx.get_attr(def_id, sym::export).is_some() { | ||
self.in_exportable_mod = true; | ||
} | ||
let old_catch_exportable_in_mod = self.catch_exportable_in_mod; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let old_catch_exportable_in_mod = self.catch_exportable_in_mod; | |
let old_catch_exportable_in_mod = mem::replace(&mut self.catch_exportable_in_mod, false); |
&& impl_.of_trait.is_none() | ||
&& self.tcx.is_exportable(item.owner_id.def_id.to_def_id()) | ||
{ | ||
self.order.insert(item.owner_id.def_id.to_def_id(), self.id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.id
is always equal to self.order.len()
here, no need to track it.
stable_order_of_exportable_impls, | ||
..*providers | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't look at the logic in this file in detail, it 100% has issues, but it should be ok approximation for a first prototype.
} | ||
match item.kind { | ||
ItemKind::ExternCrate(Some(orig_name)) | ||
| ItemKind::ExternDynCrate(Some(orig_name)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extern dyn crate self as foo;
should produce an error even earlier, so this probably shouldn't be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extern dyn crate self as foo;
should produce an error even earlier
I tried it. It is not true.
@@ -1049,6 +1050,10 @@ impl OutputFilenames { | |||
.unwrap_or_else(|| OutFileName::Real(self.output_path(flavor))) | |||
} | |||
|
|||
pub fn interface_path(&self) -> PathBuf { | |||
self.out_directory.join(format!("lib{}.{}", self.crate_stem, RUST_EXT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.out_directory.join(format!("lib{}.{}", self.crate_stem, RUST_EXT)) | |
self.out_directory.join(format!("lib{}.rs", self.crate_stem)) |
The constant isn't used anywhere else.
} | ||
|
||
impl CrateType { | ||
pub fn has_metadata(self) -> bool { | ||
match self { | ||
CrateType::Rlib | CrateType::Dylib | CrateType::ProcMacro => true, | ||
CrateType::Rlib | CrateType::Dylib | CrateType::ProcMacro | CrateType::Sdylib => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as in https://github.com/rust-lang/rust/pull/134767/files#r2001606923.
use rustc_span::symbol::{Symbol, sym}; | ||
|
||
trait AbiHashStable<'tcx> { | ||
fn hash(&self, tcx: TyCtxt<'tcx>, hasher: &mut StableHasher); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a trait? It's not currently used generically.
Also, probably better to rename the method to something like abi_hash
to avoid confusing it with regular hash
.
v0::mangle(tcx, instance, instantiating_crate) | ||
}), | ||
let mut symbol = match tcx.is_exportable(def_id) { | ||
true => v0::mangle(tcx, instance, instantiating_crate), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true => v0::mangle(tcx, instance, instantiating_crate), | |
true => format!("{}_{}", v0::mangle(tcx, instance, instantiating_crate), export::compute_hash_of_export_fn(tcx, instance)) |
@@ -32,6 +32,7 @@ pub(super) fn mangle<'tcx>( | |||
let mut cx: SymbolMangler<'_> = SymbolMangler { | |||
tcx, | |||
start_offset: prefix.len(), | |||
is_exportable: tcx.is_exportable(def_id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe pass this as an argument to fn mangle
, we already know whether it's exportable when calling this.
There are also several big, but non-blocking, issues here.
|
The PR description also needs some updates. |
This PR is an initial implementation of rust-lang/rfcs#3435 proposal.
component 1: interface generator
Interface generator - a tool for generating a stripped version of crate source code. The interface is like a C header with only "exported" items included, and function bodies are omitted. For example, initial crate:
generated interface:
In this example
bar
function is not a part of the interface as it doesn't have#[export]
attribute.To emit interface use a new
sdylib
crate type which is basically the same asdylib
, but it also produces the interface as a second artifact. The current interface name islib{crate_name}.rs
.Interface generator was implemented as part of the HIR pretty-printer. In order to filter out unnecessary items, the
PpAnn
trait was used.Why was it decided to use a design with an auto-generated interface?
One of the main objectives of this proposal is to allow building the library and the application with different compiler versions. This requires either a metadata format compatible across rustc versions or some form of a source code. The option with a stable metadata format was rejected because it is not a part of the RFC. (discussion)
Regarding the design with interfaces there are 2 possibilities: manually written or auto-generated. I would personally prefer the auto-generated interface for the following reason: we can put it in the dynamic library instead of metadata, which will make it completely invisible to users. (this was my initial plan, but since the PR is already big enough, I decided to postpone it)
But even if we end up with a different design, I believe the interface generator could be a useful tool for testing and experimenting with the entire feature.
component 2: crate loader
When building dynamic dependencies, the crate loader searches for the interface in the file system, builds the interface without codegen and loads it's metadata. For now, it's assumed that interface and dynamic lib are located in the same directory.
extern dyn crate
annotation serves as a signal for the building of a dynamic dependency.Here are the code and commands that corresponds to the compilation process:
P.S. The interface name/format and rules for file system routing can be changed further.
component 3: exportable items collector
Query for collecting exportable items. Which items are exportable is defined here .
component 4: "stable" mangling scheme
The mangling scheme proposed in the RFC consists of two parts: a mangled item path and a hash of the signature.
mangled item path
For the first part of the symbol it has been decided to reuse the
v0
mangling scheme as it is mostly independent of the compiler internals.The first exception is an impl's disambiguation. During symbol mangling rustc uses a special index to distinguish between two impls of the same type in the same module(See
DisambiguatedDefPathData
). The calculation of this index may depend on private items, but private items should not affect the ABI. Example:In order to make disambiguation independent of the compiler version we can assign an id to each impl according to their relative order in the source code. However, I have not found the right way to get this order, so I decided to use:
intravisit::Visitor
traversal.#[rustc_stable_impl_id]
that outputs this sequential number as a compiler error. If the visitor's implementation is changed, the corresponding test will fail. Then you will need to rewrite the implementation ofstable_order_of_exportable_impls
query to preserve order.P.S. is it possible to compare spans instead?
The second exception is
StableCrateId
which is used to disambiguate different crates.StableCrateId
consists of crate name,-Cmetadata
arguments and compiler version. At the moment, I have decided to keep only the crate name, but a more consistent approach to crate disambiguation could be added in the future.hash of the signature
Second part of the symbol name is 128 bit hash containing relevant type information. For now, it includes:
#[export(unsafe_stable_abi = "hash")]
syntax for ADT types with private fields is not yet implemented.