-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GeneratorEngine
module with tests and macros
#31
Conversation
In subsequent PRs I'd like to introduce a SQLite-backed cache for storing hashes of generated artifacts. That will help us in avoiding redundant work and will make generator significantly quicker for re-runs that change only a few arguments.
@swift-ci test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vendored from TSC with minimal changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend to use Vapor's sqlite-nio over this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like this code to be easily portable to Windows, and IIUC sqlite-nio will make that much harder for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vendored from TSC with minimal changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vendored from TSC with minimal changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you really need file locks? Those have very bad and annoying properties
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be a good alternative for that?
@swift-ci test |
@swift-ci test |
@swift-ci test |
@swift-ci test |
@swift-ci test |
Curious to hear @weissi's opinion on this too, specifically |
|
||
extension FilePath: CacheKeyProtocol { | ||
public func hash(with hashFunction: inout some HashFunction) { | ||
self.description.hash(with: &hashFunction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.description.hash(with: &hashFunction) | |
self.string.hash(with: &hashFunction) |
|
||
extension Optional: CacheKeyProtocol where Wrapped: CacheKeyProtocol { | ||
public func hash(with hashFunction: inout some HashFunction) { | ||
if let self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not ideal. You want to do hash
hashOf(0 or false)
if nil
. And hashOf(1 or true) ++ hashOf(self)
if not nil.
Otherwise you'll get hash unnecesary hash collisions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaxDesiatov Does the code here address @weissi 's comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why it doesn't show "outdated" here, I've updated it this way to address the comment:
extension Optional: CacheKeyProtocol where Wrapped: CacheKeyProtocol {
public func hash(with hashFunction: inout some HashFunction) {
"Swift.Optional".hash(with: &hashFunction)
if let self {
true.hash(with: &hashFunction)
self.hash(with: &hashFunction)
} else {
false.hash(with: &hashFunction)
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that might be it, but I was having trouble following the PR history to check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend to use Vapor's sqlite-nio over this
Sources/GeneratorEngine/Engine.swift
Outdated
} | ||
|
||
deinit { | ||
try! resultsCache.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fraught with peril. Not guaranteed to happen, might block etc. Would very much recommend to make the lifecycle explicit and to assert in deinit that it has been shut down.
Sources/GeneratorEngine/Engine.swift
Outdated
get async throws { | ||
var hashFunction = SHA512() | ||
query.hash(with: &hashFunction) | ||
let key = hashFunction.finalize().description |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.description
's output is usually not API. Is this truly the right thing to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed SQLiteBackedCache
type to take some Sequence<UInt8>
as keys to improved the situation. But there are still types like FileCacheRecord
that are encoded into JSON and we need to serialize hashes as strings for that. But that stems from the fact that SQLiteBackedCache
is a NoSQL store at this stage. I'll normalize it to make it properly relational in a future PR.
return nil | ||
} | ||
|
||
return .init(buffer[0..<bytesRead]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this allocates and copies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you probably want to removeLast(n) instead
} | ||
|
||
deinit { | ||
try! fileDescriptor.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a very bad idea. Managing file descriptors with deinit
is all but guaranteed to break everything.
Make the lifecycle explicit or you'll suffer hard. Also close
should be async
.
I'd recommend try await withOpenFile(...) { file in try await file.write(...) }
such that you can't forget the closes.
func read<V: Decodable>(_ path: FilePath, as: V.Type) async throws -> V { | ||
let fileStream = try await self.read(path) | ||
var bytes = [UInt8]() | ||
for try await chunk in fileStream { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd ask the user to provide a limit here and maybe default it to 10 MiB or something. Limitless memory ballooning is never a good API.
bytes += chunk | ||
} | ||
|
||
return try decoder.decode(V.self, from: .init(bytes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this allocates and copies from
Array
->Data
. Would recommend to accumulate inData
- would recommend to never use
.init
, it obfuscates the code a lot. Some.init
s allocate & copy, others don't, very very hard to see here what's going on.
|
||
func write(_ path: FilePath, _ value: some Encodable) async throws { | ||
let data = try encoder.encode(value) | ||
try await self.write(path, .init(data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocate & copy here too (might be hard to avoid)
macOS CI currently blocked by swiftlang/swift#69485 due to an older version of Xcode installed on CI that doesn't support extension macros, so we'll have to either force merge or disable the macOS check temporarily |
@swift-ci test |
@swift-ci test |
@swift-ci test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've built and tested an SDK with this changeset. The non-Docker case works, and not rebuilding lld
is a welcome speedup.
I'm having problems with --with-docker
SDKs, both for RHEL and Ubuntu:
% swift build --experimental-swift-sdk 5.9-RELEASE_ubuntu_jammy_x86_64
Building for debugging...
warning: Could not read SDKSettings.json for SDK at: /Users/euanh/Library/org.swift.swiftpm/swift-sdks/5.9-RELEASE_ubuntu_jammy_x86_64.artifactbundle/5.9-RELEASE_ubuntu_jammy_x86_64/x86_64-unknown-linux-gnu/ubuntu-jammy.sdk
<unknown>:0: warning: glibc not found for 'x86_64-unknown-linux-gnu'; C stdlib may be unavailable
<unknown>:0: warning: glibc not found for 'x86_64-unknown-linux-gnu'; C stdlib may be unavailable
warning: Could not read SDKSettings.json for SDK at: /Users/euanh/Library/org.swift.swiftpm/swift-sdks/5.9-RELEASE_ubuntu_jammy_x86_64.artifactbundle/5.9-RELEASE_ubuntu_jammy_x86_64/x86_64-unknown-linux-gnu/ubuntu-jammy.sdk
error: link command failed with exit code 1 (use -v to see invocation)
ld.lld: error: cannot open Scrt1.o: No such file or directory
ld.lld: error: cannot open crti.o: No such file or directory
ld.lld: error: unable to find library -lc
ld.lld: error: cannot open crtn.o: No such file or directory
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)
error: fatalError
[5/6] Linking hello-world
However I can reproduce these problems on main
, even going back to 321e22a. There might be something wrong with my local setup, but it certainly doesn't seem to be a regression caused by this change. I'll continue investigating but I don't think it should block merging.
We'd like to avoid redundant rebuilding and regeneration for inputs that don't change. The new
GeneratorEngine
module achieves that by introducing aQuery
abstraction, which can be hashed and matched to its cached outputs if any are available.The new
Engine
actor can run suchQuery
and immediately return a cached result file if one is present and its hash matches the recorded one.This required introducing new
@Query
and@CacheKey
macros for making hashing easy and consistent onQueryProtocol
types and their inputs. We can't use the standardHashable
protocol, as that produces new hashes on every process launch due to randomized hashing seeds. With Swift Crypto'sHashFunction
we can generate consistent hashes and record them inSQLiteBackedCache
.To make the caching engine testable, I've introduced a new
FileSystem
actor protocol withVirtualFileSystem
andLocalFileSystem
actor implementations.