-
Notifications
You must be signed in to change notification settings - Fork 6
[WIP] GRDB Support #78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
func __leaseAll(callback: @escaping (Any, [Any]) -> Void) async throws { | ||
// TODO, actually use all connections | ||
do { | ||
try await pool.write { pointer in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @stevensJourney - it looks like you're about to need a way to iterate all available connections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi!, yes, in order to completely satisfy our internal driver requirements, we would require this.
I haven't full scanned through the GRDB docs yet, but I haven't seen a way yet to achieve this. Is this currently possible with GRDB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not yet, that would be a new feature. We'd have to clarify the exact required semantics.
In particular, I wonder if the abstract pool you have to conform to is assumed to run a fixed set of available connections, or if it is allowed to close existing connections and open new ones.
- If it is assumed to run a fixed set of available connections, then
leaseAll
makes sure that all future database access are impacted by the effects of the callback. - If the pool can close and open connections at will, then it is possible for a future database access to run in a connection that did not run the callback.
To lift the ambiguity, the semantics of leaseAll
need to be clarified.
There is another related clarification that is needed: can concurrent database accesses run during the execution of leaseAll
, or not?
Maybe there are other constraints that I'm not aware of. In all cases, make sure you make the semantics crystal clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, GRDB has a prepareDatabase
callback that executes code in any connection that opens, before it is made available to the rest of the application. This is where GRDB users register functions, collations, and make general connection setup:
var config = Configuration()
config.prepareDatabase { db in
// Setup stuff on demand
}
let dbPool = try DatabasePool(path: "...", configuration.config)
try dbPool.read { db in // or dbPool.write
// Setup has run on that connection, guaranteed
}
If the only goal of leaseAll
is to execute database code early, then I would suggest replacing it with a similar pattern in the PowerSync pool protocol. This should be possible in all target languages that have a notion of closure that can be executed later. And this would void all the complex semantics questions I have asked above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, if the goal of leaseAll
is to call sqlite3_db_release_memory()
, then DatabasePool
has a ready-made releaseMemory()
method.
In summary, I strongly suggest clarifying the intent behind leaseAll
, so that we avoid the XY problem (and leaseAll
has a big XY smell).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed responses and queries.
The current implementation of leaseAll
in this SDK assumes a lock is taken on all the current SQLite connections. The pool can be fixed or dynamically sized - the only requirement is that the lock be taken at the time of and during the request.
prepareDatabase
is a good solution for executing code when connections are opened, but this does not align with the current requirements of leaseAll
.
You are very correct about the XY Problem scenario described. For context, we allow users to update the PowerSync schema after the client has been initialised. E.g. This is triggered here, We currently apply the change using a write connection and thereafter refresh the schema on the read connections - preventing any reads during this operation: which might be invalid.
If are any alternatives for ensuring the Schema is refreshed on read connections, I think we could avoid requiring this functionality for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for the clarification 👍 When you're on the leaseAll implementation, please open a new discussion in the GRDB repo. All the building blocks are there, we just need to design a public API.
Sources/PowerSyncGRDB/GRDBPool.swift
Outdated
) { | ||
// Register the PowerSync core extension | ||
config.prepareDatabase { database in | ||
guard let bundle = Bundle(identifier: "co.powersync.sqlitecore") else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think on watchOS we would have to call the powersync_init_static()
C function instead, since everything is linked statically there.
schema: schema | ||
) | ||
if let schemaSource = schemaSource { | ||
self.schemaSource = schemaSource.then(powerSyncSchemaSource) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙂 This makes me happy when a feature is used as intended.
I would reverse the chain, though: powerSyncSchemaSource.then(schemaSource)
. You can, and should, do that because PowerSyncSchemaSource
is well-behaved: it returns nil for views it does not own.
The current code runs the current schema source before powerSyncSchemaSource
. This is fragile, because this only works if the current schema source is well-behaved. Unfortunately, you do not control it. You can write in your documentation that PowerSync requires a well-behaved schema source... But why bother, when you can chain in the opposite direction, and avoid all problems.
That's how then
rewards the well-behaved schema sources ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! That's a good point. We'll update the order of the chain :)
|
||
let brokerPointer = Unmanaged.passUnretained(updateBroker).toOpaque() | ||
|
||
/// GRDB only registers an update hook if it detects a requirement for one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution: you rely on undocumented behavior, here. This may break at any point in the future, unless we discuss and settle on a documented behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right - this implementation relies on undocumented behavior and should be used with caution. I developed this approach after examining GRDB's TransactionObserver
logic, but the approach here has clear limitations.
The core issue is that GRDB's table update notifications seem to work for changes made within GRDB-managed transactions. However, our PowerSync integration has different requirements:
- We use the raw SQLite connection pointer directly for operations
- We manage transactions internally after obtaining a write connection lease
This creates a mismatch with GRDB's expected usage patterns. While we could potentially restructure our base Kotlin SDK to better align with GRDB's transaction model, that would require more effort.
I implemented this solution because it was the fastest path forward, but I recognize it's far from ideal. Moving forward, I can think of two alternatives:
SQLite Session API: This might provide the information we need on our side, though I haven't investigated this approach yet.
GRDB API: Would it be possible to add functionality to GRDB that exposes table updates for our use case? I may have missed an existing API that could work here.
This current implementation definitely raises red flags due to its reliance on GRDB's internal hook management behavior. What are your thoughts on potential solutions or API additions that could provide a more robust integration path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are good questions. The first one is: since you take care of dealing with another schemaSource
in another place, but install your own update_hook, I don't quite know if you intend to share the database file with the user, or not.
If you share the database file with the user, allowing him to perform his own database observation through GRDB APIs, then installing your own update_hook will conflict.
If you do not share the database file with the user, then you do not need to handle the eventual other schemaSource
.
Please let's clarify this question before we proceed with finding the best technique.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick feedback!
The current vision is for the primary schema to be managed by PowerSync. Where a single DatabasePool
is used by both PowerSync and external GRDB calls and observations. The hopes being that:
- A DatabasePool can be instantiated which is configured to be compatible with PowerSync
- PowerSync configures the primary SQLite Views and underlaying tables based off the PowerSync client side schema provided. Possibly, in some advanced cases users might configure their own tables outside of PowerSync - these tables won't be touched by PowerSync - this is not necessarily a use-case we'd currently want to optimize for.
- PowerSync should use managed leases to the connections in the DatabasePool by making requests to
read
/write
(writeWithoutTransaction). - It's assumed that it should be safe for other code to use the same DatabasePool's
read
andwrite
methods alongside PowerSync.
The update hook here is only registered inside a writeWithoutTransaction
closure. Assuming no other external write operations could occur at this time? The updates detected by PowerSync are then relayed to observers with the notifyChanges
API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- PowerSync configures the primary SQLite Views and underlaying tables based off the PowerSync client side schema provided. Possibly, in some advanced cases users might configure their own tables outside of PowerSync - these tables won't be touched by PowerSync - this is not necessarily a use-case we'd currently want to optimize for.
All right. The schemaSource dance supports the rare cases where the user would like to define their own views.
- PowerSync should use managed leases to the connections in the DatabasePool by making requests to
read
/write
(writeWithoutTransaction).
Yeah, it's a write without transaction. I noticed that.
- It's assumed that it should be safe for other code to use the same DatabasePool's
read
andwrite
methods alongside PowerSync.
It should be, yes.
The update hook here is only registered inside a
writeWithoutTransaction
closure. Assuming no other external write operations could occur at this time? The updates detected by PowerSync are then relayed to observers with thenotifyChanges
API.
I understand that PowerSync compiles and executes statements from raw SQLite pointers, and the changes executed by those statements are not detected by GRDB TransactionObserver
, as documented.
Now, GRDB will generously install its own update hook, that will discard the one installed by PowerSync, at the first opportunity.
I don't have any ready-made solution at hand. Blame the SQLite API that does not allow registering several hooks.
The updates detected by PowerSync are then relayed to observers with the
notifyChanges
API.
This is a nice touch. You can even test this code, and assert that users can observe those changes.
But if you enrich your test suite and assert that users can observe other changes, you'll sooner or later discover that the PowerSync update hook has been removed, and that the PowerSync observation no longer works. Maybe you can try register a (I'm not sure this will work unless statements are compiled and run by GRDB).TransactionObserver
that reinstalls the PowerSync update hook in willCommit
and didRollback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for all the additional context and insight.
I mistakingly assumed that GRDB would register a new update hook inside a guarded write
closure - which would be blocked while PowerSync was using it's temporary update hook inside it's provided write
closure.
The context here was very valuable and guarded us from a tricky to debug bug in the future.
I've removed the use of a temporary Update hook now. The next easiest option from our side seems to be the SQLite Session API. I've implemented the majority of this logic on the Kotlin side of the SDK. PowerSync will now create a session whenever it receives a write connection. The changed tables will be determined and used to notify both GRDB observers and PowerSync observers. We'll still use GRDB to report changes originating from GRDB operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. In all cases, SQLite has one and single update hook, so any tool that does not allow sharing it between several observers will create problems with other tools. GRDB provides such sharing with TransactionObserver
(you can register multiple ones), but it is not a panacea either, because it only works well when statements are compiled and executed via GRDB APIs.
😬
Keeping the database private to PowerSync is surely the easiest way forward: preventing the app user from using GRDB observation guarantees that PowerSync can use whatever observation technique it wants without conflict.
And maybe I'm too cautious. Tests of expected scenarios can prove me wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this thoughtful feedback and for taking the time to guide us through these integration challenges. Your insights have been invaluable in helping us understand the nuances of GRDB's architecture.
You're absolutely right that GRDB's TransactionObserver
works best when statements are executed through GRDB's APIs, and I completely acknowledge that we are sidestepping those APIs. This approach is currently more convenient for integrating with our existing driver API requirements, though we may be able to update our core APIs in the future to enable a more natural integration.
We did actually test keeping the database connections separate in our very first POC investigations. However, our goals for the PowerSync integration led us toward the shared database approach:
Seamless bidirectional sync: We want to update a database as incoming changes are received while simultaneously detecting local changes for uploading. We need to monitor the local database for changes to trigger internal logic, while also ensuring our downloaded mutations propagate notifications to application observers.
Safe connection lease/locking: Background PowerSync operations shouldn't interfere with application code writes, and the DatabasePool
's connection management seemed like an elegant solution for this - preferable to relying solely on SQLite's built-in locking mechanisms.
While there are certainly alternatives, using the same database pool generally accommodates these requirements quite well.
For this initial alpha release, I think the most important criteria is ensuring PowerSync has a reliable method of being notified of local GRDB application edits - we can leverage TransactionObserver
for this. We also need a way to detect our own changes for reporting purposes - I believe the Session API, while heavier, should work for this use case without conflicting with TransactionObserver
.
I really appreciate your patience and expertise as we navigate these integration details. Your guidance has already saved us from potential debugging headaches down the road.
Overview
This builds off the amazing work of powersync-ja/powersync-kotlin#230. The linked PR adds support for creating a
PowerSyncDatabase
by supplying a customSQLiteConnectionPool
which manages SQLite connections.Some small additions and wip tests were added to the Kotlin implementation here
This PR adds Swift bindings to the Kotlin PR and then adds a
PowerSyncGRDB
Product to our SPM package. This product adapts a GRDB DatabasePool to aSQLiteConnectionPool
accepted by the PowerSync SDK.This makes use of the fact that GRDB exposes the raw C SQLite pointer for it's connections. We then use this pointer with the newly introduced native
Database
kotlin implementation.The
PowerSyncGRDB
product provides a means to generate a GRDBConfiguration
. This configuration is used when creating the GRDBDatabasePool
. We supply implementation for the configuration to load our PowerSync Rust core extension.APIs are still a WIP
Consumers can create a Pool with:
Consumers can then pass this pool when creating the PowerSyncDatabase.
Using the
DatabasePool
in the PowerSync SDK results in the same locking mechanisms being used between instances of thePowerSyncDatabase
andDatabasePool
. Consumers should be safe to alternate between both clients.Demo
A new GRDB demo application has been added to the
demos
folder. This demo uses GRDB queries in Views.grdbdemo.mp4
TODOs:
Implement LeaseAll to lease all GRDB connections if possible