Description
Issue
Objects are currently passed via cloud storage (e.g., S3 or GCS).
This requires the uploader and the downloader to agree on the URL.
Also, both of them need to be given permanent permissions, which makes overall access control hard to manage.
Proposal
Instead, we could make it simpler by using signed URLs.
For example, when a TEE instance finishes, it can upload its data to a GCS bucket, then sign the URL to the object and put it in the database.
With this approach, the frontend doesn't need to know anything about the GCS directory structure. Also, permission control is already handled by the API.
There are a few places we would like to replace download/upload with signed URLs.
frontend uploading workspace tar file
- jupyterlab server submits a job
- the API returns a PUT-signed URL for uploading the workspace (expires in 1 hour)
- jupyterlab uploads the workspace tar
- the Reconciler waits until the object becomes available
- the Reconciler starts the kaniko service with the GCS URL (e.g., --context=gs://<internal object address>)
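The "waits until the object becomes available" step could be sketched as a simple polling loop. This is a minimal illustration, not the Reconciler's actual code: `wait_for_object`, its parameters, and the default timeout (chosen here to match the 1-hour upload URL expiry) are all assumptions, and the existence check is injected so that any storage client can supply it.

```python
import time
from typing import Callable


def wait_for_object(
    object_exists: Callable[[], bool],
    timeout_s: float = 3600.0,      # assumption: same as the PUT URL lifetime
    poll_interval_s: float = 5.0,   # assumption: arbitrary polling period
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll object_exists() until it returns True or timeout_s elapses.

    Returns True if the object showed up in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if object_exists():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The Reconciler would pass in whatever check its storage client offers and start the kaniko build only when the function returns True. Giving up once the upload URL has expired avoids waiting for an upload that can no longer succeed.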
frontend downloading output of execution
- the API generates a PUT-signed URL for the output file when building the image (expires after the job timeout = 6 hours)
- the API passes it as a build-arg
- after the TEE runs, it uploads the output file to the signed URL
- when the jupyterlab server sends a download-job-output request, the API returns a GET-signed URL (expires in 1 hour)
- the frontend downloads the job output via the URL
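On the client side, both halves of this flow are plain HTTP calls against the signed URL, with no cloud credentials involved. The sketch below uses only the Python standard library. The function names are hypothetical, and the content type is an assumption: if the URL was signed with a specific content type, the header sent here would generally need to match it.

```python
import urllib.request


def build_upload_request(
    signed_url: str,
    payload: bytes,
    content_type: str = "application/octet-stream",  # assumption
) -> urllib.request.Request:
    """Build the PUT request the TEE would send to a PUT-signed URL."""
    return urllib.request.Request(
        signed_url,
        data=payload,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def upload(signed_url: str, payload: bytes) -> int:
    """Upload payload to the signed URL and return the HTTP status code."""
    with urllib.request.urlopen(build_upload_request(signed_url, payload)) as resp:
        return resp.status


def download(signed_url: str) -> bytes:
    """Fetch the object behind a GET-signed URL."""
    with urllib.request.urlopen(signed_url) as resp:
        return resp.read()
```

The TEE would call `upload` with the URL it received as a build-arg, and the frontend would call `download` with the URL returned by the API. Neither side needs to know the bucket or object name.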
frontend getting attestation report
Same as job output, but as a different build-arg.
Benefit
This design is beneficial in many ways:
- API is solely responsible for managing objects and the permissions.
- Object URLs are completely abstracted away from all other parts. No need to pass configs around (e.g., bucket, directory, object names, ...)
- No complicated IAM policy needed for managing different writer/reader roles
- Permissions are temporary and fully controlled by the API, instead of being permanent permissions granted through terraform
- No unnecessary file copies between frontend <> API and API <> reconciler.
Changes Needed
- Add new columns to Job for signed URLs.
- Add an interface to generate PUT- and GET-signed URLs for arbitrary objects.
- Pass the signed URL when initiating an image build job.
- Have the TEE instance upload the output to the signed URL.
- Replace file upload/download in the frontend with signed URLs.
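To make the "interface to generate PUT- and GET-signed URLs" more concrete, here is a toy sketch of what such an interface could look like. It is not GCS's signing scheme: it uses a plain HMAC over the method, object path, and expiry purely to illustrate that a signed URL is bound to one HTTP method and one time window. A real implementation would presumably delegate to the storage provider (for GCS, the Python client exposes `Blob.generate_signed_url`). The class name, constants, and host name are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

UPLOAD_TTL_S = 60 * 60        # 1 hour, as for workspace uploads and downloads
JOB_TIMEOUT_S = 6 * 60 * 60   # 6 hours, as for job output uploads


class UrlSigner:
    """Toy signer: illustrates method-bound, expiring URLs (not GCS V4 signing)."""

    def __init__(self, secret: bytes, base_url: str) -> None:
        self._secret = secret
        self._base_url = base_url.rstrip("/")

    def _signature(self, method: str, object_path: str, expires: int) -> str:
        message = f"{method}\n{object_path}\n{expires}".encode()
        return hmac.new(self._secret, message, hashlib.sha256).hexdigest()

    def sign(self, method: str, object_path: str, ttl_s: int,
             now: Optional[float] = None) -> str:
        """Return a URL valid for `method` on `object_path` for ttl_s seconds."""
        now = time.time() if now is None else now
        expires = int(now) + ttl_s
        query = urlencode({
            "expires": expires,
            "signature": self._signature(method, object_path, expires),
        })
        return f"{self._base_url}/{object_path}?{query}"

    def verify(self, method: str, url: str, now: Optional[float] = None) -> bool:
        """Check that the URL is unexpired and was signed for this method."""
        parsed = urlparse(url)
        params = parse_qs(parsed.query)
        try:
            expires = int(params["expires"][0])
            signature = params["signature"][0]
        except (KeyError, ValueError):
            return False
        now = time.time() if now is None else now
        if now > expires:
            return False
        object_path = parsed.path.lstrip("/")
        expected = self._signature(method, object_path, expires)
        return hmac.compare_digest(signature, expected)
```

For example, the API could call `signer.sign("PUT", "jobs/42/output.tar", JOB_TIMEOUT_S)` when creating the build job and store the result in the new Job column, then call `signer.sign("GET", ..., UPLOAD_TTL_S)` on each download request. A URL signed for PUT does not verify for GET, which is what lets the API hand out write-only and read-only access separately.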