Skip to content

Conversation

@dayeol
Copy link
Contributor

@dayeol dayeol commented Nov 5, 2024

2/N PR for a major refactoring of DCR Monitor. See #15.

1/N: #16

Changes

  • Remove "UpdateJobStatus" endpoint from the API. Everything must be handled in the reconciler instead.
  • Added imagebuilder interface, which currently has one image builder implementation which is Kaniko
  • Moved RunJob logic into the reconciler. Now TEE is launched by the reconciler via LaunchInstance method in TEEProvider.
  • Removed bunch of shallow interfaces from Config. Removed bunch of unnecessary global configs, and made them local.
  • Completely removed KMS logic. We don't need user keys and stage1/stage2 keys in the open source version
  • Completely removed PrepareResourcesForUser. Any logic needed for TEE should go to TEEProvider.
  • Removed project_number from config, because it is not needed anymore.
  • Removed logic for creating/updating workload identity pool provider. It's not used any longer so let's revisit later when we implement data confidentiality.

Caveat

  • This still doesn't support getting attestation report.

Testing

Added unit tests, and also did manual end-to-end testing.

hlog.Infof("[KanikoJobMonitor]job name: %v, job status: %v", k8sJob.Name, k8sJob.Status.Conditions[0].Type)

if k8sJob.Status.Conditions[0].Type == batchv1.JobComplete {
image, digest, err := b.getImageDigestAndDeleteJob(k8sJob.Name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about devide the getImageDigestAndDeleteJob function into two functions. Something like getImageDigest and deleteJob

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

OIDC OIDC `json:"oidc"`
}

// FIXME: duplicated from pkg/cloud
Copy link
Collaborator

@privacywill privacywill Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean we should remove the duplicated struct in the pkg/cloud

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have completely deduplicated it by removing PrepareResourcesForUser.

Copy link
Collaborator

@privacywill privacywill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks your work!

return req
}

func (c *TEEProviderGCPConfidentialSpace) createWorkloadIdentityPoolProvider(name string, digest string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workload identity pool is used for allowing the confidential space instance to use the kms key. Since the kms key has been totally removed, we should think whether the workload identity is required any more.

@dayeol dayeol merged commit 775a0a2 into main Nov 6, 2024
1 check passed
@dayeol dayeol deleted the dayeol-refactor-reconciler-2 branch November 6, 2024 21:32
@dayeol dayeol restored the dayeol-refactor-reconciler-2 branch November 9, 2024 00:44
@dayeol dayeol deleted the dayeol-refactor-reconciler-2 branch November 9, 2024 00:46
dayeol added a commit that referenced this pull request Nov 13, 2024
2/3 PR for a major refactoring of DCR Monitor. See
#15.

1/3: #16
2/3: #19

This pretty much concludes the major refactoring, resolving #15

# Changes

* Added `Created` job status, because the reconciler needs to handle
newly created jobs.
* Move KanikoService functionality into the monitor (=reconciler)
* Remove unused config `Cluster.PodServiceAccount`
* Clean up ARG/ENV of TEE dockerfile
* In-memory Dockerfile injection in API. See `addDockerfileToTarGz`.
This resolves #3. Also Kaniko can just refer to the remote object path.
* Removed most of shallow functions in `Config`
* Completely removed `CloudProvider`
* Removed shallow / no-longer used modules in `pkg`:
`pkg/utils/constant.go`, `pkg/utils/file.go`, and `pkg/utils/k8s.go`.

# Testing

Manual end-to-end testing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants