-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pause-resume-frozen functionality #168
Conversation
@artgromov Among the checking the core content of the PR, please pay attention to how I implemented the change in Storage webhook. To check that the Storage has no active Databases, a simplest solution would be to just query k8s api server from a webhook. But I feel it is an EXTREMELY BAD idea to query the k8s api server from a webhook (can explain in detail if necessary), that's why I keep the info about databases in the Storage.Status.ConnectedDatabases and update it from the Database controller. I don't see any downsides in this approach (we spend additionally a bit of cpu cost and a roundtrip to api server on every database reconcile), but the webhook has much smaller chance of failing, and has one less roundtrip to api server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, this PR introduces unnecessary complexity to the controller's logic. Here are two point that can be improved:
-
Changing replicas to 0, instead of deleting a StatefulSet object
It will achieve the same result and avoiding destructive operations that could lead to data loss. For examplepersistentVolumeReclaimPolicy
setting may be set by client toDelete
, which is default when using dynamic provisioning. -
Checking dependent databases when deleting Storage using a List operation
Using the.status
field in the Storage object to store a list of Databases may overload the object with unnecessary information and affect cluster performance. Instead of this you can perform a List operation in webhook with usingFieldSelector
for the.spec.storageRef
field using indexing as described here
api/v1alpha1/database_types.go
Outdated
// `Running` means the default state of the system, all Pods running. | ||
// +kubebuilder:default:=Running | ||
// +optional | ||
Pause string `json:"pause,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think better to use two different parameters for frozen and pause features, because it's orthogonal to each other. Also I think usage of string variable is not simple and clear
Therefore I suppose to Pause
parameter should implement logic to force unscale/delete StatefulSet and be bool variableand for example operatorSync
parameter should implement logic to disable operator Reconcile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you underestimate the complexity that comes from introducing 2 orthogonal fields, and they are NOT completely orthogonal. If we do as you suggest, we will have 4 states: (non-frozen, non-paused)
, (frozen, non-paused)
, (non-frozen, paused)
and (frozen, paused)
. I'm not sure if one of these states - (frozen, paused)
- is useful. If we introduce these 2 arguments, we will have to process 6 different transitions (number of edges in the graph with 4 vertices), and this will bring a lot of complexity into the codebase. We may additionally discuss this later in voice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frozen
parameter does not correlate with state of Storage/Database and it's dependent objects. What do I mean by that? The main goal of this feature that we can manual edit StatefulSet/ConfigMap/Service and another dependent objects with sure that ydb-operator do not override our changes. This operation is needed for example during maintenance and migration actions. During this period, the YDB engineer assumes control of the Storage/Database (logical state of cluster without changes of .status fields) under his responsibility. We can not stop StatefulSet (or Service for example) k8s controller, but client can use some special k8s tricks like orphan=true
parameter to prevent cascade deletion or kubectl replace
to change spec with deletion of object. In that case we can only guarantee that 1) ydb-operator reconcile disabled and does not sync CR and dependent objects 2) no one parameter can be changed while Reconcile function is disabled and using webhook to verify that (we can edit CR resources only when state Frozen
set to false)
In opposite Pause
parameter correlate with state of Storage/Database and it's dependent objects. The main goal of this feature that we can do force stop processes without additional checks. This operation is needed ,for example, during ad-hoc situations and cluster monstrous problems. Also we should change object's state to Paused
which indicate that it's desired state when no one process is running. Additional function of this field is prevent deletion of Storage/Database object (we can delete CR resources only when state Pause
set to false + for Storage we should additional check existing of dependent Database objects). To prevent data loss and inconsistency state we should implement finalizer to Pods with mandatory CMS approve before killing process (please look at YDBOPS-6003).
api/v1alpha1/storage_types.go
Outdated
State string `json:"state"` | ||
Conditions []metav1.Condition `json:"conditions,omitempty"` | ||
State constants.ClusterState `json:"state"` | ||
ConnectedDatabases []ConnectedDatabase `json:"connectedDatabases,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YDB as multitenant system should manage hundreds or thousands databases. Should we store information about all of this in single Storage object? It may unreadable for engineer whom describe this object. Also it would be additional load to k8s apiserver and etcd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I did not think of scaling there, you are right. I think I made a mistake when I implemented the push
model - I mean, when the Database pushes the changes about itself to the Storage, every database downloads, reads and sends back the whole list of databases connected to its Storage.
This can be mitigated by moving to pull
model, where the storage controller would list the databases in the cluster once per some timeout, say, 10s. Or I could investigate watch
api and support some caching in the way that kopf
does.
But I still think this knowledge is necessary for a Storage (to prevent pause or deletion when databases are active, please look at YDBOPS-8652), and it is not something that should be queried from a webhook. However, if I don't come up with a better solution, I might impement your List
suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not something that should be queried from a webhook
Can you explain why? I think webhook is the right place where this logic should be. Also we already using k8s client implementation inside monitoring admission webhook (ref code)
api/v1alpha1/storage_webhook.go
Outdated
if r.Spec.Pause == PausedState { | ||
for _, database := range storage.Status.ConnectedDatabases { | ||
if database.State == DatabaseReady { | ||
return fmt.Errorf("can not set Pause for Storage which has running Databases: %s", database.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main goal of this feature it's ability to force and fast delete all Storage pods. There is no need here to check databases and enable additional dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need additional triage from @artgromov here. Should we really kick the Storage pods from under a working Database? Wouldn't it lead to a lot of undefined behaviours after the Storage comes back online? Maybe I just misunderstood the task, if we don't have to guarantee anything, sure, I can remove this check
internal/controllers/storage/sync.go
Outdated
storage *resources.StorageClusterBuilder, | ||
) (bool, ctrl.Result, error) { | ||
r.Log.Info("running step checkStorageFrozen for Storage") | ||
if storage.Status.State != StorageFrozen && storage.Spec.Pause == FrozenState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we should set a new type of state object when disabling Reconcile? This information relates to ydb-operator functions, not to the state of the YDB Storage. In my opinion, it is better only to use a special constant name to describe the paused condition field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue it IS important for the user to see that his Storage\Database is Frozen. I thought about this state as just about a convenience feature for the user, and it's essentially free. Otherwise, the user might forget to un-set the Frozen parameter, then be surprised because operator has not reacted to any changes for the past N hours. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer using check in webhook to verify that we editing CR resources only when state Frozen
set to false
. That also prevent changes and should be clear to see. I don't like idea of saving previous state of object in .status
field and trying to revert to this state when unfroze process. I believe that Reconcile and Sync function should be completely execute with minimum conditional operators
We have decided:
and natural flow pending -> initializing -> ... -> ready. Additionally, as for deletion protection, I will implement the List operation based on the indexer - should not require passing much info in the claster in the backgroud and should keep webhook blazingly fast |
Hey! The suggestions have been implemented, the tests have been adjusted. The only thing NOT implemented is the deletion protection logic (if Storage has active Databases, the deletion of this Storage is not allowed). I had some skill issues implementing that using a watcher. Since this is not the core issue of this PR, let me bring that in a small follow-up PR a bit later. One more thing - since we still CAN NOT run tests with webhooks enabled (I honestly don't remember the reason, will have to look into it deeper), I have tested the webhook functionality manually. Since there is a lot of new logic about |
While at it, I made some small adjustments to logging here and here - I have removed some redundant logging. This error is logged when the SelfCheck is not connected to the Storage\Database yet (meaning that until Storage\Database are Ready, the operator spams logs like no tomorrow). I have simply reduced logging the same thing 3 times to logging it once, a bit reducing the spam factor. |
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Pull request type
Please check the type of change your PR introduces:
What is the current behavior?
Issue Number: YDBOPS-8460
What is the new behavior?
Storage and Database CRDs are getting a new
.spec.Pause
field with possible valuesPaused | Running | Frozen
.When Database's
.spec.pause
is set, StatefulSet is deleted, but Database object is still persisted.When Database's
.spec.pause
is unset (fromPaused
toRunning
), StatefulSet is recreated, tenant is not re-initialized because the state of the tenant is persistent on the drivesStorage behaves in the same way, except it is NOT possible to pause a Storage which has active (non-paused) databases (as I see it, this does not make much sense)
Storage and Database resources can only be deleted when the Status is Paused
When Storage and Database have
.spec.Pause
equal toFrozen
, the reconcile loop of the operator for the particular object is effectively off. Operator will no longer create\modify\delete resources. Notice that does not apply to manual modifications to Pods, which are under control of StatefulSet controller.Added e2e-tests to cover the above functionality
Other information
internal/controllers/constants.go
package has been introduced because access to condition names and Storage\Database state names is necessary in many different places now (in webhooks, tests, in monitoring controllers, in the code fromresource
folder, and instorage
`database` controllers themselves, phew).