|
| 1 | +# The `HTTPScaledObject` |
| 2 | + |
| 3 | +>This document reflects the specification of the `HTTPScaledObject` resource for the `v0.8.0` version. |
| 4 | +
|
| 5 | +Each `HTTPScaledObject` looks approximately like the following: |
| 6 | + |
| 7 | +```yaml |
| 8 | +kind: HTTPScaledObject |
| 9 | +apiVersion: http.keda.sh/v1alpha1 |
| 10 | +metadata: |
| 11 | +    name: xkcd |
| 12 | +spec: |
| 13 | +    hosts: |
| 14 | +    - myhost.com |
| 15 | +    pathPrefixes: |
| 16 | +    - /test |
| 17 | +    scaleTargetRef: |
| 18 | +        name: xkcd |
| 19 | +        kind: Deployment |
| 20 | +        apiVersion: apps/v1 |
| 21 | +        service: xkcd |
| 22 | +        port: 8080 |
| 23 | +    replicas: |
| 24 | +        min: 5 |
| 25 | +        max: 10 |
| 26 | +    scaledownPeriod: 300 |
| 27 | +    scalingMetric: # requestRate and concurrency are mutually exclusive |
| 28 | +        requestRate: |
| 29 | +            granularity: 1s |
| 30 | +            targetValue: 100 |
| 31 | +            window: 1m |
| 32 | +        concurrency: |
| 33 | +            targetValue: 100 |
| 34 | +``` |
| 35 | +
|
| 36 | +This document is a narrated reference guide for the `HTTPScaledObject`, and we'll focus on the `spec` field. |
| 37 | + |
| 38 | +## `hosts` |
| 39 | + |
| 40 | +These are the hosts to apply this scaling rule to. All incoming requests with one of these values in their `Host` header will be forwarded to the `Service` and port specified in the `scaleTargetRef` below, and that same `scaleTargetRef`'s workload will be scaled accordingly. |
| 41 | + |
| 42 | +## `pathPrefixes` |
| 43 | + |
| 44 | +>Default: "/" |
| 45 | + |
| 46 | +These are the paths to apply this scaling rule to. All incoming requests whose path starts with one of these prefixes will be forwarded to the `Service` and port specified in the `scaleTargetRef` below, and that same `scaleTargetRef`'s workload will be scaled accordingly. |
| 47 | + |
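As a rough illustration of how `hosts` and `pathPrefixes` fit together, the fragment below is a minimal sketch. The second host and the second prefix are hypothetical values added for the example, and the sketch assumes that a request has to match one of the hosts and one of the path prefixes to be covered by the rule.

```yaml
spec:
    hosts:
    - myhost.com
    - api.myhost.com   # hypothetical additional host
    pathPrefixes:
    - /test
    - /v1              # hypothetical additional prefix
```

Under that assumption, a request for `/test/items` with `Host: myhost.com` would fall under this rule, while a request for `/other` on the same host would not.
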
| 48 | +## `scaleTargetRef` |
| 49 | + |
| 50 | +This is the primary and most important part of the `spec` because it describes: |
| 51 | + |
| 52 | +1. The incoming host to apply this scaling rule to. |
| 53 | +2. What workload to scale. |
| 54 | +3. The service to which to route HTTP traffic. |
| 55 | + |
| 56 | +### `deployment` (DEPRECATED: removed as part of v0.9.0) |
| 57 | + |
| 58 | +This is the name of the `Deployment` to scale. It must exist in the same namespace as this `HTTPScaledObject` and shouldn't be managed by any other autoscaling system. This means that there should not be any `ScaledObject` already created for this `Deployment`. The HTTP Add-on will manage a `ScaledObject` internally. |
| 59 | + |
| 60 | +### `name` |
| 61 | + |
| 62 | +This is the name of the workload to scale. It must exist in the same namespace as this `HTTPScaledObject` and shouldn't be managed by any other autoscaling system. This means that there should not be any `ScaledObject` already created for this workload. The HTTP Add-on will manage a `ScaledObject` internally. |
| 63 | + |
| 64 | +### `kind` |
| 65 | + |
| 66 | +This is the kind of the workload to scale. |
| 67 | + |
| 68 | +### `apiVersion` |
| 69 | + |
| 70 | +This is the apiVersion of the workload to scale. |
| 71 | + |
| 72 | +### `service` |
| 73 | + |
| 74 | +This is the name of the service to route traffic to. The add-on will create autoscaling and routing components that route to this `Service`. It must exist in the same namespace as this `HTTPScaledObject` and should route to the same workload that you specified in the `name` field (or in the deprecated `deployment` field). |
| 75 | + |
| 76 | +### `port` |
| 77 | + |
| 78 | +This is the port to route to on the service that you specified in the `service` field. It should be exposed on the service and should route to a valid `containerPort` on the workload that you specified in the `name` field (or in the deprecated `deployment` field). |
| 79 | + |
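To make the relationship between `service`, `port`, and the workload's `containerPort` more concrete, here is a sketch of a Kubernetes `Service` that would line up with the `scaleTargetRef` in the example at the top of this document. The `app: xkcd` selector label and the `targetPort` value are assumptions made for illustration; use whatever labels and container port your workload actually defines.

```yaml
apiVersion: v1
kind: Service
metadata:
    name: xkcd              # referenced by scaleTargetRef.service
spec:
    selector:
        app: xkcd           # hypothetical label; must select the pods of the scaled workload
    ports:
    - port: 8080            # referenced by scaleTargetRef.port
      targetPort: 8080      # hypothetical; must be a valid containerPort on those pods
```

The `Service` has to live in the same namespace as the `HTTPScaledObject` that references it.
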
| 80 | +### `targetPendingRequests` (DEPRECATED: removed as part of v0.9.0) |
| 81 | + |
| 82 | +>Default: 100 |
| 83 | + |
| 84 | +This is the number of _pending_ (or in-progress) requests that your application needs to have before the HTTP Add-on will scale it up. Conversely, if your application has fewer than this number of pending requests, the HTTP Add-on will scale it down. |
| 85 | + |
| 86 | +For example, if you set this field to 100, the HTTP Add-on will scale your app up if it sees that there are 200 in-progress requests. On the other hand, it will scale down if it sees that there are only 20 in-progress requests. Note that it will _never_ scale your app to zero replicas unless there are _no_ requests in-progress. Even if you set this value to a very high number and only have a single in-progress request, your app will still have one replica. |
| 87 | + |
| 88 | +### `scaledownPeriod` |
| 89 | + |
| 90 | +>Default: 300 |
| 91 | + |
| 92 | +The period to wait after the last reported activity before scaling the resource back to 0. |
| 93 | + |
| 94 | +> Note: This time is measured on the KEDA side based on in-flight requests, so workloads with sparse, irregular traffic could scale to 0 unexpectedly. In those cases, we recommend extending this period to prevent that. |
| 95 | + |
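For a workload with sparse traffic, a longer period might look like the sketch below. The value `900` is purely illustrative. The sketch also assumes that the field sits directly under `spec` and that the unit is seconds, which this document does not state explicitly.

```yaml
spec:
    scaledownPeriod: 900   # hypothetical value, three times the default of 300
```
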
| 96 | + |
| 97 | +## `scalingMetric` |
| 98 | + |
| 99 | +This is the second most important part of the `spec` because it describes how the workload has to scale. This section contains 2 nested sections (`requestRate` and `concurrency`), which are mutually exclusive. |
| 100 | + |
| 101 | +### `requestRate` |
| 102 | + |
| 103 | +This section enables scaling based on the request rate. |
| 104 | + |
| 105 | +> **NOTE**: Request information is stored in memory. Aggregating over long periods (longer than 5 minutes) or using too fine a granularity (less than 1 second) could cause performance issues or increased memory usage. |
| 106 | + |
| 107 | +> **NOTE 2**: Although `window` and/or `granularity` can be updated, doing so replaces all the stored request count information. This can produce unexpected scaling behaviours until the window is populated again. |
| 108 | + |
| 109 | +#### `targetValue` |
| 110 | + |
| 111 | +>Default: 100 |
| 112 | + |
| 113 | +This is the target value for the scaling configuration. |
| 114 | + |
| 115 | +#### `window` |
| 116 | + |
| 117 | +>Default: "1m" |
| 118 | + |
| 119 | +This value defines the aggregation window for the request rate calculation. |
| 120 | + |
| 121 | +#### `granularity` |
| 122 | + |
| 123 | +>Default: "1s" |
| 124 | + |
| 125 | +This value defines the granularity of the aggregated requests for the request rate calculation. |
| 126 | + |
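Because `requestRate` and `concurrency` are mutually exclusive, a rate-based configuration sets only `requestRate`. The fragment below is a minimal sketch that spells out the default values listed above. The comment about the number of stored data points is an assumption derived from dividing the window by the granularity, not something this document confirms.

```yaml
spec:
    scalingMetric:
        requestRate:
            targetValue: 100   # default
            window: 1m         # default aggregation window
            granularity: 1s    # default; 1m / 1s suggests roughly 60 data points kept in memory (assumption)
```

Keeping `window` at 5 minutes or less and `granularity` at 1 second or more stays within the limits mentioned in the note above.
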
| 127 | +### `concurrency` |
| 128 | + |
| 129 | +This section enables scaling based on the request concurrency. |
| 130 | + |
| 131 | +> **NOTE**: This was the only scaling behaviour available before v0.8.0. |
| 132 | + |
| 133 | +#### `targetValue` |
| 134 | + |
| 135 | +>Default: 100 |
| 136 | + |
| 137 | +This is the target value for the scaling configuration. |
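
A concurrency-based configuration, by contrast, sets only `concurrency` and leaves `requestRate` out. This is a minimal sketch using the default value listed above.

```yaml
spec:
    scalingMetric:
        concurrency:
            targetValue: 100   # default
```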