- Summary
- Motivation
- Design
- Proposed roadmap
- Graduation Criteria
- Implementation History
- Alternatives
- Appendix
- Move the Ingress resource from the current API group (extensions.v1beta1) to networking.v1beta1.
- Graduate the Ingress API with bug fixes to GA.
The extensions API group is considered deprecated. Ingress is the last non-deprecated API in that group. All other types have been migrated to other permanent API groups. Such an API group migration takes three minor version cycles (~9 months) to ensure compatibility. This means any API group movement should be started sooner rather than later.
The Ingress resource has been in a beta state for a long time (first commit was in Fall 2015). While the interface is not perfect, there are many independent implementations in active use.
We have a couple of choices (and non-choices, see appendix) for the current resource:
1. We can delete the current resource from extensions.v1beta1 in anticipation that an improved API can replace it.
2. We can copy the API as-is (or with minor changes) into networking.v1beta1, preserving/converting existing data (following the same approach taken with all other extensions.v1beta1 resources). This will allow us to start the cleanup of the extensions API group. This also prepares the API for GA.
Option 1 does not seem realistic in a short-term time frame (a new API will need to be taken through design, alpha/beta/GA phases). At the same time, there are enough users that the existing API cannot be deleted outright.
In terms of moving the API towards GA, the API itself has been available in beta for so long that it has attained de facto GA status through usage and adoption (both by users and by load balancer / ingress controller providers). Abandoning it without a full replacement is not a viable approach. It is clearly a useful API and captures a non-trivial set of use cases. At this point, it seems more prudent to declare the current API as something the community will support as a V1, codifying its status, while working on either a V2 Ingress API or an entirely different API with a superset of features.
A detailed list of the changes being proposed is given in the Design section below.
- Move Ingress to a permanent API group. (status: implemented)
- Make changes to bring the Ingress API into a GA-ready state. (status: proposal).
- Clean up the Ingress API (fix ambiguities, API spec bugs).
- Promote commonly supported annotations to proper API fields.
- Create a suite of conformance tests to validate existing implementations.
- Make Ingress GA. (status: proposal).
This section describes the API fixes proposed for GA.
- Add path as a prefix and make regex support optional. The current spec states that the path is a regular expression, but support for the flavor defined in the spec varies across providers. In addition, regex matching is not supported by many popular provider implementations.
- Fix API field naming: spec.backend should be called spec.defaultBackend.
- Hostname wildcard matching. We currently allow for creation of *.foo.com and this seems to be a commonly supported host match, but this is not part of the spec.
- Formalize the Ingress class annotation into a field and an associated IngressClass resource.
- Add support for non-Service Backend types.
These are features that were discussed but are not part of this proposal:
- (POST GA) Specify healthcheck behavior and be able to configure the healthcheck path and timeout.
- (POST GA) Improve the Ingress status field to be able to include additional information. The status currently contains only the provisioned IP address(es) of the load balancer.
The current APIs state that the path is a regular expression using the POSIX IEEE Std 1003.1 standard. However, this is not consistent with the syntax supported by any of the common proxy vendors:
Platform | Syntax |
---|---|
nginx | PCRE |
haproxy | PCRE/PCRE2 |
envoy | ECMAscript |
skipper | re2 |
Among cloud providers, there are also inconsistent levels of support for regular expression-based path matching. See the load-balancer documentation for AWS, GCP, Azure, Skipper.
It is also the case that our documentation (and most Ingress providers) treats the path match as a prefix match. For example, a narrow interpretation of the specification would require all paths to end with ".*$".
A detailed discussion of this issue can be found here.
- Explicitly state the match mode of the path.
- Support the existing implementation-specific behavior.
- Support a portable prefix match and future expansion of behavior.
Add a field ingress.spec.rules.http.paths.pathType to indicate the desired interpretation of the meaning of the path:
type HTTPIngressPath struct {
...
// Path to match against. The interpretation of Path depends on
// the value of PathType.
//
// Defaults to "/" if empty.
//
// +Optional
Path string
// PathType determines the interpretation of the Path
// matching. PathType can be one of the following values:
//
// Exact - matches the URL path exactly.
//
// Prefix - matches based on a URL path prefix split
// by '/'. [insert description of semantics described below]
//
// ImplementationSpecific - interpretation of the Path
// matching is up to the IngressClass. Implementations
// are not required to support ImplementationSpecific matching.
//
// +Optional
PathType string
...
}
V1 validation
Note: changes to default values are permitted between API versions (reference).
The PathType field will default to a value of ImplementationSpecific to provide backwards compatibility.
For Prefix and Exact paths:
- Let [p_1, p_2, ..., p_n] be the list of Paths for a specific host.
- Every Path p_i must be syntactically valid:
  - Must begin with the '/' character (relative paths are not allowed by RFC-7230).
  - Must not contain consecutive '/' characters (e.g. /foo///, //).
- For prefix paths, a trailing '/' character in the Path is ignored, e.g. /abc and /abc/ specify the same match.
- If there is more than one potential match:
  - Exact match is preferred to a Prefix match.
  - For multiple prefix matches, the longest Path p_i will be the matching path.
  - If an ImplementationSpecific match exists in the spec, then the preference depends on the implementation.
- If there is no matching path, then the defaultBackend for the host will be used.
- If there is not a match for the host, then the overall defaultBackend for the Ingress will be selected.
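The two syntactic rules for Exact and Prefix paths could be checked roughly as follows. This is a minimal sketch, not the actual API server validation: the function name validatePath and its error messages are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// validatePath applies the two syntactic rules for Exact and Prefix paths:
// the path must begin with '/' and must not contain consecutive '/'.
// The name and the error text are illustrative assumptions.
func validatePath(path string) error {
	if !strings.HasPrefix(path, "/") {
		return fmt.Errorf("path %q must begin with '/'", path)
	}
	if strings.Contains(path, "//") {
		return fmt.Errorf("path %q must not contain consecutive '/' characters", path)
	}
	return nil
}

func main() {
	for _, p := range []string{"/foo", "/foo/bar/", "foo", "/foo///", "//"} {
		fmt.Printf("%-12q valid=%t\n", p, validatePath(p) == nil)
	}
}
```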
Path must be exactly the same as the request path.
Matching is done on a path element by element basis. A path element refers to one of the labels in the path split by the '/' separator. A request is a match for path p if p is an element-wise prefix of the request path. Note that if the last element of the path is a substring of the last element in the request path, it is not a match (e.g. /foo/bar matches /foo/bar/baz, but does not match /foo/barbaz).
Interpretation of the implementation-specific behavior is defined by the associated IngressClass. Implementations are not required to support this type of match. If the match type is not supported, then the controller MAY raise this error as an asynchronous Event to the user.
Kind | Path(s) | Request path(s) | Matches? |
---|---|---|---|
Prefix | / | (all paths) | Yes |
Exact | /foo | /foo | Yes |
Exact | /foo | /bar | No |
Exact | /foo | /foo/ | No |
Exact | /foo/ | /foo | No |
Prefix | /foo | /foo, /foo/ | Yes |
Prefix | /foo/ | /foo, /foo/ | Yes |
Prefix | /aaa/bb | /aaa/bbb | No |
Prefix | /aaa/bbb | /aaa/bbb | Yes |
Prefix | /aaa/bbb/ | /aaa/bbb | Yes, ignores trailing slash |
Prefix | /aaa/bbb | /aaa/bbb/ | Yes, matches trailing slash |
Prefix | /aaa/bbb | /aaa/bbb/ccc | Yes, matches subpath |
Prefix | /aaa/bbb | /aaa/bbbxyz | No, does not match string prefix |
Prefix | /, /aaa | /aaa/ccc | Yes, matches /aaa prefix |
Prefix | /, /aaa, /aaa/bbb | /aaa/bbb | Yes, matches /aaa/bbb prefix |
Prefix | /, /aaa, /aaa/bbb | /ccc | Yes, matches / prefix |
Prefix | /aaa | /ccc | No, uses default backend |
Mixed | /foo (Prefix), /foo (Exact) | /foo | Yes, prefers Exact |
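The Exact and Prefix rows in the table can be reproduced with a small matcher. The sketch below is illustrative only: PathRule, splitElements, prefixMatches and selectPath are hypothetical names, "longest" is assumed to mean the largest number of path elements, and ImplementationSpecific paths are skipped because their behavior is left to the implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// PathRule is a hypothetical, simplified stand-in for HTTPIngressPath.
type PathRule struct {
	Path     string
	PathType string // "Exact" or "Prefix"
}

// splitElements splits a path into its '/'-separated elements. Leading and
// trailing '/' are dropped, so "/abc" and "/abc/" yield the same elements.
func splitElements(path string) []string {
	trimmed := strings.Trim(path, "/")
	if trimmed == "" {
		return nil
	}
	return strings.Split(trimmed, "/")
}

// prefixMatches reports whether rulePath is an element-wise prefix of reqPath.
func prefixMatches(rulePath, reqPath string) bool {
	rule, req := splitElements(rulePath), splitElements(reqPath)
	if len(rule) > len(req) {
		return false
	}
	for i, elem := range rule {
		if req[i] != elem {
			return false
		}
	}
	return true
}

// selectPath returns the rule that should serve reqPath, or nil when nothing
// matches (the default backend then applies). An Exact match wins over any
// Prefix match; among Prefix matches the one with the most elements wins.
func selectPath(rules []PathRule, reqPath string) *PathRule {
	var best *PathRule
	bestLen := -1
	for i := range rules {
		r := &rules[i]
		switch r.PathType {
		case "Exact":
			if r.Path == reqPath {
				return r
			}
		case "Prefix":
			if n := len(splitElements(r.Path)); prefixMatches(r.Path, reqPath) && n > bestLen {
				best, bestLen = r, n
			}
		}
	}
	return best
}

func main() {
	rules := []PathRule{
		{Path: "/", PathType: "Prefix"},
		{Path: "/aaa", PathType: "Prefix"},
		{Path: "/aaa/bbb", PathType: "Prefix"},
	}
	for _, req := range []string{"/aaa/bbb", "/aaa/ccc", "/ccc"} {
		if r := selectPath(rules, req); r != nil {
			fmt.Printf("%s is served by %s\n", req, r.Path)
		}
	}
}
```

Comparing whole elements rather than raw strings is what keeps /aaa/bbb from matching /aaa/bbbxyz.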
These are straightforward one-to-one renames for better semantic meaning.
v1beta1 field | v1 | rationale |
---|---|---|
spec.backend | spec.defaultBackend | Explicitly mentions default |
Add comment clarifying behavior:
It is up to the controller to resolve conflicts between the defaultBackends of multiple Ingress definitions that are served from the same VIP, if this is possible.
Most platforms support wildcards for host names, e.g. syntax such as *.foo.com matches names app1.foo.com, app2.foo.com. The current spec states that spec.rules.host must be an exact FQDN match of a network host.
Add support for a single wildcard * as the first label in the hostname.
The IngressRule.Host specification would be changed to:
Host can be "precise", which is a domain name without the terminating dot of a network host (e.g. "foo.bar.com"), or "wildcard", which is a domain name prefixed with a single wildcard label (e.g. "*.foo.com").
Requests will be matched against the Host field in the following way:
If Host is precise, the request matches this rule if the http host header is equal to Host.
If Host is a wildcard, then the request matches this rule if the http host header, with its first label removed, is equal to the suffix (removing the first label) of the wildcard rule.
- The wildcard character '*' must appear by itself as the first DNS label and matches only a single label.
  - You cannot have a wildcard label by itself (e.g. Host == "*").
For example:
- "*.foo.com" matches "bar.foo.com" because they share the same suffix "foo.com".
- "*.foo.com" does not match "aaa.bbb.foo.com" as the wildcard only matches a single label.
- "*.foo.com" does not match "foo.com", as the wildcard must match a single label.
Note: label refers to a "DNS label", i.e. the strings separated by the dots "." in the domain name.
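The precise and wildcard matching rules can be sketched as a single comparison function. This is an illustration under simplifying assumptions: hostMatches is a hypothetical name, the rule host is assumed to have already passed validation, and the comparison is case-sensitive and ignores any port in the Host header, which a real controller would have to handle.

```go
package main

import (
	"fmt"
	"strings"
)

// hostMatches reports whether the request's Host header matches ruleHost.
// ruleHost is assumed to be already validated (a bare "*" is not allowed).
func hostMatches(ruleHost, requestHost string) bool {
	if !strings.HasPrefix(ruleHost, "*.") {
		// Precise host: the header must equal the rule host.
		return ruleHost == requestHost
	}
	// Wildcard host: remove the first label from the request host and
	// compare the remainder with the rule's suffix. The wildcard must
	// stand for exactly one non-empty label.
	first, rest, found := strings.Cut(requestHost, ".")
	return found && first != "" && rest == strings.TrimPrefix(ruleHost, "*.")
}

func main() {
	for _, h := range []string{"bar.foo.com", "aaa.bbb.foo.com", "foo.com"} {
		fmt.Printf("*.foo.com vs %s: %t\n", h, hostMatches("*.foo.com", h))
	}
}
```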
As this is strictly additive, this could be punted to post-GA to reduce the size of the change.
The kubernetes.io/ingress.class annotation is required for selecting between multiple Ingress providers. As support for this annotation is universal, this concept should be promoted to an actual field.
Although promoting the annotation as it is currently defined as an opaque string is the most direct path, that precludes any future enhancements to the concept.
With that in mind, we propose creating a new Class field in IngressSpec to take the place of the existing annotation. This new field will be immutable. To ensure that this can be safely round tripped between API versions, this new field will also be added to previous API versions.
type IngressSpec struct {
...
// Class is the name of the IngressClass cluster resource. This defines which
// controller(s) will implement the resource.
// +optional
Class *string
...
}
The kubernetes.io/ingress.class annotation will be separate from the new Class field. As the annotation was never formally defined or validated, we cannot safely convert the value of this annotation to the new Class field. Use of this annotation will be considered formally deprecated with the v1 Ingress release.
When both the class field and annotation are set, the annotation will take priority. The controller MAY emit a warning event if the user sets conflicting (different) values for the annotation and field.
To ensure backwards compatibility, Ingresses may be created without a class field or annotation. Although this is not a recommended state, it must still be supported by the API. Implementations of this API may choose to ignore Ingresses without a class specified. In certain cases, such as when an Ingress class is marked as default, it may make sense for Ingress implementations to implement Ingresses that do not have a class specified.
When new Ingresses are created, if both the class field and annotation are set, an error will be returned that only the class field should be used. If the class field is set, a corresponding IngressClass resource must also exist.
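From a controller's point of view, the precedence between the annotation and the field on existing objects might look like the sketch below. The Ingress type here is a hypothetical, trimmed-down stand-in, and effectiveClass and hasConflict are illustrative names rather than part of any real client library.

```go
package main

import "fmt"

const classAnnotation = "kubernetes.io/ingress.class"

// Ingress is a hypothetical stand-in carrying only what this sketch needs.
type Ingress struct {
	Annotations map[string]string
	Class       *string // mirrors the proposed IngressSpec.Class field
}

// effectiveClass returns the class a controller should act on and whether
// any class was specified. The annotation takes priority over the field.
func effectiveClass(ing Ingress) (string, bool) {
	if v, ok := ing.Annotations[classAnnotation]; ok {
		return v, true
	}
	if ing.Class != nil {
		return *ing.Class, true
	}
	return "", false
}

// hasConflict reports whether annotation and field are both set to different
// values, the case in which a controller MAY emit a warning event.
func hasConflict(ing Ingress) bool {
	v, ok := ing.Annotations[classAnnotation]
	return ok && ing.Class != nil && v != *ing.Class
}

func main() {
	field := "acme-lb"
	ing := Ingress{
		Annotations: map[string]string{classAnnotation: "legacy-lb"},
		Class:       &field,
	}
	class, ok := effectiveClass(ing)
	fmt.Println(class, ok, hasConflict(ing))
}
```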
Additionally, we propose adding an IngressClass resource to provide additional data about the Ingress class. This will be a non-namespaced resource. The IngressClass resource is an optional way to provide additional configuration for a specific class of Ingress. This allows us to evolve the API to express concepts such as levels of service associated with a given Ingress controller.
The name of this IngressClass resource will be tied to any Ingresses with the same value for the class field. An IngressClass resource can exist without any Ingresses referencing it, and an Ingress can have a class value that does not correspond with an IngressClass resource.
// IngressClass represents the class of the Ingress, referenced by the Ingress
// Spec.
type IngressClass struct {
metav1.TypeMeta
metav1.ObjectMeta
// Spec is the desired state of the IngressClass.
// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
// +optional
Spec IngressClassSpec
}
// IngressClassSpec provides information about the class of an Ingress.
type IngressClassSpec struct {
// Controller is responsible for handling this class. This should be
// specified as a domain-prefixed path, e.g. "acme.io/ingress-controller".
// This allows for different "flavors" that are controlled by the same
// controller. For example, you may have different Parameters for the same
// implementing controller.
Controller string
// Parameters is a link to a custom resource configuration for the
// controller. This is optional if the controller does not require extra
// parameters.
// +optional
Parameters *api.TypedLocalObjectReference
}
Following the pattern established by StorageClass, an annotation can be set on an IngressClass to indicate that the IngressClass should be considered default. This ingressclass.kubernetes.io/is-default-class annotation will accept a boolean value. When set to true, new Ingress resources without a class specified will be assigned this Ingress class. If more than one IngressClass resource has this annotation, the admission controller will return an error in response to Ingress creation attempts that don't have a class specified.
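The default-class rule can be sketched as a lookup over the existing IngressClass objects. This is not the admission controller's actual code: the IngressClass type is reduced to a name and annotations, and defaultClassName is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

const defaultClassAnnotation = "ingressclass.kubernetes.io/is-default-class"

// IngressClass is a simplified stand-in holding only name and annotations.
type IngressClass struct {
	Name        string
	Annotations map[string]string
}

// defaultClassName returns the class to assign to a new Ingress that has no
// class specified. It returns "" when no class is marked default and an
// error when more than one class carries the annotation set to "true".
func defaultClassName(classes []IngressClass) (string, error) {
	name := ""
	for _, c := range classes {
		if c.Annotations[defaultClassAnnotation] != "true" {
			continue
		}
		if name != "" {
			return "", errors.New("more than one IngressClass is marked as default")
		}
		name = c.Name
	}
	return name, nil
}

func main() {
	classes := []IngressClass{
		{Name: "acme-lb", Annotations: map[string]string{defaultClassAnnotation: "true"}},
		{Name: "other-lb"},
	}
	name, err := defaultClassName(classes)
	fmt.Println(name, err)
}
```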
The Ingress resource is an L7 description of a composite set of services. It currently supports only Kubernetes Services as backends. However, there are many use cases where a portion of the HTTP requests could be routed to a different kind of resource. For example, serving content from object storage (S3, GCS) is a commonly requested feature.
At the same time, we do not expect to enumerate all possible backends that could arise, nor do we expect that naming of the resources will be uniform in schema, parameters etc. Similarly, many of the resources will be implementation-specific.
Add a field to the IngressBackend struct with an object reference:
type IngressBackend struct {
// Only one of the following fields may be specified.
// Service references a Service as a Backend. This is specially
// called out as it is required to be supported AND to reduce
// verbosity.
// +optional
Service *ServiceBackend
// Resource is an ObjectRef to another Kubernetes resource in the namespace
// of the Ingress object.
// +optional
Resource *v1.TypedLocalObjectReference
}
// ServiceBackend references a Kubernetes Service as a Backend.
type ServiceBackend struct {
// Name is the name of the referenced service. The service must exist in
// the same namespace as the Ingress object.
// +optional
Name string
// Port of the referenced service. If unspecified and Name is
// non-empty, the Service must expose a single port.
// +optional
Port ServiceBackendPort
}
// ServiceBackendPort is the service port being referenced.
type ServiceBackendPort struct {
// Number is the numerical port number (e.g. 80) on the Service.
Number int
// Name is the name of the port on the Service.
Name string
}
Support for non-Service type Resources is implementation-specific. Implementations MUST support Kubernetes Service. Support for other types is OPTIONAL.
Ingress routing everything to foo-app:
kind: Ingress
spec:
  class: acme-lb
  backend:
    service:
      name: foo-app
      port:
        number: 80
Ingress routing everything to the ACME storage bucket:
kind: Ingress
spec:
  class: acme-lb
  backend:
    resource:
      apiGroup: acme.io/networking
      kind: storage-bucket
      name: foo-bucket
Invalid configuration (uses both resource and service):
kind: Ingress
spec:
  class: acme-lb
  backend:
    service:
      name: foo-app
      port:
        number: 80
    resource: # INVALID!
      apiGroup: acme.io/networking
      kind: storage-bucket
      name: foo-bucket
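The "only one of" rule shown by these examples could be enforced with a check along the following lines. The types are simplified, hypothetical stand-ins for the proposed structs, and rejecting a backend with neither field set is an assumption of this sketch, not something the proposal states.

```go
package main

import (
	"errors"
	"fmt"
)

// ServiceBackend and ResourceRef are simplified stand-ins for the proposed
// ServiceBackend and TypedLocalObjectReference types.
type ServiceBackend struct {
	Name string
	Port int
}

type ResourceRef struct {
	APIGroup, Kind, Name string
}

type IngressBackend struct {
	Service  *ServiceBackend
	Resource *ResourceRef
}

// validateBackend rejects backends that set both Service and Resource.
// Rejecting an empty backend is an assumption made for this sketch.
func validateBackend(b IngressBackend) error {
	switch {
	case b.Service != nil && b.Resource != nil:
		return errors.New("only one of service or resource may be specified")
	case b.Service == nil && b.Resource == nil:
		return errors.New("one of service or resource must be specified")
	}
	return nil
}

func main() {
	svc := &ServiceBackend{Name: "foo-app", Port: 80}
	bucket := &ResourceRef{APIGroup: "acme.io/networking", Kind: "storage-bucket", Name: "foo-bucket"}
	fmt.Println(validateBackend(IngressBackend{Service: svc}))
	fmt.Println(validateBackend(IngressBackend{Service: svc, Resource: bucket}))
}
```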
As a sketch, an object bucket can be named with a CRD. NOTE: this example is non-normative and for illustration purposes only.
type Bucket struct {
metav1.TypeMeta
metav1.ObjectMeta
Spec BucketSpec
}
type BucketSpec struct {
Bucket string
Path string
}
The associated IngressBackend referencing the bucket would be:
backend:
  resource:
    apiGroup: bucket.io
    kind: bucket
    name: my-bucket
- Copy the Ingress API to networking.k8s.io/v1beta1 (preserving existing data and round-tripping with the extensions Ingress API, following the approach taken for all other extensions/v1beta1 resources).
- Develop a set of planned changes and GA graduation criteria with sig-network (intent is to target a minimal set of bugfixes and non-breaking changes)
- Announce extensions/v1beta1 Ingress as deprecated (and announce plan for GA)
- Copy existing Ingress tests, changing the resource type to the new group. Keep existing tests as is.
- Update API server to persist in networking.k8s.io/v1beta1 (kubernetes/kubernetes#77139)
- Update in-tree controllers, examples, and clients to target networking.k8s.io/v1beta1 (kubernetes/kubernetes#77617)
- Update Ingress controllers in the kubernetes org to target networking.k8s.io/v1beta1
- Update documentation to recommend new users start with networking.k8s.io/v1beta1, but existing users stick with extensions/v1beta1 until networking.k8s.io/v1 is available (kubernetes/website#14239).
- Update documentation to reference networking.k8s.io/v1beta1 (kubernetes/website#14239)
- Meet graduation criteria and promote API to networking.k8s.io/v1
- Implement API changes to GA version.
- Announce networking.k8s.io/v1beta1 Ingress as deprecated
- Update API server to persist in networking.k8s.io/v1.
- Update in-tree controllers, examples, and clients to target networking.k8s.io/v1.
- Update Ingress controllers in the kubernetes org to target networking.k8s.io/v1.
- Update documentation to reference networking.k8s.io/v1.
- Evangelize availability of v1 Ingress API to out-of-org Ingress controllers
- Remove ability to serve extensions/v1beta1 and networking.k8s.io/v1beta1 Ingress resources (preserve ability to read existing extensions/v1beta1 Ingress objects from storage and serve them via the networking.k8s.io/v1 API)
- 1.14: Ingress API exists and has parity with existing extensions/v1beta1 API
- 1.14: extensions/v1beta1 Ingress tests are replicated against networking.k8s.io
- 1.15: all in-tree use and in-org controllers switch to networking.k8s.io API group
- 1.15: documentation and examples are updated to refer to networking.k8s.io API group
networking.k8s.io/v1
- 1.17: API finalized and implemented on the branch.
- 1.XX: Ingress spec and conformance tests finalized and running against branch.
- 1.XX: Review & update Ingress documentation
- 1.XX: API changes merged into the main API, with tests from v1beta1 pointing to GA.
- 1.14: Copied Ingress API to the networking API group.
See motivation section.
- Kubecon EU 2019 sig-network meetup.
One suggestion was to move the API into a new API group, defined as a CRD. This does not work because there is no way to round-trip existing Ingress objects to a CRD-based API.
The current spec does not have any provisions to customize healthchecks for referenced backends. Many users already have a healthcheck URL that is lightweight and different from the HTTP root (i.e. /).
One obvious question that arises is why the Ingress healthcheck configuration (a) is needed and (b) is different from the current Pod readiness and liveness checks. The Ingress healthcheck represents an end-to-end check from the proxy server to the backend. The Kubelet-based service health check operates only within the VM and does not include the network path. A minor additional point is that some providers require a healthcheck to be specified as part of load balancing.
An option that has been explored is to infer the healthcheck URL from the Readiness/Liveness probes on the Pods of the Service. This method has proven to be unworkable: every Pod in a Service can have a different Readiness probe definition and therefore it's not clear which one should be used. Furthermore, the behavior is implicit and creates an action-at-a-distance relationship between the Ingress and Pod resources.
Add the following fields to IngressBackend:
type IngressBackend struct {
...
// Healthcheck defines custom healthcheck for this backend.
// +optional
Healthcheck *IngressBackendHealthcheck
}
type IngressBackendHealthcheck struct {
// HTTP defines healthchecks using the HTTP protocol.
HTTP *IngressBackendHTTPHealthcheck
}
// IngressBackendHTTPHealthcheck is a healthcheck using the HTTP protocol.
type IngressBackendHTTPHealthcheck struct {
// Host header to send when healthchecking. If empty, the host header will be
// implementation specific.
Host string
// Path to use for the HTTP healthcheck. If empty, the root '/' path will be
// used for healthchecking.
Path string
// TimeoutSeconds for the healthcheck. Failure to respond with a success code
// within TimeoutSeconds will be counted towards the FailureThreshold.
TimeoutSeconds int
// FailureThreshold is the number of consecutive failures necessary to
// indicate a backend failure.
FailureThreshold int
}
If Healthcheck is nil, then the implementation default healthcheck will be configured, healthchecking the root / path. If Healthcheck is specified, then the backend health will be checked using the parameters listed above.
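The fallback to the root path could be resolved as in the sketch below. The struct names are shortened, hypothetical versions of the proposed types, and healthcheckPath is an illustrative helper. Only the path default is modeled, since the proposal leaves the other defaults to the implementation.

```go
package main

import "fmt"

// HTTPHealthcheck and Healthcheck are trimmed-down stand-ins for the
// proposed IngressBackendHTTPHealthcheck and IngressBackendHealthcheck.
type HTTPHealthcheck struct {
	Host             string
	Path             string
	TimeoutSeconds   int
	FailureThreshold int
}

type Healthcheck struct {
	HTTP *HTTPHealthcheck
}

// healthcheckPath returns the path to probe: the configured path when one
// is set, otherwise the root "/" path.
func healthcheckPath(hc *Healthcheck) string {
	if hc == nil || hc.HTTP == nil || hc.HTTP.Path == "" {
		return "/"
	}
	return hc.HTTP.Path
}

func main() {
	custom := &Healthcheck{HTTP: &HTTPHealthcheck{Path: "/healthz", TimeoutSeconds: 2, FailureThreshold: 3}}
	fmt.Println(healthcheckPath(nil), healthcheckPath(custom))
}
```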
Note: these items are NOT the main focus of this KEP, but recorded here for reference purposes. These items came up in discussions on the KEP (roughly sorted by practicality):
- Spec path as a prefix, maybe as a new field
- Rename backend to defaultBackend or something more obvious
- Be more explicit about wildcard hostname support (I can create *.bar.com but in theory this is not supported)
- Add health-checks API
- Specify whether to accept just HTTPS or also allow bare HTTP
- Better status
- Formalize Ingress class
- Reference a secret in a different namespace? Use case: avoid copying wildcard certificates (generated with cert-manager for instance)
- Add non-required features (levels of support)
- Some way to have backends be things other than a service (e.g. a GCS bucket)
- Some way to restrict hostnames and/or URLs per namespace
- HTTP to HTTPS redirects
- Explicit sharing or non-sharing of external IPs (e.g. GCP HTTP LB)
- Affinity
- Per-backend timeouts
- Backend protocol
- Cross-namespace backends
This section contains rejected design proposals for future reference.
The safest route for specifying the regex would be to state a limited subset that can be used in a portable way. Any expressions outside of the subset will have implementation specific behavior.
Regular expression subset (derived from re2 syntax page)
Expression | description |
---|---|
. | any character |
[xyz] | character class |
[^xyz] | negated character class |
x* | 0 or more x's |
x+ | 1 or more x's |
xy | x followed by y |
`x \| y` | x or y |
(abc) | grouping |
Maintaining a regular expression subset is not worth the complexity and is likely impossible across the many implementations.