Configuring scale bounds

You can bound autoscaling in both directions by setting a lower and an upper limit on the number of replicas for each revision.

Lower bound

This value controls the minimum number of replicas that each revision should have. Knative will attempt to never have fewer than this number of replicas at any point in time.

  • Global key: n/a
  • Per-revision annotation key: autoscaling.knative.dev/minScale
  • Possible values: integer
  • Default: 0 if scale-to-zero is enabled and class KPA is used, 1 otherwise

NOTE: For more information about scale-to-zero configuration, see the documentation on Configuring scale to zero.

Example:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "3"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go

Upper bound

This value controls the maximum number of replicas that each revision should have. Knative will attempt to never have more than this number of replicas running, or in the process of being created, at any point in time.

  • Global key: n/a
  • Per-revision annotation key: autoscaling.knative.dev/maxScale
  • Possible values: integer
  • Default: 0, which means unlimited

Example:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "3"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
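
Both bounds can also be set on the same revision. The following is a minimal sketch that reuses the sample service from the examples above; the values "1" and "5" are illustrative assumptions only, and it assumes the lower bound does not exceed the upper bound.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Annotation values are strings, so the integers are quoted.
        # Lower bound: try to keep at least 1 replica (illustrative value).
        autoscaling.knative.dev/minScale: "1"
        # Upper bound: try to keep at most 5 replicas (illustrative value).
        autoscaling.knative.dev/maxScale: "5"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
```

With a non-zero minScale the revision is not expected to scale to zero, and the non-zero maxScale replaces the default of unlimited replicas.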