
Turn a Kubernetes deployment into a Knative service

Discover great flexibility, easier deployments, and auto-scaling features by using Knative services.

Editor's note: This article introduces Knative on Kubernetes, the open source platform underneath Red Hat OpenShift Serverless. For more information on using OpenShift Serverless, please see this high-level overview or the official documentation.

Knative is a framework that runs on Kubernetes and makes it easier to perform common tasks such as deploying applications, scaling them up and down, routing traffic, and running canary deployments. Knative lets operators turn fewer knobs, reducing the cost of operations when managing serverless environments and workloads (e.g., batch processing and event-driven applications). In this article, we will first deploy an application on Kubernetes, then turn the deployment into a Knative service to show the simpler experience.

Deploy a minimal application on Kubernetes

To deploy an application on Kubernetes, you'll need both a deployment manifest and a service manifest.

A Kubernetes deployment manifest is essentially the application's definition to the platform. Specifically, it tells Kubernetes how pods should be created (e.g., how much CPU or memory to allocate). Before creating the deployment, you first need to create the application's image. Please follow this guide to create and upload your application image to an image registry.
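
If you are building the image yourself, the flow looks something like the following (a sketch, assuming Docker and a Docker Hub account; <your-user> is a placeholder for your registry username):

$ # Build the image from the Dockerfile in the current directory
$ docker build -t docker.io/<your-user>/helloworld-go .
$ # Push it to the registry so the cluster can pull it
$ docker push docker.io/<your-user>/helloworld-go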

The minimal deployment below includes information such as the application image (spec.template.spec.containers[i].image), any environment variables you'd like to use in the application (spec.template.spec.containers[i].env), and how many pods to create (spec.replicas):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world-deployment
  labels:
    app: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  replicas: 2
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: docker.io/taragu/helloworld-go
        ports:
        - name: http
          containerPort: 8080
        env:
        - name: TARGET
          value: "Go Sample v1"

To create a deployment, save the above text into a file, then use kubectl apply to apply the file:

$ kubectl apply -f path_to_deployment.yaml

We can then see that two pods have been created and are running:

$ kubectl get pods
NAME                                      READY   STATUS    RESTARTS   AGE
hello-world-deployment-75c44bd567-5gcnm   1/1     Running   0          30s
hello-world-deployment-75c44bd567-v2ltn   1/1     Running   0          30s

As shown in the deployment YAML above, the application definition also spells out how other pods in the cluster can reach the pods in this deployment (spec.template.spec.containers[i].ports), and it repeats the app: hello-world label throughout.

The other artifact we need to deploy a minimal application is a Kubernetes service (not to be confused with a Knative service). By default, a Kubernetes deployment is only accessible within the cluster. To make the container reachable from outside the Kubernetes virtual network, we need to create a Kubernetes service. A LoadBalancer service is the standard way to expose an application externally: it creates an external load balancer that points to the service. Below is a basic service to expose our deployment:

apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  selector:
    app: hello-world
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
    name: http

You can use kubectl to create this service:

$ kubectl apply -f path_to_k8s_svc.yaml

If we put the deployment YAML and the service YAML side by side, you can see that spec.ports[i].targetPort in the service matches spec.template.spec.containers[i].ports[i].containerPort in the deployment. Port 80, specified in spec.ports[i].port, is the port we use to access the service from outside the cluster.

To access the service, first find the external IP of the LoadBalancer service. It appears in the status.loadBalancer.ingress[0].ip field of the service definition. Get the value from this field with the following command:

$ export EXTERNALIP=$(kubectl get service hello-world -ocustom-columns=:.status.loadBalancer.ingress[0].ip --no-headers)
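
Note that on some providers (AWS, for example), the load balancer publishes a hostname rather than an IP. In that case, read the hostname field instead:

$ export EXTERNALIP=$(kubectl get service hello-world -ocustom-columns=:.status.loadBalancer.ingress[0].hostname --no-headers)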

Then curl the external IP with the port number:

$ curl $EXTERNALIP:80
Hello Go Sample v1!

How to create a service on Knative

Knative is a framework that builds on Kubernetes and a service mesh such as Istio to support deploying and serving serverless applications and functions. It provides a much simpler way of deploying and scaling Kubernetes applications. It offers out-of-the-box request-based scaling and scales to zero for idle applications.

You can either use a managed Knative offering (e.g., managed OpenShift Serverless, IBM Cloud Code Engine, Google Cloud Run) or follow this guide to install the Knative Serving component on your own Kubernetes cluster.

You can deploy a Knative service through kn, a Knative client that allows deploying an application without a single line of YAML. You can install kn by following this guide. To create a Knative service through kn, pass in the application image, environment variables, and other configurations you would specify in a YAML:

$ kn service create helloworld-go --image docker.io/taragu/helloworld-go --env TARGET="Go Sample v1"
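
Once the service is created, kn can also report its status. For example, assuming kn is pointed at your cluster, the following should print just the service's URL:

$ kn service describe helloworld-go -o url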

As an alternative, you can still use kubectl to deploy a Knative service. To create the Knative service YAML, all you have to do is take spec.template.spec.containers (minus ports) from the Kubernetes deployment YAML and put it under spec.template.spec of the Knative YAML:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    spec:
      containers:
        - image: docker.io/taragu/helloworld-go
          env:
            - name: TARGET
              value: "Go Sample v1"

You can create this Knative service with kubectl the same way you create any Kubernetes resources:

$ kubectl apply -f path_to_ksvc.yaml

We can see that one pod has been created and is running:

$ kubectl get pods
NAME                                             READY   STATUS    RESTARTS   AGE
helloworld-go-cdqm7-deployment-75b4dc8c9-xxzcw   2/2     Running   0          11s

Notice that in the second column, the number of ready containers is 2, instead of 1 as in the Kubernetes deployment. Besides the user container, Knative also deploys a sidecar container, queue-proxy (you can see more information about the two containers with kubectl describe pod <pod-name>). The queue-proxy collects metrics on the user container and caps the number of concurrent requests allowed per container.
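
You can confirm the pair by listing the container names in the pod directly; by default, Knative names them user-container and queue-proxy:

$ kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].name}'
user-container queue-proxy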

To access this service, find the external IP of the Istio ingress gateway service. It appears in the status.loadBalancer.ingress[0].ip field of the service definition; get the value with the following command:

$ export INGRESSIP=$(kubectl get svc istio-ingressgateway -n istio-system -ocustom-columns=:.status.loadBalancer.ingress[0].ip --no-headers)

Then curl the ingress external IP:

$ curl -H "Host: helloworld-go.default.example.com" $INGRESSIP

If you are using a managed Knative offering, you can use a real URL instead of having to discover the ingress IP.
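
Either way, you can look up the URL Knative assigned to the service (ksvc is the short name Knative registers for its service resource); the URL column of the output shows it:

$ kubectl get ksvc helloworld-go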

As we can see, Knative eliminates the need to specify the repeating labels and the port information. Furthermore, instead of manually setting the number of replicas as in the Kubernetes deployment YAML, we can leverage Knative's autoscaling capability to do request-based scaling. Kubernetes does offer out-of-the-box CPU-based autoscaling, and users can configure scaling on custom metrics, but Knative's request-based autoscaling works without any extra setup.

You can read about the different configurations for Knative autoscaling here. For example, you can set the "hard" concurrency limit (a strictly enforced upper bound on the number of requests allowed to flow to a replica at any one time) with the containerConcurrency spec:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    spec:
      containerConcurrency: 1
      containers:
        - image: docker.io/taragu/helloworld-go
          env:
            - name: TARGET
              value: "Go Sample v1"

You can update this Knative service with kubectl the same way you update any Kubernetes resources:

$ kubectl apply -f path_to_ksvc.yaml

When you send bursts of requests, the number of pods increases:

$ for i in `seq 10`; do
    curl -H "Host: helloworld-go.default.example.com" $INGRESSIP&
done
$ kubectl get pods
NAME                                              READY   STATUS    RESTARTS   AGE
helloworld-go-vtth2-deployment-55f9dd79fd-47b4q   2/2     Running   0          14s
helloworld-go-vtth2-deployment-55f9dd79fd-7lm64   2/2     Running   0          14s
helloworld-go-vtth2-deployment-55f9dd79fd-fhz2j   2/2     Running   0          14s
helloworld-go-vtth2-deployment-55f9dd79fd-ghg76   2/2     Running   0          16s
helloworld-go-vtth2-deployment-55f9dd79fd-pp6jf   2/2     Running   0          14s
helloworld-go-vtth2-deployment-55f9dd79fd-pxg4r   2/2     Running   0          14s
helloworld-go-vtth2-deployment-55f9dd79fd-thsrx   2/2     Running   0          14s
helloworld-go-vtth2-deployment-55f9dd79fd-vj77c   2/2     Running   0          16s
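
For comparison, instead of a hard cap, you can give the autoscaler a "soft" concurrency target that it aims for on average (a sketch; the target of 10 is illustrative):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Soft target: the autoscaler aims for about 10 in-flight
        # requests per replica but may briefly exceed it
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: docker.io/taragu/helloworld-go
          env:
            - name: TARGET
              value: "Go Sample v1"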

Another feature that comes with Knative out of the box is scale to zero: if the application does not receive any traffic for a certain period, Knative terminates all of its pods. Use kubectl get pods -w to watch the pods terminate:

$ kubectl get pods -w
NAME                                              READY   STATUS    RESTARTS   AGE
helloworld-go-vtth2-deployment-55f9dd79fd-47b4q   2/2     Running   0          61s
helloworld-go-vtth2-deployment-55f9dd79fd-7lm64   2/2     Running   0          61s
helloworld-go-vtth2-deployment-55f9dd79fd-fhz2j   2/2     Running   0          61s
helloworld-go-vtth2-deployment-55f9dd79fd-ghg76   2/2     Running   0          63s
helloworld-go-vtth2-deployment-55f9dd79fd-pp6jf   2/2     Running   0          61s
helloworld-go-vtth2-deployment-55f9dd79fd-pxg4r   2/2     Running   0          61s
helloworld-go-vtth2-deployment-55f9dd79fd-thsrx   2/2     Running   0          61s
helloworld-go-vtth2-deployment-55f9dd79fd-vj77c   2/2     Running   0          63s
helloworld-go-vtth2-deployment-55f9dd79fd-ghg76   2/2   Terminating   0     68s
helloworld-go-vtth2-deployment-55f9dd79fd-fhz2j   2/2   Terminating   0     66s
helloworld-go-vtth2-deployment-55f9dd79fd-pxg4r   2/2   Terminating   0     66s
helloworld-go-vtth2-deployment-55f9dd79fd-47b4q   2/2   Terminating   0     66s
helloworld-go-vtth2-deployment-55f9dd79fd-thsrx   2/2   Terminating   0     68s
helloworld-go-vtth2-deployment-55f9dd79fd-pp6jf   2/2   Terminating   0     68s
helloworld-go-vtth2-deployment-55f9dd79fd-7lm64   2/2   Terminating   0     70s
helloworld-go-vtth2-deployment-55f9dd79fd-vj77c   2/2   Terminating   0     74s
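
If scale to zero is undesirable for a latency-sensitive application, you can keep a minimum number of replicas warm with an annotation on the revision template (a sketch; note that older Knative releases spell the annotation autoscaling.knative.dev/minScale):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Keep at least one replica running at all times
        autoscaling.knative.dev/min-scale: "1"
    spec:
      containers:
        - image: docker.io/taragu/helloworld-go
          env:
            - name: TARGET
              value: "Go Sample v1"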

Summary

Knative simplifies application deployment. You can deploy an application without a single line of YAML with kn, and there are fewer configuration options that users need to set compared to configuring a Kubernetes deployment and service.

Knative also simplifies application scaling, with out-of-the-box scale-to-zero and request-based scaling options. It has many other capabilities that we haven't explored in this article, such as traffic splitting and building event-driven applications with Knative Eventing. If you are interested in learning more about Knative, please check out the official docs site as well as the GitHub repositories.

