When we first deployed our backend on the Linode Kubernetes Engine (LKE) with 3 nodes and a fixed number of pods, it became harder to keep up as the number of requests increased, so we had to look for a way to automatically scale the pods up and down based on metrics such as CPU utilization. We found our solution in the Horizontal Pod Autoscaler (HPA). In the next few paragraphs, we’ll discuss how to write the configuration and deploy it to the Kubernetes cluster.
When working with k8s you may want to dynamically scale the pods up and down based on certain metrics like CPU utilization, memory usage, etc. You can achieve this in Kubernetes (K8s) using the Horizontal Pod Autoscaler. The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set, or stateful set based on CPU utilization or other metrics.
In this blog, we are going to look at how to create a deployment and a Horizontal Pod Autoscaler that scales it up and down. Let’s get started by creating a simple deployment for the nginx app:
nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.9.2-alpine
          ports:
            - containerPort: 80
          resources:
            # You must specify requests for CPU to autoscale
            # based on CPU utilization
            requests:
              cpu: "250m"
When the above configuration is applied, Kubernetes creates a deployment with 2 pods; each pod runs the nginx container and requests 250 milliCPU. Note that a CPU request is required for the HPA to work, because CPU utilization is calculated as a percentage of the requested value.
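The HPA also needs resource metrics to be available through the metrics API, which is usually provided by metrics-server. A quick way to check whether metrics are being served on your cluster (the deployment name and namespace below are the common defaults, so adjust them if yours differ):

kubectl get deployment metrics-server -n kube-system
kubectl top nodes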
Let’s apply the configuration using the following command:
kubectl apply -f nginx.yaml
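Once applied, you can verify that the deployment and both replicas are running (the label selector below matches the app: nginx label from the manifest):

kubectl get deployment nginx
kubectl get pods -l app=nginx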
We can create an HPA either with the kubectl autoscale command or with a manifest applied via kubectl apply.
To create the HPA using the kubectl autoscale command, run the following:
kubectl autoscale deployment nginx --min=2 --max=10 --cpu-percent=50
Syntax:
kubectl autoscale (-f FILENAME | TYPE NAME | TYPE/NAME) [--min=MINPODS] --max=MAXPODS [--cpu-percent=CPU]
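After running the command, you can inspect the HPA object it generated, which is handy for comparing against the file-based approach below:

kubectl get hpa nginx -o yaml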
To create the HPA using a configuration file instead, create an nginx-hpa.yaml file with the following code:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
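Then apply the manifest to create the HPA:

kubectl apply -f nginx-hpa.yaml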
Both the kubectl autoscale command and the nginx-hpa.yaml manifest create an HPA that keeps the number of nginx replicas between a minimum of 2 and a maximum of 10, scaling up when the average CPU utilization across the pods exceeds 50% and scaling back down when it falls below that target. Also note that HorizontalPodAutoscaler is available in the autoscaling/v1 and autoscaling/v2beta2 API versions (and, from Kubernetes 1.23 onward, the stable autoscaling/v2).
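To see the autoscaler in action, you can generate some load against the pods. The sketch below is only a test setup on top of the deployment created above: it exposes the deployment through a ClusterIP service (also named nginx, created just for this test) and runs a throwaway busybox pod in a request loop while you watch the HPA react. The controller roughly targets ceil(currentReplicas × currentUtilization / targetUtilization) replicas, so sustained load above 50% of the requested CPU will push the replica count up; depending on your cluster, you may need more than one load generator to cross the target.

# expose the deployment so the load generator can reach it (test-only service)
kubectl expose deployment nginx --port=80

# run a temporary pod that sends requests to the service in a loop
kubectl run load-gen --rm -it --image=busybox:1.36 --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://nginx > /dev/null; done"

# in another terminal, watch the HPA adjust the replica count
kubectl get hpa nginx --watch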
To get a list of HPA objects, run the following command:
kubectl get hpa -n default
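To see an HPA’s current metrics, conditions, and recent scaling events:

kubectl describe hpa nginx -n default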
To delete the HPA object, run the following command:
kubectl delete hpa nginx -n default
Conclusion
This way, we can scale the pods in the Kubernetes cluster up and down based on CPU metrics and make full use of all the nodes in the k8s cluster. In the next blog post, I’ll discuss how to scale the nodes in the k8s cluster up and down on the Linode Kubernetes Engine using the Linode API in Python.