In the ever-evolving landscape of container orchestration, Kubernetes (K8s) has emerged as the gold standard for managing and scaling containerized applications. At the heart of every K8s cluster lies a critical component known as etcd. etcd is a distributed key-value store that stores and manages all of the cluster's configuration data, ensuring the system's reliability and consistency.
While K8s provides a robust platform for deploying and managing applications, the need to safeguard the etcd data cannot be overstated. This is where the importance of taking regular backups comes into play.
In this article, we'll look at the essentials of etcd backup in Kubernetes and why it's crucial for the stability and recoverability of your cluster.
The Relationship between Kubernetes and etcd
At the core of Kubernetes is etcd, an open-source distributed key-value store that acts as Kubernetes' primary database for storing configuration data and ensuring cluster consistency.
Etcd serves as the single source of truth, storing information about the cluster's state, configuration, and secrets. Kubernetes components, including the API server, controller manager, and scheduler, rely heavily on etcd to synchronize and manage containerized workloads across the cluster.
This tight integration makes etcd indispensable in maintaining the stability and reliability of a Kubernetes cluster, underlining the need for regular backups to safeguard this vital component.
Why is it crucial to take a backup of your Kubernetes cluster?
Taking regular backups of etcd in the Kubernetes cluster is crucial for several reasons, as it ensures the reliability, recoverability, and security of your K8s cluster. Here are key points explaining why regular etcd backups are essential:
- Data Recovery: In the event of data loss or cluster-wide failures, etcd backups serve as a lifeline to restore your K8s cluster to a previously known state. This minimizes downtime and ensures business continuity.
- Configuration History: etcd stores the current configuration and state of your K8s cluster. Keeping regular backups gives you a point-in-time record of that state, enabling you to trace configuration changes and troubleshoot issues over time.
- Rollback and Versioning: Etcd backups enable you to roll back to previous cluster configurations or versions, which is essential for testing new configurations or reverting to a stable state in case of issues with updates or changes.
Before you learn how to take a backup of the etcd cluster, ensure you have the following prerequisites:
- A Kubernetes Cluster using Kubeadm
For demo purposes, I used the Killercoda Kubernetes playground.
To communicate with etcd, you’ll need etcdctl, a command-line utility for interacting with the etcd database. It comes with the Kubeadm cluster by default.
etcdctl supports two versions of the etcd server's API. Releases of etcdctl older than etcd 3.4 default to version 2 of the API when making server calls, while newer releases default to version 3. In version 2, some operations are either undefined or have different arguments.
Next, you will tell etcdctl to use the V3 API, which is required for the snapshot functionality.
Set ETCDCTL_API to version 3
To make etcdctl use the V3 API, you can either set the environment variable with each call, as in the following commands:
$ ETCDCTL_API=3 etcdctl snapshot save ...
$ ETCDCTL_API=3 etcdctl snapshot restore ...
or export it once for the entire terminal session:
$ export ETCDCTL_API=3
$ etcdctl snapshot save ...
$ etcdctl snapshot restore ...
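The two forms differ only in how long the variable lives. The following is a minimal sketch of that difference, using `sh -c` as a stand-in child process so that it runs on any machine; the final check assumes `etcdctl` may or may not be installed where you try it.

```shell
# Start from a clean slate so the demonstration is unambiguous.
unset ETCDCTL_API

# Per-command form: the variable is visible only to that one child process.
per_call=$(ETCDCTL_API=3 sh -c 'echo "${ETCDCTL_API:-unset}"')
after_call=${ETCDCTL_API:-unset}
echo "during the call: ${per_call}, afterwards: ${after_call}"

# Session form: every later command started from this shell inherits it.
export ETCDCTL_API=3
in_session=$(sh -c 'echo "${ETCDCTL_API:-unset}"')
echo "after export: ${in_session}"

# If etcdctl is installed, ask it which versions it reports.
if command -v etcdctl >/dev/null 2>&1; then
  etcdctl version
fi
```

The per-command form is convenient in scripts, where you may not want to change the environment of everything that runs afterwards.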
How to Back Up your Kubernetes etcd Data
To take a backup of the etcd database, you run the following command:
$ etcdctl snapshot save
To run this operation, you’ll need to pass a few certificate flags (arguments), which are mandatory for authenticating with the etcd server. This is because you must authenticate with the etcd server before it will expose its sensitive data. The authentication scheme is called Mutual TLS (mTLS).
To learn more about the flags, run:
$ etcdctl snapshot save -h
The output of the above command should look like this:
You’ll need four important arguments to successfully back up etcd:
- --cacert
- --cert
- --key
- --endpoints (Optional)
Let’s look into these arguments, what they are, and why you should pass them.
1. --cacert
This provides the path to the Certificate Authority (CA) certificate. The CA certificate is used to verify the authenticity of the TLS certificate sent to etcdctl by the etcd server. The server's certificate must be signed by the CA. Creating the CA is one of the tasks you need to do when building a cluster. Kubeadm does it automatically.
2. --cert
This is the path to the TLS certificate that etcdctl sends to the etcd server. The etcd server will verify that this certificate is also signed by the same CA. Certificates of this type contain a public key that can be used to encrypt data. The public key is used by the server to encrypt data being sent back to etcdctl during the authentication steps.
3. --key
This is the path to the private key that is used to decrypt data sent to etcdctl by the etcd server during the authentication steps. The key is only used by the etcdctl process. It is never sent to the server.
4. --endpoints (optional)
The --endpoints argument on etcdctl is used to tell it where to find the etcd server. If you are running the command on the same host where the etcd service is running and there is only one instance of etcd, then you do not need to provide this argument, as its default value already points at the local etcd instance.
If your etcd service is running on a different port, you need to provide that port number instead of the default one.
If your etcd service is running on a remote host, you need to pass that host's address and port with --endpoints.
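Before running the backup, it can help to confirm that the three files are readable and to assemble the endpoint explicitly. The sketch below is an illustration only, not a feature of etcdctl: `missing_files` and `etcd_endpoint` are hypothetical helper names, the host `10.0.0.10` is a placeholder, port `2379` is assumed to be the client port (it is etcd's usual one, but your cluster may differ), and the certificate paths are the kubeadm locations this article uses in its final backup command.

```shell
# Print every path from the argument list that is not a readable file.
missing_files() {
  for f in "$@"; do
    [ -r "$f" ] || echo "$f"
  done
}

# Build an https endpoint from a host and an optional port (assumed default: 2379).
etcd_endpoint() {
  echo "https://$1:${2:-2379}"
}

# kubeadm certificate locations; adjust them if your cluster differs.
PKI=/etc/kubernetes/pki/etcd
missing=$(missing_files "$PKI/ca.crt" "$PKI/server.crt" "$PKI/server.key")
if [ -n "$missing" ]; then
  echo "not readable on this host:"
  echo "$missing"
fi

# Placeholder remote host; the result is what you would pass to etcdctl.
echo "--endpoints $(etcd_endpoint 10.0.0.10)"
```

Running the check first turns a confusing TLS error into a plain list of the files that are missing.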
Where to find the values of these arguments?
etcd runs as a pod in the Kubernetes namespace called kube-system. You can describe that pod, and you will be able to see all the arguments and their values.
$ kubectl describe -n kube-system pod etcd-controlplane
As this contains a lot of information that we don't need right now, we can use the grep command to extract only what we need.
$ kubectl describe -n kube-system pod etcd-controlplane | grep -i file
As you can see, all of these certificates are located in /etc/kubernetes/pki/etcd, so you can also find them directly on the controlplane node.
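On a kubeadm cluster, the same values can usually be read from the etcd static pod manifest on the control plane node as well. The sketch below assumes the manifest lives at `/etc/kubernetes/manifests/etcd.yaml` and lists each server flag in the form `- --flag=value`; `etcd_flag_value` is a hypothetical helper written for this example, not part of kubectl or etcdctl.

```shell
# Print the value of one etcd server flag from a static pod manifest.
# $1 = manifest path, $2 = flag name without the leading dashes.
etcd_flag_value() {
  sed -n "s/^[[:space:]]*-[[:space:]]*--$2=\(.*\)\$/\1/p" "$1" | head -n 1
}

# Assumed kubeadm location; override by setting MANIFEST beforehand.
MANIFEST=${MANIFEST:-/etc/kubernetes/manifests/etcd.yaml}
if [ -r "$MANIFEST" ]; then
  # Map the server's flags onto the etcdctl arguments they correspond to.
  echo "--cacert $(etcd_flag_value "$MANIFEST" trusted-ca-file)"
  echo "--cert   $(etcd_flag_value "$MANIFEST" cert-file)"
  echo "--key    $(etcd_flag_value "$MANIFEST" key-file)"
else
  echo "no readable manifest at $MANIFEST"
fi
```

Because the pattern requires the flag name to follow the list dash directly, asking for `cert-file` does not pick up the peer certificate flag by accident.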
The Final backup command will be:
$ ETCDCTL_API=3 etcdctl snapshot save \
    --cacert /etc/kubernetes/pki/etcd/ca.crt \
    --cert /etc/kubernetes/pki/etcd/server.crt \
    --key /etc/kubernetes/pki/etcd/server.key \
    /opt/etcd-backup.db
/opt/etcd-backup.db is the path for storing etcd backup data.
You should see output similar to this
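A single fixed file name leaves room for only one backup. One common refinement, sketched here as an assumption rather than something this walkthrough requires, is to put a timestamp in the file name and then inspect the result with `etcdctl snapshot status`. `snapshot_path` is a hypothetical helper, and `/opt` simply mirrors the directory used above.

```shell
# Build a timestamped snapshot path inside the given directory (default: /opt).
snapshot_path() {
  echo "${1:-/opt}/etcd-backup-$(date +%Y%m%d-%H%M%S).db"
}

SNAPSHOT=$(snapshot_path)
echo "snapshot will be written to ${SNAPSHOT}"

# Run the real backup only where etcdctl and the certificates are available.
if command -v etcdctl >/dev/null 2>&1 && [ -r /etc/kubernetes/pki/etcd/server.key ]; then
  ETCDCTL_API=3 etcdctl snapshot save \
    --cacert /etc/kubernetes/pki/etcd/ca.crt \
    --cert /etc/kubernetes/pki/etcd/server.crt \
    --key /etc/kubernetes/pki/etcd/server.key \
    "${SNAPSHOT}"
  # Summarize the saved snapshot (hash, revision, key count, size).
  ETCDCTL_API=3 etcdctl snapshot status "${SNAPSHOT}" --write-out=table
fi
```

Checking the status right after saving is a cheap way to catch a truncated or empty file before you depend on it.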
Restoring from a backup
Normally you will restore this to another directory, and then point the etcd service at the new location. For restores, the certificate and endpoints arguments are not required, as we are only creating files in directories and not talking to the etcd API, so the only argument required is --data-dir to tell etcdctl where to put the restored files.
$ etcdctl snapshot restore -h
You can pass any path as the value of the --data-dir argument.
The final restore command will be:
$ ETCDCTL_API=3 etcdctl snapshot restore \
    --data-dir /var/lib/etcd-from-backup \
    /opt/etcd-backup.db
The above command will output the following:
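The restore only creates files; etcd keeps using its old directory until it is pointed at the new one. On kubeadm clusters this is commonly done by changing the hostPath of the etcd data volume in the static pod manifest, after which the kubelet recreates the pod. The sketch below assumes the manifest is `/etc/kubernetes/manifests/etcd.yaml` and that the current data directory is `/var/lib/etcd`; `repoint_data_dir` is a hypothetical helper that edits whatever file you pass it.

```shell
# Rewrite the hostPath of etcd's data directory in a manifest file.
# $1 = manifest path, $2 = old data dir, $3 = new data dir.
repoint_data_dir() {
  scratch=$(mktemp)
  # Only lines of the form "path: <old dir>" change; mountPath lines are left alone.
  sed "s|^\([[:space:]]*path: \)$2\$|\1$3|" "$1" > "$scratch" && cat "$scratch" > "$1"
  rm -f "$scratch"
}

# Assumed kubeadm location; override by setting MANIFEST beforehand.
MANIFEST=${MANIFEST:-/etc/kubernetes/manifests/etcd.yaml}
if [ -w "$MANIFEST" ]; then
  repoint_data_dir "$MANIFEST" /var/lib/etcd /var/lib/etcd-from-backup
  grep -n "etcd-from-backup" "$MANIFEST"
else
  echo "manifest not writable here: $MANIFEST"
fi
```

The scratch copy is created outside the manifests directory on purpose, since the kubelet may try to treat stray files in that directory as pod definitions.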
This article described how you can take a backup of etcd in the Kubernetes cluster and restore it safely to avoid data loss and cluster-wide failures.
There is much more to learn about Kubernetes and etcd.