Getting Started with Kubernetes Persistent Volumes

by Gilad David Maayan, February 26th, 2022

What Are Kubernetes Persistent Volumes?

By default, storage inside a container in a Kubernetes cluster is ephemeral. When a container shuts down, any data written to its filesystem is lost. This is fine for stateless applications, but to run stateful applications such as databases in Kubernetes, you need persistent storage.


To enable persistent storage that survives the shutdown of a container or pod, Kubernetes provides a persistent volumes subsystem. It provides API objects to let you allocate and manage storage resources persistently.

You can set up persistent storage using the following API resources:

  • PersistentVolumes (PVs): each PV represents a storage resource available within the cluster. Unlike ephemeral volumes, the lifecycle of a PV is independent of any pod using it, which is what makes the storage persistent. Administrators can provision PVs manually, and Kubernetes can also provision them dynamically using storage classes.
  • PersistentVolumeClaims (PVCs): each PVC represents a request for storage. A PVC can consume only a PV that meets its size and access mode requirements. If a suitable PV is available, it is bound to the PVC. Otherwise, Kubernetes tries to provision a PV dynamically; if it cannot, the PVC stays unbound until a matching PV appears.


For more background see this blog about Kubernetes Persistent Volumes.

Kubernetes Storage Concepts

Kubernetes Storage Class

Kubernetes lets administrators define StorageClasses that are abstracted from the storage implementation. Users can then request a StorageClass by name in each PVC without specifying every storage requirement themselves.


Administrators can create several StorageClasses, each with its own provisioner, parameters, and reclaimPolicy. This makes it possible to offer classes that differ in performance (such as HDD or SSD), cost, and intended applications.

The provisioner lets you tie a StorageClass to specific backends, such as AWS EBS or Azure Files, and a storage endpoint. Additional parameters are available according to the chosen provisioner to help you describe the volumes for each StorageClass using parameters like type, zone, server, or path.
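
As a rough illustration of how these fields fit together, here is a minimal StorageClass sketch. The name fast-ssd is hypothetical, and the provisioner and type values assume the AWS EBS CSI driver is installed in the cluster; other backends use different provisioner names and parameters.

```yaml
# Hypothetical StorageClass for SSD-backed volumes (illustrative name).
# Assumes the AWS EBS CSI driver is installed in the cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com    # ties the class to a storage backend
parameters:
  type: gp3                     # provisioner-specific parameter
reclaimPolicy: Delete           # what happens to the PV once its claim is deleted
volumeBindingMode: WaitForFirstConsumer
```

A PVC can then ask for this class by setting storageClassName: fast-ssd, without repeating any of the backend details.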

Kubernetes StatefulSet

A Kubernetes StatefulSet is a workload API object for managing stateful applications. It lets you define a set of pods with unique network identifiers to be deployed and scaled in a specific order. StatefulSets help manage persistent storage across a Kubernetes cluster.


Each pod in a StatefulSet is assigned a unique identity composed of an ordinal, a stable network identity, and stable storage. This identity sticks to the pod, and the StatefulSet maintains it across scheduling and rescheduling. Learn more in this blog about Kubernetes networking.
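
The sketch below shows one way a StatefulSet can request stable storage for each pod through volumeClaimTemplates. All names (web, data) are illustrative, it assumes a headless Service named web already exists, and because no storageClassName is set it also assumes the cluster has a default StorageClass.

```yaml
# Illustrative StatefulSet; names are hypothetical.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web              # assumes a headless Service named "web" exists
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:         # one PVC is created per pod from this template
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

With this spec the pods are named web-0 and web-1, and each gets its own claim (data-web-0 and data-web-1) that stays associated with it across rescheduling.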

Kubernetes CSI Plugins

Container Storage Interface (CSI) plugins make third-party storage available to your Kubernetes cluster, bringing persistent storage to containerized applications. When a PVC requests a StorageClass whose provisioner is a CSI driver, the plugin creates the requested volume and mounts it for the pod.

Lifecycle of PVs and PVCs

A PVC is a request for a PV that exists as a storage resource within a Kubernetes cluster. A PVC is tied to a matching PV to ensure storage remains persistent. As a result, the lifecycle of PVs and PVCs is usually tightly connected.

Provisioning

Kubernetes provides the following types of provisioning for persistent storage:


  • Static: cluster administrators manually create PV resources and make them available to cluster users. Once a PV exists in the Kubernetes API, users can create a PVC so that a pod can consume it.
  • Dynamic: the cluster automatically provisions a PV to match a PVC, based on predefined StorageClasses. Dynamic provisioning typically occurs when none of the statically provisioned PVs match a PVC.
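
To make dynamic provisioning concrete, here is a minimal PVC sketch. It assumes an administrator has already defined a StorageClass named fast-ssd (a hypothetical name); the claim name is illustrative as well.

```yaml
# Illustrative claim that relies on dynamic provisioning.
# Assumes a StorageClass named "fast-ssd" exists (hypothetical name).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  storageClassName: fast-ssd    # the class whose provisioner creates the PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

No PV manifest is needed here: if no existing PV of that class satisfies the claim, the class's provisioner is expected to create one.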

Binding

A control loop watches for new PVCs. When a user creates a PVC, the control loop searches for a matching PV that meets the PVC's requirements. The claim always receives at least the amount of storage it requested, but the bound PV may be larger than the request. Here is how this process works:


  • If the control loop finds a matching PV, it binds it to the PVC. The binding is exclusive: a one-to-one mapping between the PV and the PVC.
  • If the control loop cannot find a PV that meets the PVC requirements for access mode and storage size, it dynamically provisions a new PV according to the StorageClass and binds it to the PVC.
  • If no matching PV exists and none can be provisioned dynamically, the PVC remains unbound until a matching PV becomes available.
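
Matching can also be narrowed with labels. The sketch below is a hedged example of a PVC that only binds to PVs carrying a type: local label; the claim name is hypothetical, and it assumes an administrator has statically created at least one PV with that label and the manual storage class.

```yaml
# Illustrative claim that restricts binding to labeled PVs.
# Assumes a statically provisioned PV with the label "type: local" exists.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: labeled-claim
spec:
  storageClassName: manual
  selector:
    matchLabels:
      type: local               # only PVs with this label are candidates
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

A claim with a non-empty selector cannot have a PV provisioned for it dynamically, so if no labeled PV matches, the claim simply stays unbound.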

Using

Kubernetes pods use PVCs as volumes. The cluster inspects the claim to find the bound PV and mounts it into the pod. For volumes that support multiple access modes, the user specifies the desired mode when using the claim as a volume in a pod. Once a PVC is bound, the PV belongs to that user for as long as they need it.

Reclaiming

When a user no longer needs a volume, they can delete the PVC object through the API, which allows the resource to be reclaimed. The PV's reclaim policy then tells the cluster what to do with the volume after it has been released from its claim: retain it or delete it.
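
The reclaim policy is set on the PV itself. As a sketch, the manifest below marks a volume as Retain so its data is kept after the claim is deleted; the PV name and path are hypothetical.

```yaml
# Illustrative PV with an explicit reclaim policy; name and path are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: retained-pv
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep the volume and its data after release
  hostPath:
    path: "/mnt/retained"
```

With Retain, a released volume is not offered to new claims automatically; an administrator has to clean it up or recreate the PV first. With Delete, the cluster removes the PV object and its backing storage.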

Quick Tutorial: Working with Persistent Volumes

This tutorial is adapted from the full persistent storage tutorial in the Kubernetes documentation.


Before starting this tutorial, set up a Kubernetes cluster on your local machine with only one node. Ensure that kubectl is installed on your machine and communicating successfully with the cluster. You can easily create a one-node cluster with Minikube.

Step 1: Create a File On the Node

Because we are working with persistent storage, we first create a static file on the node. Later steps show that a pod can read this file through a persistent volume, and that the file persists even after the pod is shut down.


Open a shell to your node, create a directory named /mnt/data, and inside it create an index.html file with the text “Hello World Persistent Volume”.

Step 2: Create a Persistent Volume

To create a persistent volume, open a text editor and copy-paste the YAML code below.


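apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv-volume
  labels:
    type: local
spec:
  # The PVC must request the same storageClassName to bind to this PV
  storageClassName: manual
  capacity:
    storage: 6Gi
  accessModes:
    - ReadWriteOnce
  # hostPath uses a directory on the node; suitable for single-node testing only
  hostPath:
    path: "/mnt/data"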


Save the YAML file as my-pv-volume.yaml in your local directory. Deploy the persistent volume by running this command:

kubectl create -f my-pv-volume.yaml

Step 3: Create a Persistent Volume Claim

Create a PVC specification by opening your text editor and copying the YAML code below:


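apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pv-claim
spec:
  # Matches the storageClassName of the PV created in step 2
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      # The request can be smaller than the PV's 6Gi capacity
      storage: 2Gi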


Save the file in your local directory as my-pv-claim.yaml. Deploy it using this command:

kubectl apply -f my-pv-claim.yaml

Step 4: Create a Pod and Mount the PVC as a Volume

Now it gets interesting—let’s create a pod that mounts this PVC so it can make use of our persistent volume. Copy the following code to your text editor:


apiVersion: v1
kind: Pod
metadata:
  name: my-pv-pod
spec:
  volumes:
    # The pod refers to the claim by name, not to the PV
    - name: my-pv-storage
      persistentVolumeClaim:
        claimName: my-pv-claim
  containers:
    - name: my-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        # nginx serves files from this directory
        - mountPath: "/usr/share/nginx/html"
          name: my-pv-storage


Notice that the pod references the PVC, not the PV directly. Kubernetes is responsible for binding the PVC to a PV that meets its requirements. In our case this is simple because there is only one PV.


Save the pod spec in your local directory as my-pv-pod.yaml. Deploy the pod by running this command:


kubectl apply -f my-pv-pod.yaml


Ensure the pod is running:


kubectl get pod my-pv-pod

Step 5: Verify that the Pod Has Access to Persistent Storage

Let’s see if our pod can access the file we saved in persistent storage at the beginning of the tutorial. The important thing here is that any pod you create using this specification will have access to the same storage—even after you shut down or delete the pod, the file will persist.


Run this command to bash into the container:


kubectl exec -it my-pv-pod -- /bin/bash


Once you are running in the root shell of your container, run these commands:


apt update

apt install curl

curl http://localhost/


You should see the content of your index.html file:


Hello World Persistent Volume


Congrats! You just created a pod that has access to a Persistent Volume via a Persistent Volume Claim.
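
To see for yourself that the file outlives any single pod, one option is to create a second pod that mounts the same claim and prints the file. The sketch below is not part of the original tutorial: the pod name, container name, busybox image, and /data mount path are all assumptions chosen for illustration. It relies on the one-node cluster used here, since ReadWriteOnce allows several pods on the same node to use the volume.

```yaml
# Illustrative second pod that reads the same persistent volume.
# Pod name, image, and mount path are assumptions, not part of the tutorial.
apiVersion: v1
kind: Pod
metadata:
  name: my-pv-reader
spec:
  volumes:
    - name: my-pv-storage
      persistentVolumeClaim:
        claimName: my-pv-claim  # same claim as my-pv-pod
  containers:
    - name: reader
      image: busybox
      # Print the file once, then keep the container alive for inspection
      command: ["sh", "-c", "cat /data/index.html && sleep 3600"]
      volumeMounts:
        - mountPath: /data
          name: my-pv-storage
          readOnly: true
```

After applying this manifest, the container's log should contain the text you wrote to index.html in step 1, whether or not my-pv-pod is still running.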

Conclusion

In this article, I covered the basics of Kubernetes persistent storage, introduced the concepts of Persistent Volumes and Persistent Volume Claims, and showed how to set up a quick demo of a pod that mounts a PVC:


1. Create a one-node cluster using Minikube.

2. Create an HTML file on the node.

3. Create a Persistent Volume YAML file and deploy it using kubectl.

4. Create a Persistent Volume Claim YAML file and deploy it using kubectl.

5. Create a Pod YAML which mounts the PVC as a volume, and deploy it using kubectl.

6. Bash into the container and use curl to view the HTML file from step 2—this proves that your container has access to a persistent volume.


I hope this will be useful as you take your first steps in Kubernetes storage management.