OpenEBS Documentation

What is OpenEBS?

OpenEBS turns any storage available to Kubernetes worker nodes into Local or Replicated Kubernetes Persistent Volumes. It helps application and platform teams deploy Kubernetes stateful workloads that require fast, highly durable, reliable, and scalable Container Native Storage.

OpenEBS is also a leading choice for NVMe-based storage deployments.

OpenEBS was originally built by MayaData and donated to the Cloud Native Computing Foundation.

Why do users prefer OpenEBS?

According to the OpenEBS adoption stories, the top reasons users choose OpenEBS are:

OpenEBS provides consistency across all Kubernetes distributions, both on-premises and in the cloud.
OpenEBS with Kubernetes increases developer and platform SRE productivity.
OpenEBS is easier to use than other solutions. It is trivial to set up, install, and configure.
OpenEBS has excellent community support.
OpenEBS is completely open source and free.

What does OpenEBS do?

OpenEBS manages the storage available on each of the Kubernetes nodes and uses that storage to provide Local or Replicated Persistent Volumes to Stateful workloads.

[Figure: comparison of OpenEBS data engines]

In the case of Local Volumes:

OpenEBS can create persistent volumes from sub-directories on hostpaths, locally attached storage, sparse files, or an existing LVM or ZFS stack.
The local volumes are directly mounted into the Stateful Pod, without any added overhead from OpenEBS in the data path, decreasing latency.
OpenEBS provides additional tooling for local volumes for monitoring, backup/restore, disaster recovery, snapshots when backed by LVM or ZFS stack, capacity-based scheduling, and more.

In the case of Replicated Volumes:

OpenEBS Replicated Storage creates an NVMe target accessible over TCP, for each persistent volume.
The Stateful Pod writes the data to the NVMe-TCP target that synchronously replicates the data to multiple nodes in the cluster. The OpenEBS engine itself is deployed as a pod and orchestrated by Kubernetes. When the node running the Stateful pod fails, the pod will be rescheduled to another node in the cluster and OpenEBS provides access to the data using the available data copies on other nodes.
OpenEBS Replicated Storage is developed with durability and performance as design goals. It efficiently manages the compute (hugepages and cores) and storage (NVMe Drives) to provide fast block storage.
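
The replicated write path described above can be sketched as a toy model. This is not OpenEBS code: the `ReplicatedVolume` class, its methods, and the node names are hypothetical, and the sketch ignores the NVMe-TCP transport and replica rebuild entirely. It only illustrates the two properties the text names: a write completes once every healthy replica holds the data, and a read can be served from any surviving copy after a node fails.

```python
class ReplicatedVolume:
    """Toy model of a synchronously replicated volume (hypothetical, not OpenEBS code)."""

    def __init__(self, replica_nodes):
        # Each replica node holds a full copy of the volume: block index -> bytes.
        self.replicas = {node: {} for node in replica_nodes}
        self.failed = set()

    def write(self, block, data):
        # Synchronous replication: the write is acknowledged only after
        # every healthy replica has stored the block.
        healthy = [n for n in self.replicas if n not in self.failed]
        if not healthy:
            raise IOError("no healthy replica available")
        for node in healthy:
            self.replicas[node][block] = data
        return len(healthy)

    def fail_node(self, node):
        # Mark a replica node as unavailable.
        self.failed.add(node)

    def read(self, block):
        # Any healthy replica can serve the read because each holds all blocks.
        for node, blocks in self.replicas.items():
            if node not in self.failed:
                return blocks[block]
        raise IOError("no healthy replica available")


volume = ReplicatedVolume(["node-a", "node-b", "node-c"])
acks = volume.write(0, b"hello")
volume.fail_node("node-a")
data_after_failure = volume.read(0)
print(acks, data_after_failure)  # → 3 b'hello'
```

In this sketch a block written while a replica is down never reaches that replica; a real engine would need a rebuild step to bring the replica back in sync, which is left out here.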

note

OpenEBS contributors prefer to call Distributed Block Storage volumes Replicated Volumes, to avoid confusion with traditional distributed block storage, for the following reasons:

Distributed block storage tends to shard the data blocks of a volume across many nodes in the cluster. Replicated volumes persist all the data blocks of a volume on a single node and, for durability, replicate the entire volume to other nodes in the cluster.
While accessing volume data, distributed block storage depends on metadata hashing algorithms to locate the node where a block resides, whereas replicated volumes can access the data from any of the nodes where the data is persisted (a.k.a. replica nodes).
Replicated volumes have a lower blast radius compared to traditional distributed block storage.
Replicated volumes are designed for Cloud Native stateful workloads that require a large number of volumes with capacity that can typically be served from a single node as opposed to a single large volume with data sharded across multiple nodes in the cluster.
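
The difference between the two placement models can be illustrated with a small simulation. Everything here is an assumption made for illustration: the node names, the hash-modulo placement rule, the block count, and the replica count of three are hypothetical and do not reflect how OpenEBS or any particular distributed storage system places data. The sketch counts how many volumes have data on a given node under each model.

```python
import hashlib

NODES = ["node-1", "node-2", "node-3", "node-4", "node-5", "node-6"]


def stable_hash(text):
    # Deterministic hash (Python's built-in hash() is randomized per process).
    return int(hashlib.sha256(text.encode()).hexdigest(), 16)


def sharded_nodes(volume, num_blocks):
    # Sharded model: each block is placed on a node chosen by hashing its ID,
    # so one volume usually touches many nodes.
    return {NODES[stable_hash(f"{volume}/{i}") % len(NODES)] for i in range(num_blocks)}


def replicated_nodes(volume, replica_count):
    # Replicated model: the whole volume lives on `replica_count` nodes,
    # each holding every block.
    start = stable_hash(volume) % len(NODES)
    return {NODES[(start + i) % len(NODES)] for i in range(replica_count)}


def volumes_touching(placement, node):
    # Volumes with at least one block (or full copy) on the given node.
    return sorted(v for v, nodes in placement.items() if node in nodes)


volumes = [f"vol-{i}" for i in range(12)]
sharded = {v: sharded_nodes(v, num_blocks=64) for v in volumes}
replicated = {v: replicated_nodes(v, replica_count=3) for v in volumes}

print("sharded:   ", len(volumes_touching(sharded, "node-1")), "of", len(volumes))
print("replicated:", len(volumes_touching(replicated, "node-1")), "of", len(volumes))
```

In the replicated model a volume that does have a copy on the failed node still has complete copies elsewhere, so the count above is an upper bound on the volumes that notice the failure at all.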

OpenEBS Data Engines and Control Plane are implemented as microservices, deployed as containers, and orchestrated by Kubernetes itself. Importantly, OpenEBS data engines are implemented in user space, allowing OpenEBS to run on any Kubernetes platform and to use any type of storage available to Kubernetes worker nodes. An added advantage of being a completely Kubernetes-native solution is that administrators and developers can interact with and manage OpenEBS using the tooling already available for Kubernetes, such as kubectl, Helm, Prometheus, and Grafana.

Local Volumes

Local Volumes are accessible only from a single node in the cluster. Pods using a local volume have to be scheduled on the node where the volume is provisioned. Local volumes are typically preferred for distributed workloads like Cassandra, MongoDB, Elastic, etc., which have high availability built into them.
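
The scheduling constraint can be pictured with a toy filter. This is not the Kubernetes scheduler or OpenEBS code: the function `schedulable_nodes`, the volume names, and the node names are hypothetical. It shows only the rule stated above, that a pod can run only on the node that holds each of its local volumes.

```python
def schedulable_nodes(pod_volumes, volume_location, cluster_nodes):
    """Return nodes where a pod can run, given where its local volumes live.

    A toy filter (hypothetical): each local volume pins the pod to the
    single node that holds it.
    """
    candidates = set(cluster_nodes)
    for volume in pod_volumes:
        candidates &= {volume_location[volume]}
    return sorted(candidates)


cluster_nodes = ["node-a", "node-b", "node-c"]
volume_location = {"cassandra-data-0": "node-b", "cassandra-data-1": "node-c"}

print(schedulable_nodes(["cassandra-data-0"], volume_location, cluster_nodes))
# → ['node-b']
print(schedulable_nodes(["cassandra-data-0", "cassandra-data-1"], volume_location, cluster_nodes))
# → []
```

The second call returns an empty list because the two volumes live on different nodes, which is why each replica of a distributed application typically gets its own pod and its own local volume.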

Replicated Volumes

Replicated Volumes, as the name suggests, have their data synchronously replicated to multiple nodes, so they can sustain node failures. Replication can also be set up across availability zones, which helps applications move across availability zones.

Replicated Volumes also support enterprise storage features such as snapshots, clones, and volume expansion. Replicated Volumes are a preferred choice for Stateful workloads like Percona/MongoDB, Jira, GitLab, etc.

info

Depending on the type of storage attached to your Kubernetes worker nodes and the requirements of your workloads, you can select from Local Storage or Replicated Storage.

Quickstart Guide

Installing OpenEBS in your cluster is as simple as running a few kubectl or helm commands. Refer to our Quickstart guide for more information.

Community Support via Slack

OpenEBS has a vibrant community that can help you get started. If you have further questions or want to learn more about OpenEBS, join the OpenEBS community on Kubernetes Slack. If you are already signed up, head to our discussions in the #openebs channel.