As customers deploy containerized applications, the need for a common platform for both stateful and stateless workloads keeps growing. Containerized stateful applications such as MySQL, Kafka, Cassandra, and Couchbase should benefit from auto-scaling, portability, and container-level controls just as stateless applications do.
Containers natively support overlay file systems, which operate by creating layers, making containers lightweight and fast. For example, Docker Engine uses UnionFS to provide the building blocks for containers.
Containers’ layered storage implementation is designed for portability and speed. It is optimized for storing, retrieving, and transferring images across different environments. When a container is removed, all of the data written to it, including application state and logs, is removed along with it.
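The layering behavior described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: the class name `LayeredFS` and the file paths are hypothetical, layers are modeled as plain dictionaries, and no real union file system works at this level of simplicity.

```python
class LayeredFS:
    """Toy union file system: read-only image layers plus one writable container layer."""

    def __init__(self, image_layers):
        # Image layers are shared between containers and never modified (lowest first).
        self.image_layers = image_layers
        # Each container gets its own private writable layer on top.
        self.container_layer = {}

    def read(self, path):
        # Check the writable layer first, then fall through the image layers top-down.
        if path in self.container_layer:
            return self.container_layer[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # Writes never touch the image; they land in the container's own layer.
        self.container_layer[path] = data


image = [{"/etc/os-release": "base"}, {"/app/run.sh": "v1"}]

first = LayeredFS(image)
first.write("/var/log/app.log", "started")
print(first.read("/var/log/app.log"))  # the log is visible inside this container

# Removing the container discards its writable layer. A new container started
# from the same image sees the image content but none of the earlier writes.
del first
second = LayeredFS(image)
print(second.read("/app/run.sh"))
try:
    second.read("/var/log/app.log")
except FileNotFoundError:
    print("log is gone with the removed container")
```

The image layers are untouched throughout, which is what makes images cheap to share and move, and also why anything worth keeping has to live outside the container's writable layer.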
According to recent surveys, customers consistently cite persistent storage as one of the top pain points for container adoption.
Source: Cloud Foundry: Hope Versus Reality, Containers in 2016
Source: Container Market Adoption Survey 2016
Options to manage persistent data with containers
To benefit from the portability that containerization offers, the best practice is to isolate the stateful data from the container. This decouples the container lifecycle from data management and makes data durability achievable.
Host-based persistence: Host-based persistence has been adopted primarily by developers for data durability. The containers on a host depend upon the host’s local storage for persistence. The host filesystem is mapped to the container, making the host storage directly visible inside the container's mount namespace. Multiple containers may share the same volume, but the distributed application running in those containers must then coordinate access to the shared data to prevent corruption. Alternatively, a volume may be explicitly bound to a single container to prevent accidental sharing of data.
While this approach is simple, it fundamentally breaks the core premise of container portability: the container is bound to the host where the data is located. It is also less reliable, because a host failure makes the data inaccessible.
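As a concrete illustration of host-based persistence, the sketch below assembles a `docker run` command that bind-mounts a host directory into a container using the `-v host_dir:container_dir` option. The helper name, the host path, and the image name are placeholders chosen for this example, and the command is only printed, not executed.

```python
import shlex


def docker_run_with_bind_mount(image, host_dir, container_dir, read_only=False):
    """Build the argument list for a detached container with a host bind mount."""
    # -v HOST:CONTAINER maps a host directory into the container's mount namespace.
    mount = f"{host_dir}:{container_dir}"
    if read_only:
        mount += ":ro"  # optional read-only flag on the mount
    return ["docker", "run", "-d", "-v", mount, image]


# Placeholder values: data written under /var/lib/mysql inside the container
# ends up in /srv/mysql-data on this particular host.
cmd = docker_run_with_bind_mount("mysql:5.7", "/srv/mysql-data", "/var/lib/mysql")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The host path in the command is exactly what ties the container to one machine: rescheduling it elsewhere starts it against an empty or different directory.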
Shared storage-based persistence: While shared storage is typical for most workloads in the industry, its adoption for containers is still at an early stage. In this case, a volume on a shared storage device is exposed to the host, which in turn exposes it as a mount within one or more containers. This is especially useful when multiple containers need read-write access to the same directory. The hosts that have access to the shared storage are designated for scheduling containers that need long-term durability and persistence. Container orchestration engines such as Kubernetes have evolved to provide a mechanism for specifying hosts when containers are scheduled. In Kubernetes, labels can be used to target a set of nodes when deploying pods. Kubernetes also provides PetSets, groups of stateful pods that have a stronger requirement for identity.
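The label-based targeting described above can be sketched as a pod manifest that combines a `nodeSelector` with a volume backed by a persistent volume claim. The manifest is built as a Python dictionary and printed as JSON, which `kubectl` accepts alongside YAML. The pod name, the `storage: shared` label, and the claim name are hypothetical values for illustration only.

```python
import json


def pod_on_labeled_nodes(name, image, node_labels, claim_name, mount_path):
    """Return a Pod manifest pinned to labeled nodes with a claimed volume mounted."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Only nodes carrying all of these labels are eligible for scheduling.
            "nodeSelector": node_labels,
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            # The volume refers to an existing PersistentVolumeClaim by name.
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


# Hypothetical names: nodes attached to shared storage are assumed to be
# labeled storage=shared by the cluster administrator.
manifest = pod_on_labeled_nodes(
    "cassandra-0", "cassandra:3", {"storage": "shared"}, "cassandra-data", "/var/lib/cassandra"
)
print(json.dumps(manifest, indent=2))
```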
What makes Datera unique for Containers
Datera has developed a new data architecture from the ground up, with the core tenet of decoupling application provisioning from physical infrastructure management. Application data should have zero knowledge of the underlying physical resources.
It is built for highly distributed environments, can be deployed and managed as software storage and is tightly integrated with modern container orchestrators (such as Kubernetes, Docker/Swarm, Mesos) through volume plugins.
Datera Elastic Data Fabric (EDF) supports the following operations for containers:
- Create and delete volumes – persistent volumes with the ability to define various durability, performance and security characteristics
- Unified namespace – cluster-wide namespace which is critical for containers and their mobility
- Launching containers – attaching persistent volumes as file system mounts to containers at launch
- Data services – ability to take snapshots, clones, and backups of the volumes
- Runtime controls – orchestrate container-granular controls on the data volumes as the container lifecycle evolves
Container solutions that utilize Datera EDF can benefit from the following value-additions:
- Software deployment – similar to deploying the operating systems on bare-metal servers, Datera EDF runs on industry standard x86 servers
- Any workload with container-granular controls – takes advantage of different media types, including NVRAM, NVMe, SATA SSD, and SATA HDD, and delivers quality of service with fine-grained IOPS and bandwidth controls at the per-container level
- Symmetric control plane with unified namespace – the system provides a unified namespace that is crucial for containers’ mobility and auto-scalability
- Automation of storage scaling with the app – application templates scale storage to track auto-scaling containers
- Storage provisioning at container speed – thousands of data stores deployed in under a second
- Container native monitoring and operations – Use your existing tools for monitoring and operations
- Complete operations offload through self-adaptive infrastructure – the system self-tunes data placement and tiering, performs predictive remediation, and is remotely managed through the cloud-based Datera CloudOperations
Zeroing in on Datera EDF with CoreOS Tectonic
CoreOS Tectonic provides the key components to secure, simplify, and automatically update enterprise container infrastructure. Datera has worked with CoreOS to extend this infrastructure so that the Datera volume plugin can be deployed and upgraded seamlessly and securely on all the Kubernetes nodes. This makes the Kubernetes nodes Datera-storage-ready, that is, ready to accept provisioning requests on Datera EDF.
Using labels, PetSets and the persistent volume framework (using the persistent volume provisioner), Datera provides a systematic way to consistently provision and expose storage on the Kubernetes nodes.
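From the application side, storage requested through the persistent volume framework is typically expressed as a PersistentVolumeClaim. The sketch below builds a generic claim as JSON. It uses only standard Kubernetes fields and is not taken from Datera's plugin documentation: the claim name, the size, and the storage class name `example-block-storage` are hypothetical, and the `storageClassName` field is the one used by current Kubernetes releases, which may differ from what the release described here expected.

```python
import json


def persistent_volume_claim(name, size, storage_class):
    """Return a PersistentVolumeClaim manifest for a dynamically provisioned volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            # Hypothetical class name; a real cluster would use whatever class
            # its administrator has mapped to the storage provisioner.
            "storageClassName": storage_class,
        },
    }


claim = persistent_volume_claim("cassandra-data", "100Gi", "example-block-storage")
print(json.dumps(claim, indent=2))
```

A pod then refers to the claim by name, so the application never has to name a specific node or storage device.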
For the Datera SDK and Volume Plugin, please go to https://github.com/Datera
To try out Datera EDF as software or as an appliance, please contact firstname.lastname@example.org