A look at Kubernetes Operator Implementation at Licious — Part 1

Implementing Kubernetes Operators for Streamlined Application Management

Saksham Yadav
Licious Technology

--

In today’s rapidly evolving tech landscape, staying ahead of the curve is paramount. At Licious, we embarked on a journey to streamline our operations and enhance efficiency by leveraging Kubernetes operators. This is the story of our operator implementation journey and the transformative impact it had on our workflow.

Understanding Kubernetes Operators

Before delving into our implementation journey, let’s first understand what Kubernetes operators are and why they’re gaining traction in the industry.

If you’ve used Kubernetes for any length of time, you’re likely familiar with the core resource types like Deployments, Services, ConfigMaps and more. But have you explored the powerful abstraction layer provided by Operators?

Kubernetes operators are software extensions that automate the management of complex, stateful applications on Kubernetes. They encapsulate operational knowledge about specific applications and use that knowledge to automate deployment, scaling, backup, and other management tasks. By encoding best practices and domain-specific knowledge into operators, organizations can streamline operations, reduce manual intervention, and improve the reliability of their applications.

Operators are the path to building truly cloud-native applications on Kubernetes by encoding operational knowledge and best practices into custom controllers. They allow you to extend the Kubernetes API with higher-level, domain-specific abstractions tailored to your applications.

In the world of cloud-native applications, a Kubernetes Operator is like an automated Site Reliability Engineer (SRE) dedicated to administering your application. By encoding the operational knowledge and experience of seasoned administrators directly into software, Operators ensure that your applications run reliably and seamlessly throughout their lifecycle.

Imagine you have a critical database cluster running in your Kubernetes environment. Traditionally, you’d need a team of skilled database administrators to handle tasks like initial configuration, scaling, backups, upgrades, and failure recovery. With an Operator, this operational expertise is baked into the software itself. An Operator knows all the finer points of running a specific application: given declarative specifications, it can install, configure, and continuously run a cluster of almost any application, whether a monitoring stack like Prometheus, a distributed messaging system like Kafka, or a database like PostgreSQL.

More formally, “Operators are software extensions to Kubernetes that make use of custom resources to manage applications and their components.” Operators give us a single view of a service in the form of a custom resource, instead of a collection of primitives (such as Pods, Deployments, Services, or ConfigMaps). To do this, an Operator uses Custom Resources (CRs), whose structure is defined by Custom Resource Definitions (CRDs), to describe the desired configuration and state of a specific application. The Operator’s role is to reconcile the actual state of the application with the desired state declared in the CR, using a control loop in which it can automatically scale, update, or restart the application. Practically speaking, Kubernetes offers basic primitives that Operators compose into more complex actions.

Why Use Operators?

There are several key benefits and use cases that make writing and using Operators worthwhile:

Automation and Scalability — Operators automate the management of complex applications on Kubernetes clusters. They handle tasks such as deployment, scaling, upgrading, and failure recovery, reducing the need for manual intervention. This automation enables efficient scalability as your application workload grows or changes.

Lifecycle Management — Operators provide full lifecycle management for applications and workloads, covering everything from basic installation to seamless upgrades, backup and recovery, and deep insights into the application’s health and performance.

Customization and Flexibility — Operators are customizable according to the specific requirements of your application. You can define custom resources (CRs) and custom resource definitions (CRDs) to tailor the behavior of the operator to your application’s needs. This flexibility allows operators to adapt to diverse application architectures and deployment scenarios.

Enhanced Monitoring and Insights — Operators offer deep insights into the health and performance of your application. They can expose metrics, generate alerts, and provide detailed analysis of workload behavior. This visibility allows for proactive monitoring, faster troubleshooting, and better optimization of resource usage.

Consistency and Reliability — By encapsulating operational knowledge into code, operators ensure consistent and reliable management of applications across different environments. They enforce best practices, ensure configurations are applied correctly, and maintain desired application states, reducing the risk of errors and improving overall reliability.

How Operators Work

Now that we understand why Operators are powerful, let’s explore how they work under the hood. Kubernetes Operators build on a few key constructs:

Custom Resource Definitions (CRDs) — CRDs in Kubernetes define the structure and behavior of custom resources. They serve as blueprints for introducing new resource types into a Kubernetes cluster. CRDs specify metadata, spec, and status fields for custom resources. Once defined, CRDs become part of the Kubernetes API, extending its capabilities. Users interact with custom resources using standard Kubernetes API operations.

Custom Resources — Custom Resources (CRs) are instances of the resource types defined by CRDs. They are first-class objects in the Kubernetes API, with their own schemas, fields, and validation, and you manage them just like you’d manage native resources such as Pods or Deployments.

For example, you could define a CRD called PostgresCluster that encapsulates high-level details about a Postgres database cluster, including backup schedules, storage settings, and the number of replicas to run. You’d then create instances of this custom resource (CRs) to represent concrete Postgres clusters, which map to the actual underlying resources deployed on the cluster.

Controllers — To actually implement the logic of an Operator, you write a controller — a specialized type of watch loop that continuously monitors the state of specific custom resources and reconciles actual state against desired state by creating, updating, or deleting other resources.

A Postgres Operator controller, for instance, might watch instances of the PostgresCluster custom resource and create/update deployments, services, storage volumes and more as needed to converge on the desired state described in that custom resource.
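As a rough illustration of this reconcile pattern (this is not our actual implementation; the `PostgresClusterSpec` type and the in-memory "cluster state" map below are hypothetical stand-ins for a real CRD spec and the Kubernetes API), a single pass of a control loop might look like this in Java:

```java
import java.util.HashMap;
import java.util.Map;

public class ReconcileSketch {

    // Desired state, as it would appear in the custom resource's spec.
    // Hypothetical stand-in for a generated CRD spec class.
    record PostgresClusterSpec(String name, int replicas) {}

    // Actual state: a hypothetical in-memory view of what is "running".
    // A real controller would query the Kubernetes API instead.
    static Map<String, Integer> runningReplicas = new HashMap<>();

    // One pass of the control loop: compare desired vs. actual and converge.
    static void reconcile(PostgresClusterSpec desired) {
        int actual = runningReplicas.getOrDefault(desired.name(), 0);
        if (actual != desired.replicas()) {
            // In a real controller this would create/update/delete
            // Deployments, Services, and volumes via the Kubernetes API.
            runningReplicas.put(desired.name(), desired.replicas());
            System.out.println("Reconciled " + desired.name()
                    + ": " + actual + " -> " + desired.replicas() + " replicas");
        } else {
            // Already converged: reconciliation is a no-op.
            System.out.println(desired.name() + " already at desired state");
        }
    }

    public static void main(String[] args) {
        PostgresClusterSpec spec = new PostgresClusterSpec("orders-db", 3);
        reconcile(spec); // converges actual state toward 3 replicas
        reconcile(spec); // second pass finds nothing to do
    }
}
```

The key property is idempotence: the loop can run repeatedly, and each pass only acts on the difference between desired and actual state.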

Operator Workflow

To understand how Custom Resource Definitions (CRDs) and Custom Resources (CRs) interact with each other and with the Kubernetes API, let’s break the process down step by step:

1. Defining Custom Resource Definitions (CRDs):

  • CRDs are used to define custom resources in Kubernetes. They act as schemas that define the structure and behavior of custom resources within a Kubernetes cluster.
  • Developers create CRDs to specify the kind of custom resources they want to introduce into the Kubernetes API. This includes defining the resource’s metadata, spec, and status fields.
  • CRDs are defined using YAML or JSON manifests and can be applied to a Kubernetes cluster using the kubectl apply command or through Kubernetes configuration management tools like Helm.
Example of a CRD
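For illustration, a minimal CRD manifest might look like the following. The `PostgresCluster` resource, its group `db.example.com`, and its spec fields are hypothetical examples, not a published CRD:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # The name must match the pattern <plural>.<group>
  name: postgresclusters.db.example.com
spec:
  group: db.example.com
  names:
    kind: PostgresCluster
    plural: postgresclusters
    singular: postgrescluster
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        # Validation schema the API server enforces on every CR
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:
                  type: integer
                storageSize:
                  type: string
                backupSchedule:
                  type: string
```

Once applied with `kubectl apply -f`, the API server exposes a new `postgresclusters` endpoint, and `kubectl get postgresclusters` works like any built-in resource.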

2. Interacting with the Kubernetes API:

  • Once the CRD is applied to the Kubernetes cluster, it becomes part of the Kubernetes API and extends its capabilities.
  • The Kubernetes API server validates and stores the CRD definition, making it accessible to clients through the Kubernetes API.
  • Clients, such as administrators or automation tools, can interact with the Kubernetes API to perform operations on CRDs, such as creating, updating, deleting, or querying custom resources.

3. Creating Custom Resources (CRs):

  • Custom Resources are instances of the custom resource types defined by CRDs. They represent specific applications, configurations, or resources that need to be managed within the Kubernetes cluster.
  • Developers or administrators create CRs by providing values for the fields defined in the CRD spec. This allows them to customize the behavior and configuration of the custom resource.
  • CRs are created using YAML or JSON manifests, similar to CRDs, and can be applied to the Kubernetes cluster using kubectl apply or other configuration management tools.
Example Custom Resource
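Continuing the hypothetical `PostgresCluster` example, a custom resource conforming to such a CRD could look like this (all field values are illustrative):

```yaml
apiVersion: db.example.com/v1
kind: PostgresCluster
metadata:
  name: orders-db
  namespace: prod
spec:
  replicas: 3          # desired number of Postgres instances
  storageSize: 100Gi   # volume size per instance
  backupSchedule: "0 2 * * *"  # nightly backup, cron format
```

The API server validates this manifest against the CRD’s schema before storing it; a controller watching `PostgresCluster` resources then takes over.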

4. Interaction between CRDs, CRs, and Kubernetes API:

  • When a CR is applied to the Kubernetes cluster, the Kubernetes API server validates the resource against the corresponding CRD to ensure it conforms to the schema defined by the CRD.
  • If the CR passes validation, the Kubernetes API server stores the CR in the cluster’s etcd datastore, making it accessible to other components within the cluster.
  • Controllers or custom operators that are watching for changes to specific CRs (as defined by their CRD) are notified of the new or updated CR.
  • Controllers react to changes in CRs by reconciling the desired state specified in the CR with the actual state of the resources they manage. This may involve creating, updating, or deleting Kubernetes resources (such as Pods, Deployments, Services, etc.) based on the information provided in the CR.

5. Lifecycle Management and Interaction:

  • Throughout the lifecycle of a custom resource, Kubernetes controllers or custom operators continuously monitor the state of CRs and take appropriate actions to ensure that the desired state specified in the CR is maintained.
  • This interaction between CRDs, CRs, controllers, and the Kubernetes API forms the basis of how custom resources are managed and orchestrated within Kubernetes clusters, enabling the automation of complex application management tasks and providing a flexible framework for customizing Kubernetes behavior.

In summary, CRDs and CRs allow developers to extend the Kubernetes API with custom resources, enabling the management of complex applications and resources within Kubernetes clusters. Controllers or custom operators interact with CRs to reconcile their desired state with the actual state of resources, providing automation and customization capabilities for application management within Kubernetes.

Frameworks such as the Operator SDK, together with the Kubernetes API client libraries, make it easier to build robust controllers efficiently. With Go being a first-class language in the ecosystem, many Operators are written in Go and leverage the powerful controller-runtime libraries. We implemented our Operator in Java, which we will cover in detail in Part 2.

With these core constructs, Operators allow you to define and deploy entire applications through domain-specific custom resources. By interacting with these high-level representations, the controllers seamlessly deploy, configure, and actively manage the entire distributed application, simplifying operations substantially.

Contributors

Prabuddha Chakraborty, Bobby Singh
