Understanding Kubernetes Architecture & Objects
Before diving into what Kubernetes is, let's first understand why we need it and what alternatives we have.
When we deploy multiple containers of our application, managing them in production-like environments is a very tedious job. Even running and managing multiple containers on a single server is not easy. There are several concerns to take care of, such as making our infrastructure highly available, scalable, secure and, most importantly, easily manageable.
Let's look at the pain areas of running containers stand-alone and the motivation behind orchestration.
Pain Areas of Running Containers Stand-alone
- On-demand auto scaling is not available
- Mounting storage systems, resource monitoring and service discovery are not easy
- No load balancing, application health checks or instance replication
- Downtime during releases
- No identity and authorization management
Orchestration
Container orchestration is the automation of the operational effort required to manage and schedule containers and their services. It makes it easy to configure, deploy and run applications, route traffic to multiple instances behind the same endpoint, and much more.
With orchestration, we achieve:
- Load balancing and better resource utilization
- Faster application deployment and rolling updates with zero downtime
- Simpler configuration and a portable application package across environments
- More than just scheduling: easy scaling and service replication
- Application health checking
There are multiple orchestration tools such as Kubernetes, Docker Swarm, Apache Mesos, AWS ECS, etc. Kubernetes stands out among them because it is cloud agnostic and offers many other robust features. Let's discuss Kubernetes in detail.
Kubernetes
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It can be deployed and managed on all leading cloud platforms such as AWS, GCP and Azure, as well as on on-premise and hybrid infrastructure. It is one of the most popular technologies in cloud computing today and supports hosting complex applications on a variety of architectures.
Kubernetes hosts our applications as containers in an automated fashion, so we can easily deploy as many instances of our application as we need and enable communication between the services within it.
- The name is Greek for “pilot” or “helmsman of a ship”; it is commonly abbreviated as k8s
- It is an open source cluster manager from Google
- It is written in Go
- It can manage containers, not machines
- It reconciles the current state of the cluster (e.g., the container count) with the desired state for high availability
Many components work together to make this possible. Let's take a high-level look at the Kubernetes architecture.
Kubernetes Architecture
Master Node
The master node typically consists of etcd, the API server, the scheduler and the controller manager.
ETCD: etcd is a database that stores information in key-value format. It is a distributed, watchable, consistent store, open sourced by CoreOS.
It stores information about the nodes, pods, configs, secrets, accounts, roles, etc. Everything we see from a kubectl get command comes from etcd. All cluster changes, such as a new node being added or a pod being scheduled, are recorded in etcd.
Kube API Server: The kube API server is the front end to the control plane. It exposes a REST API and consumes JSON/YAML manifest files.
It is the primary management component in Kubernetes. When we run a kubectl command, the kubectl utility actually talks to the kube API server. The API server first authenticates and validates the request, then retrieves the data from etcd and responds with the requested information. All other components on the master and the nodes communicate with each other only via the API server. The kube API server is responsible for authenticating users, validating requests, retrieving data, updating etcd, and coordinating with the scheduler and the kubelet.
Kube Scheduler: The kube scheduler watches the API server for new pods and assigns them to worker nodes. The scheduler is only responsible for deciding which pod goes on which node, based on criteria such as required resources, pod replicas and affinity rules; the kubelet is what actually creates the pod on the node.
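For example, a pod specification can carry scheduling hints that the scheduler evaluates when picking a node. Here is a minimal sketch; the disktype: ssd node label, the image and the resource numbers are illustrative assumptions, not from the article:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo        # hypothetical pod name
spec:
  nodeSelector:
    disktype: ssd              # assumed node label; scheduler only considers matching nodes
  containers:
    - name: app
      image: nginx:1.25        # placeholder image
      resources:
        requests:
          cpu: "250m"          # scheduler picks a node with at least this much free CPU
          memory: "128Mi"      # and at least this much free memory
```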
Kube Controller Manager: It watches for changes and reconciles the current state of the cluster with the desired state, helping to remediate problems as they arise. It manages the various controllers in Kubernetes, such as the node controller, replication controller, deployment controller, namespace controller, endpoints controller, etc. Much of the built-in intelligence of k8s comes from these controllers.
Worker Node
A k8s worker node typically consists of the kubelet, kube-proxy and a container engine.
Kubelet: It is the main Kubernetes agent and the sole point of contact for the k8s master on each node. It registers the node with the cluster, watches the API server for pods to instantiate, and reports back to the master. It also monitors the node and its pods.
Kube Proxy: Kube-proxy is a process that runs on each node in the k8s cluster. It implements pod networking using iptables rules and manages pod IP addresses; all containers in a pod share a single IP address. It watches for new services in the cluster and load balances traffic across all pods backing a service.
Container Engine: It handles container management, i.e. pulling images, starting and stopping containers, etc. We can use different container engines such as Docker, rkt (Rocket), etc.
Overview
At a high level, the k8s process works like this: kubectl requests go to the API server, which authenticates and validates them and persists the desired state in etcd; the scheduler picks a node for each new pod; the kubelet on that node creates the pod and reports back; and the controller manager keeps reconciling the current state with the desired state.
Sample Specification
The example below is a specification (definition) file that can be used as a starting point for creating objects. It is a YAML definition file.
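Here is a minimal sketch of such a file, assuming an nginx-based Deployment (the names, labels and image are illustrative):
```yaml
apiVersion: apps/v1            # API version that provides the Deployment object
kind: Deployment               # the type of k8s object being created
metadata:
  name: nginx-deployment       # hypothetical object name
  labels:
    app: nginx                 # labels are free-form key-value pairs
spec:                          # desired state of the object
  replicas: 3                  # desired number of pod instances
  selector:
    matchLabels:
      app: nginx               # must match the labels in the pod template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25    # placeholder container image
          ports:
            - containerPort: 80
```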
apiVersion: It refers to the version of the Kubernetes API used to create the object. There are several API versions, and new objects are introduced with each of them. Some common ones are v1 and apps/v1.
Kind: This is where we specify the type of k8s object. In this sample, we're creating a Deployment.
Metadata: The metadata block holds the information that briefly describes the object. It contains the name we want to give the object (a deployment in our case), its labels and its annotations. We can define as many labels as we want, and we are not restricted in the words we use as labels.
Spec: In the spec section, we define the desired state of our object.
Many of the terms in this section may be new, so let's discuss them in detail.
Objects
Pods: Kubernetes does not deploy containers directly on worker nodes. Containers are encapsulated inside a k8s object called a pod. A pod is a single instance of an application and one of the smallest objects we create in k8s (see the sketch after this list).
- A pod is a container or group of containers that are deployed together on the same host
- Pods are always co-located and co-scheduled, and run in a shared context
- A pod can have one or more containers; best practice is to deploy a single container per pod
- We can generally replace the word “pod” with “container”
- Containers in a pod share the network namespace (IP and localhost); every pod gets a unique IP
- Containers in a pod can share the same volumes
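A minimal sketch of a single-container pod definition (the names, label and image are illustrative assumptions):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod              # hypothetical pod name
  labels:
    app: myapp                 # label that selectors can match on later
spec:
  containers:
    - name: myapp-container
      image: nginx:1.25        # placeholder image
      ports:
        - containerPort: 80
```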
ReplicaSets: A ReplicaSet is the newer, more capable replacement for the replication controller. ReplicaSets are available under the apps/v1 API version and use matchLabels specified under the selector option (see the sketch after this list).
- The role of a ReplicaSet is to monitor the pods and, if any of them fail, redeploy them
- It is used when we want to run multiple pods of our application at a time
- A ReplicaSet can also manage pods that were not created as part of the ReplicaSet itself (it uses labels and selectors)
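A minimal sketch of a ReplicaSet that keeps three copies of the pod above running (the names and image are illustrative assumptions):
```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-rs               # hypothetical ReplicaSet name
spec:
  replicas: 3                  # desired number of pods
  selector:
    matchLabels:
      app: myapp               # adopts any pod carrying this label
  template:                    # pod template used when new pods must be created
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp-container
          image: nginx:1.25    # placeholder image
```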
Labels and Selectors: To manage hundreds or thousands of pods in production-like environments, a ReplicaSet uses labels and selectors to maintain the desired state (see the sketch after this list).
- Labels are key-value pairs attached to objects
- With a label selector, the client/user can identify a set of objects
- The label selector is the core grouping primitive in Kubernetes
- Labels can be attached to objects at creation time and added or modified at any time
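Besides the equality-based matchLabels used above, selectors can also be set-based. A sketch, with the label keys and values as illustrative assumptions:
```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend-rs            # hypothetical name
spec:
  replicas: 2
  selector:
    matchExpressions:          # set-based selector: matches pods whose
      - key: environment       # "environment" label is "production" or "qa"
        operator: In
        values: ["production", "qa"]
  template:
    metadata:
      labels:
        environment: production   # label attached to the pod at creation time
    spec:
      containers:
        - name: web
          image: nginx:1.25    # placeholder image
```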
Deployment: A Deployment gives us the capability to upgrade the underlying instances of our application using strategies such as a rolling upgrade (see the sketch after this list).
- A Deployment consists of a pod template, a replica count and a label selector
- K8s will try to keep the desired count of pods matching the label selector running
- We can roll back to an earlier deployment revision if the current deployment is not stable
- We can pause and resume a deployment
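A minimal sketch of a Deployment with an explicit rolling-update strategy (the names, image and surge/unavailable numbers are illustrative assumptions):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment       # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1              # at most one extra pod during an upgrade
      maxUnavailable: 0        # keep all desired pods serving, i.e. zero downtime
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp-container
          image: nginx:1.25    # changing this tag triggers a rolling upgrade
```
If the new revision misbehaves, kubectl rollout undo deployment/myapp-deployment rolls back to the previous revision.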
Service: K8s services enable communication between the various components within and outside of an application. A service works like an internal load balancer, routing traffic to multiple pods across different nodes. Services help one group of pods communicate with another, for example in a UI-API-database three-tier application.
There are different types of services: NodePort, ClusterIP (the default) and LoadBalancer. Each has its own use cases, though ClusterIP is the most commonly used (see the sketch after this list).
- A service exposes a set of running pods through a virtual IP
- It works like an internal load balancer for multiple pods
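A minimal sketch of a ClusterIP service fronting the pods labeled app: myapp from the earlier sketches (the names and ports are illustrative assumptions):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-service          # hypothetical service name
spec:
  type: ClusterIP              # default type; reachable only inside the cluster
  selector:
    app: myapp                 # traffic is load balanced across pods with this label
  ports:
    - port: 80                 # port exposed on the service's virtual IP
      targetPort: 80           # container port on the selected pods
```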
Namespaces: Namespaces provide logical partitioning for separating groups of applications and isolating them with the resources they need. We can divide cluster resources (CPU, RAM, storage, network in/out, etc.) between different types of applications, so that one group does not affect the functioning of another under a high volume of traffic. K8s creates three namespaces by default: kube-system, default and kube-public (see the sketch after this list).
- Mechanism to partition the cluster into logically named groups, e.g. Dev, QA, Test, Production
- With namespaces (and resource quotas), a group cannot use more resources than it is allowed
- Namespaces can be used in environments with many users spread across multiple teams or projects
- Provides scope for named resources (to avoid naming collisions)
- It enables us to delegate authority to trusted users
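A minimal sketch of a namespace with a resource quota attached (the namespace name and the limits are illustrative assumptions):
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev                    # hypothetical namespace name
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota              # hypothetical quota name
  namespace: dev               # the quota applies only inside the dev namespace
spec:
  hard:
    requests.cpu: "4"          # total CPU the namespace may request
    requests.memory: 8Gi       # total memory the namespace may request
    pods: "20"                 # maximum number of pods in the namespace
```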
This was all about the Kubernetes architecture and its objects. There is a lot more to learn about k8s and its functionality.
Thanks for reading this article. Please share your thoughts and comments below :)