Kubernetes is an open-source container orchestration (container management) tool that automates container deployment, scaling, and load balancing. It schedules, runs, and manages isolated containers running on virtual, physical, or cloud machines.
History -
Google developed an internal system called Borg to deploy and manage thousands of Google applications and services on its clusters. (Cluster - a group of machines, called nodes, that run containers.) (Internal system - used only within the company.)
In 2014, Google introduced Kubernetes as an open-source platform written in Go (Golang) and later donated it to the CNCF (Cloud Native Computing Foundation).
Why was Kubernetes adopted worldwide?
1. Trend from Monolithic to Microservices.
2. Increased usage of containers.
3. Demand for a proper way of managing hundreds of containers.
Kubernetes Architecture -
Control Plane
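1. Kube API Server - The front end of the control plane. Users interact with it directly through commands or YAML / JSON files. It is designed to scale horizontally, so more instances can be run as the number of requests grows.
- It performs authentication, authorization, and admission (policy checking). Users are authenticated with credentials passed in the request, such as client certificates or bearer tokens in headers. Authorization is typically done using RBAC (role-based access control). Policy checks are done by admission controllers, which can call webhooks to validate or modify requests.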
2. Etcd - A database that stores data as key-value pairs. It stores the metadata and state of the cluster.
- It's fast and secure (supports automatic TLS with optional client certificate authentication).
3. Kube scheduler - Assigns newly created Pods to nodes. It watches for Pods that have no node assigned and finds the best-fit node based on resource requirements, taints and tolerations, affinity rules, and nodeSelector.
- If a destination node is not specified for a new Pod, the kube scheduler picks the node on which the Pod will run. It gets all the information about the worker nodes (where the Pod will be created) from etcd through the API server.
4. Controller Manager - It ensures that the actual state of the cluster matches the desired state. The kube-controller-manager runs the built-in controllers in every cluster, whether on the cloud or on-premises. When the cluster runs on a cloud provider, a cloud controller manager (item 5) runs as well.
- The Controller Manager includes ->
Node Controller - Responsible for noticing and responding when nodes go down.
Replication Controller - Responsible for maintaining the correct number of Pods.
Endpoints Controller - Populates the Endpoints object, which joins Services and Pods.
Service Account & Token Controllers - Create default service accounts and API access tokens for new namespaces.
5. Cloud Controller Manager (CCM) - It links the Kubernetes cluster to the cloud provider's API (if the cluster uses cloud services) and performs cloud-specific actions, such as checking whether a node still exists in the cloud or setting up cloud load balancers.
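The node-selection step described for the kube scheduler can be pictured with a small Python sketch. This is only a toy model, not the real scheduler: the Node and Pod classes and the feasible_nodes function are hypothetical names, and real taints also carry values and effects, which are ignored here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)  # simplified: taint keys only


@dataclass
class Pod:
    name: str
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)  # taint keys the Pod tolerates


def feasible_nodes(pod, nodes):
    """Return the nodes that pass the nodeSelector and taint checks."""
    result = []
    for node in nodes:
        # nodeSelector: every requested label must be present on the node.
        labels_ok = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
        # Taints: the Pod must tolerate every taint on the node.
        taints_ok = node.taints <= pod.tolerations
        if labels_ok and taints_ok:
            result.append(node)
    return result


nodes = [
    Node("node-a", labels={"disk": "ssd"}),
    Node("node-b", labels={"disk": "hdd"}),
    Node("node-c", labels={"disk": "ssd"}, taints={"dedicated"}),
]
pod = Pod("web", node_selector={"disk": "ssd"})
print([n.name for n in feasible_nodes(pod, nodes)])  # ['node-a']
```

The real scheduler also scores the feasible nodes to pick the best one; the sketch stops at the filtering step.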
Worker Node
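1. Kubelet - Agent running on each node. It watches the API server on the control plane (master node) for Pods assigned to its node.
- Responsibilities -> Communicate with the control plane, make sure the containers described in each Pod are running, and report the success or failure of Pod creation.
2. Kube Proxy - It maintains the network rules on the node that let traffic reach Pods through Services.
- Responsibilities - When a new Pod IP appears behind a Service, kube-proxy updates the mapping between the Service's virtual IP and the Pod IPs. (Pod IP addresses themselves are assigned by the cluster's network plugin, not by kube-proxy.)
3. Container engine (container runtime) - It pulls images and starts and stops containers. The ports a container uses are declared in the manifest (YAML file).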
Manifest (Yaml file)
Here we describe the desired state. It has four main fields - apiVersion (the API version of the object), kind (the type of object to create, such as Pod or Service), metadata (gives the object a name and attaches labels), and spec (the specification for that kind).
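To make the four fields concrete, here is a minimal sketch that builds a Pod manifest as a Python dictionary and prints it as JSON (a manifest may be written in YAML or JSON). The names demo-pod and web and the image nginx are placeholders chosen for illustration.

```python
import json

# A minimal Pod manifest with the four top-level fields.
# "demo-pod", "web" and the image "nginx" are placeholder values.
manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-pod", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {"name": "web", "image": "nginx", "ports": [{"containerPort": 80}]}
        ]
    },
}

# The printed JSON could be saved to a file and used as a manifest.
print(json.dumps(manifest, indent=2))
```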
Kubernetes Objects -
Kubernetes uses objects to represent the state of your cluster -> which containerized applications are running and in which state. Each object is identified by a name, unique among objects of the same kind in a namespace, and by a UID, unique across the whole cluster.
Pod
It is the smallest deployable unit of Kubernetes. It has an IP address. (A container doesn't have its own IP address; the Pod has one, and its containers share it.) In Kubernetes the unit of control is the Pod, not the container. Containers run inside Pods, and it is recommended to have only 1 container per Pod, because if the Pod fails, all containers inside it fail too.
- When do we create 2 or more containers inside a single Pod?
-> When the containers are tightly coupled, i.e. 1 container depends on another (for example, a helper container that ships the logs of the main container).
Service
When a Pod crashes, a new Pod is created automatically by the ReplicaSet, with a new IP address. So how can Pods communicate when IP addresses keep changing?
-> Services - A Service provides a stable virtual IP address in front of a set of Pods. Service objects act as a bridge between Pods and the end user.
By default, a Service of type NodePort can only use node ports between 30000 and 32767.
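The idea of a stable virtual IP in front of changing Pod IPs can be sketched as a toy Python class. The Service class here is hypothetical, the IP addresses are made up, and the round-robin choice of backend is an assumption of the sketch rather than a statement about how Kubernetes picks a Pod.

```python
import itertools


class Service:
    """Toy Service: one stable virtual IP in front of changing Pod IPs."""

    def __init__(self, virtual_ip):
        self.virtual_ip = virtual_ip
        self.endpoints = []
        self._counter = itertools.count()

    def set_endpoints(self, pod_ips):
        # Called whenever Pods are added, removed, or replaced.
        self.endpoints = list(pod_ips)

    def route(self):
        # Pick a backend Pod IP in round-robin order (assumption of this sketch).
        if not self.endpoints:
            raise RuntimeError("no ready Pods behind the Service")
        return self.endpoints[next(self._counter) % len(self.endpoints)]


svc = Service("10.96.0.10")
svc.set_endpoints(["10.244.1.5", "10.244.2.7"])
first = svc.route()
# A Pod crashes and its replacement comes up with a new IP.
svc.set_endpoints(["10.244.1.9", "10.244.2.7"])
second = svc.route()
# Clients keep using the same virtual IP throughout.
print(svc.virtual_ip, first, second)
```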
Volume
If a container inside a Pod fails, its data should not be lost. For this, we use a volume. When the container is restarted, it gets all the old data back from the volume.
An ordinary (ephemeral) volume lives only as long as its Pod - if the Pod itself is deleted, the volume inside that Pod is deleted too.
Namespace
Namespaces isolate our environments (each team can have its own quotas and policies).
Namespaces are also used for grouping resources separately, such as monitoring, databases, etc., so they are easy to check / view.
Four initial namespaces -
kube-system, kube-public, kube-node-lease, default.
Config Maps & Secrets
Applications require specific configuration to run. When the environment changes (for example, from development to production), the container image should stay the same and only the configuration should change. So we keep this configuration outside the Pod, but inside the cluster, in a ConfigMap. We map the ConfigMap to the container by referencing it in the Pod's YAML file (manifest).
Secrets - Used in the same way, but for storing sensitive data such as passwords and tokens.
2 common ways to access ConfigMaps & Secrets
1. As environment variables.
2. As volume in the Pod.
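From the application's point of view, the two access methods look roughly like this. It is a minimal sketch: the load_setting helper, the APP_MODE key and the /etc/config mount path are all hypothetical, and checking the environment variable before the file is just a choice made for the example.

```python
import os
from pathlib import Path


def load_setting(name, mount_dir="/etc/config", default=None):
    """Look up a setting as an environment variable first, then as a mounted file."""
    # Way 1: the key was injected as an environment variable.
    if name in os.environ:
        return os.environ[name]
    # Way 2: the ConfigMap or Secret was mounted as a volume,
    # so each key appears as a file inside the mount directory.
    path = Path(mount_dir) / name
    if path.is_file():
        return path.read_text().strip()
    return default


# APP_MODE is a made-up key; the default is used when neither source has it.
print(load_setting("APP_MODE", default="development"))
```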
Deployment
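Manages ReplicaSets to keep the running Pods in line with the desired state. It creates a ReplicaSet, which maintains the requested number of Pod replicas, and supports rolling updates and rollbacks.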
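Keeping track of the desired state comes down to a loop that compares the desired number of Pods with the actual number and corrects the difference. A minimal sketch, assuming a hypothetical reconcile function and made-up Pod names such as web-0:

```python
def reconcile(desired_replicas, running_pods, next_id):
    """Return the Pod list after one reconcile pass, plus the next free id."""
    pods = list(running_pods)
    # Too few Pods: create replacements until the count matches.
    while len(pods) < desired_replicas:
        pods.append(f"web-{next_id}")
        next_id += 1
    # Too many Pods: remove the extras.
    while len(pods) > desired_replicas:
        pods.pop()
    return pods, next_id


pods, next_id = reconcile(3, [], next_id=0)
print(pods)  # ['web-0', 'web-1', 'web-2']
pods.remove("web-1")  # simulate a crashed Pod
pods, next_id = reconcile(3, pods, next_id)
print(pods)  # ['web-0', 'web-2', 'web-3']
```

Note that the replacement Pod gets a new identity (web-3), which mirrors why a recreated Pod shows up with a new IP address.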
Imperative vs Declarative -
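Imperative -> Simple / compound commands that say what to do, step by step (for example kubectl run or kubectl create). Recommended environment -> Development projects.
Declarative -> Individual files (YAML / JSON) that describe the desired state and are applied with kubectl apply. Recommended environment -> Production.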
Labels & Selectors
Labels - A way to organize Kubernetes objects. A label is a key-value pair attached to an object. Multiple labels can be added to a single object.
Selectors - A way to find objects by their labels.
2 types of selectors
Equality-based (key = value, key != value) & Set-based (in, notin, exists). A node selector is used to select the node on which we want to create the Pod.
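The two selector types can be sketched as plain Python functions over a dictionary of labels. The function names and the sample Pods are hypothetical; the sketch only illustrates the matching rules.

```python
def matches_equality(labels, key, value):
    """Equality-based: the label must exist with exactly this value."""
    return labels.get(key) == value


def matches_set(labels, key, operator, values=()):
    """Set-based: supports the in, notin and exists operators."""
    if operator == "in":
        return key in labels and labels[key] in values
    if operator == "notin":
        # Objects that do not have the key at all also match notin.
        return labels.get(key) not in values
    if operator == "exists":
        return key in labels
    raise ValueError(f"unknown operator: {operator}")


pods = {
    "web-1": {"app": "web", "env": "prod"},
    "web-2": {"app": "web", "env": "dev"},
    "db-1": {"app": "db"},
}

prod_web = [n for n, l in pods.items() if matches_equality(l, "env", "prod")]
not_dev = [n for n, l in pods.items() if matches_set(l, "env", "notin", {"dev"})]
has_env = [n for n, l in pods.items() if matches_set(l, "env", "exists")]
print(prod_web, not_dev, has_env)
```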
Minikube -
Here the master (control plane) and worker components are installed on a single node. Used for learning and testing purposes.
Pod Lifecycle -
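1. Pending - Finding a node / If you've created a Pod that has storage assigned to it, waiting for the Persistent Volume to be ready and the PVC to be bound to it.
2. Container creating - Pulling the image, starting it, attaching the network.
3. Running - All containers have been created and at least one is running.
4. Error - A container exited with a failure.
5. Crash loop back off (CrashLoopBackOff) - A container keeps crashing (for example, because it runs out of memory), and Kubernetes waits longer and longer before each restart.
6. Succeeded - All containers finished successfully.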
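The back off in crash loop back off is a restart delay that grows after every crash. A small sketch of the idea, assuming a delay that starts at 10 seconds, doubles each time, and is capped at 300 seconds (these defaults follow the commonly documented kubelet behaviour, but treat them as assumptions here; backoff_delays is a hypothetical helper):

```python
def backoff_delays(restarts, base=10, cap=300):
    """Delay in seconds before each restart: doubles every time, up to a cap."""
    return [min(base * 2 ** i, cap) for i in range(restarts)]


print(backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```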
Persistent Volume
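It stays available even after a Pod crashes. It's a cluster-wide resource - it is not tied to a single Pod, and Pods claim it through a PersistentVolumeClaim (PVC). It can be backed by network storage such as NFS, which can be reached from any node in the cluster.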
Termination grace period
Whenever you delete a Pod, it gets a termination grace period (30 seconds by default) before it is forcibly killed.
When you issue the delete command, a request may be in the middle of being processed, so you want that request to get time to finish, up to a time limit, while no further requests are accepted until the deletion happens.
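On the application side this usually means reacting to the SIGTERM signal that the container receives when the Pod is deleted (SIGKILL follows once the grace period runs out). A minimal sketch, where GracefulWorker and install are hypothetical names and the request handling is reduced to appending to a list:

```python
import signal
import threading


class GracefulWorker:
    """Finishes accepted work but refuses new requests after shutdown starts."""

    def __init__(self):
        self.shutting_down = threading.Event()
        self.handled = []

    def request_shutdown(self, signum=None, frame=None):
        # Signal handler: only flips a flag, the real work stops elsewhere.
        self.shutting_down.set()

    def handle(self, request):
        if self.shutting_down.is_set():
            return False  # no further requests are taken
        self.handled.append(request)
        return True


def install(worker):
    # Register the handler for SIGTERM; must be called from the main thread.
    signal.signal(signal.SIGTERM, worker.request_shutdown)
```

A typical use would be to call install(worker) at start-up and let the serving loop exit once worker.shutting_down is set and the in-flight requests are done.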
Init container
It runs before the main container. It can contain custom code or tools that are not present in the application image.
It can change the filesystem based on certain logic before the main container starts. It can do precondition checks. You can have multiple init containers; they run in sequential order, and each must succeed before the next one starts.
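The sequential behaviour can be pictured with a short sketch. The start_pod function and the step names are hypothetical; each init step is modelled as a function that returns True on success.

```python
def start_pod(init_steps, main):
    """Run init steps in order; start main only if every step succeeds."""
    for name, step in init_steps:
        if not step():
            return f"init step {name!r} failed, main container not started"
    return main()


log = []


def record(name, ok=True):
    # Build a fake init step that records when it ran.
    def step():
        log.append(name)
        return ok
    return step


result = start_pod(
    [("check-db", record("check-db")), ("prepare-files", record("prepare-files"))],
    main=lambda: "main container running",
)
print(log, result)
```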
Probes
Problem statement -> What if the Pod is not ready to handle traffic? What if there's a deadlock and Kubernetes keeps sending traffic to it?
Solution -> Probes
To check things and make sure that everything is correct, we have 3 probes -
1. readiness -
Checks whether the Pod is ready to accept traffic, for example whether the services it depends on are available and responding quickly enough.
2. startup -
This one runs first; only when it succeeds are the other probes executed.
3. liveness -
Checks whether the container is still working; if it fails (for example, because of a deadlock), the container is restarted.
Each probe can use one of these checks -
httpGet - Whether the HTTP response is OK or not.
tcpSocket - Port check.
exec - Custom command, for example checking that a file exists, to decide whether the container is healthy or ready.
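A rough sketch of how such a check could work: a port check plus a counter that only reports failure after several consecutive failed checks. The tcp_check function and the Probe class are hypothetical; the threshold of 3 mirrors the usual failureThreshold default but is an assumption of the sketch.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """Port check: succeed if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Probe:
    """Runs a check and tolerates a few consecutive failures."""

    def __init__(self, check, failure_threshold=3):
        self.check = check
        self.failure_threshold = failure_threshold
        self.failures = 0

    def run_once(self):
        """Return False once the check has failed failure_threshold times in a row."""
        if self.check():
            self.failures = 0
        else:
            self.failures += 1
        return self.failures < self.failure_threshold
```

For example, Probe(lambda: tcp_check("127.0.0.1", 8080)) would model a port check on a made-up local port 8080; calling run_once() at a fixed interval then plays the role of the probe period.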
Kubernetes Networking
Every node has a network interface (typically eth0) which is the communication entry point - all traffic enters from there.
Containers within a Pod communicate via localhost. (A container doesn't have its own IP address; it shares the Pod's.)
Two Pods inside a node can communicate as well, because each Pod has its own IP address.
Features of Kubernetes -
Orchestration of any number of containers running in different environments, which means containers can run on virtual machines / on-premises machines / cloud.
Auto-scaling - Both vertical and horizontal. Vertical scaling - giving existing Pods more capacity (CPU / memory). Horizontal scaling - adding more Pods of the same capacity.
Load balancing
Platform independent (Cloud / Virtual / Physical machine)
Fault tolerance
Rollback (Going back to previous versions)
Health monitoring of containers.
Batch execution (one time / sequential)
High availability.
Disaster recovery - backup and restore.
Scalability / High performance.
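For the horizontal auto-scaling feature listed above, the number of replicas is usually derived from the ratio between the measured metric and its target, rounded up. A minimal sketch of that calculation, with a hypothetical function name and made-up CPU numbers:

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    """Scale the replica count by the ratio of measured to target metric, rounded up."""
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 Pods at 90% average CPU with a 60% target -> scale out.
print(desired_replicas(4, 90, 60))  # 6
# 4 Pods at 30% average CPU with a 60% target -> scale in.
print(desired_replicas(4, 30, 60))  # 2
```

The real autoscaler also applies tolerances and minimum / maximum replica limits, which this sketch leaves out.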
Kubernetes vs Docker Swarm
Kubernetes can work with several container runtimes, such as Docker, rkt (Rocket), and containerd, whereas Docker Swarm works only with Docker.
Kubernetes has a GUI available (the Kubernetes Dashboard), whereas Docker Swarm has no built-in GUI.
Kubernetes supports auto-scaling, whereas Docker Swarm does not.
Kubernetes has built-in tools for monitoring, whereas Docker Swarm relies on 3rd-party tools such as Splunk.