Module 01 · Foundations
What Is Kubernetes?
Beginner
~15 min

Kubernetes (K8s) is an open-source container orchestration platform originally developed by Google, open-sourced in 2014, and donated to the CNCF in 2015. It automates the deployment, scaling, and management of containerized workloads across a cluster of machines.

Before K8s, running containers at scale meant writing custom scripts to restart crashed containers, balance load, and roll out updates. Kubernetes solves all of that declaratively — you describe what you want, and the control plane makes it happen.
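
In practice, the declarative workflow is just a couple of commands; a minimal sketch, assuming a manifest file named deployment.yaml:
bash
# Describe desired state in YAML and hand it to the API server
kubectl apply -f deployment.yaml

# Edit the YAML (new image, more replicas) and apply again;
# the control plane diffs desired vs. actual state and reconciles
kubectl apply -f deployment.yaml

# Preview what would change without applying anything
kubectl diff -f deployment.yaml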

The Problem Kubernetes Solves
Without K8s
Crashed containers stay dead. Rolling updates require custom scripts. Scaling is manual. Load balancing is external glue. Config drift is constant.
With K8s
Self-healing restarts failed pods. Rolling/canary deployments built-in. Horizontal autoscaling on metrics. Internal DNS and load balancing included.
Declarative Model
You write YAML manifests describing desired state. K8s continuously reconciles actual state toward desired state — the control loop.
Portable
Runs on-prem (bare metal, VMs), any cloud (EKS, GKE, AKS), or a laptop (k3s, minikube). Same YAML everywhere.
Key Concepts at a Glance
Concept | Analogy | What it does
Pod | A shipping container | Smallest deployable unit; wraps one or more containers
Deployment | A fleet manager | Manages replicas, rolling updates, rollbacks
Service | A DNS name + load balancer | Stable network endpoint in front of pods
Namespace | A department | Logical isolation boundary within a cluster
Node | A server | Physical/virtual machine that runs pods
Cluster | A data center | The full set of control-plane + worker nodes
Kubernetes vs. Docker

Docker builds and runs containers on a single host. Kubernetes orchestrates containers across many hosts. They are complementary: Docker (or containerd/CRI-O) provides the container runtime, and Kubernetes uses it to schedule and manage workloads cluster-wide.

💡
Note: K8s 1.24 removed the dockershim, so Docker Engine is no longer supported as a node runtime out of the box. Clusters use containerd or CRI-O directly via the Container Runtime Interface (CRI).
⚡ Quick Check
Which best describes Kubernetes' operational model?
A
Imperative — you tell it every step to execute
B
Declarative — you describe desired state; K8s reconciles
C
Event-driven — you subscribe to container lifecycle hooks
D
Scripted — you provide shell scripts for each operation
Module 02 · Foundations
Cluster Architecture
Beginner
~20 min

A Kubernetes cluster has two logical planes: the Control Plane (formerly "master") which makes global decisions about the cluster, and Worker Nodes which run your application workloads.

Cluster Architecture Overview
Control Plane
API Server
kube-apiserver
etcd
cluster state store
Scheduler
kube-scheduler
Controller Mgr
control loops
↕ HTTPS
Worker Nodes
kubelet
node agent
kube-proxy
network rules
Container RT
containerd/CRI-O
Pod · Pod · Pod
workloads
Control Plane Components
kube-apiserver
The front door. All kubectl commands, controllers, and nodes communicate through the REST API it exposes. Validates and persists objects to etcd.
etcd
Distributed key-value store. The single source of truth for all cluster state. Back this up. If etcd is lost without a backup, the cluster loses its state.
kube-scheduler
Watches for unscheduled pods and assigns them to nodes based on resource requests, taints/tolerations, affinity rules, and available capacity.
kube-controller-manager
Runs control loops: Node controller, Deployment controller, ReplicaSet controller, Job controller, etc. Each loop reconciles actual → desired state.
Worker Node Components
kubelet
Agent on every node. Registers the node with the API server, pulls pod specs, instructs the container runtime to start/stop containers, and reports back node/pod status.
kube-proxy
Maintains iptables/IPVS rules on each node that implement Service virtual IPs and load balancing across pod endpoints.
Container Runtime
The software that actually runs containers. containerd and CRI-O are the standard choices. kubelet talks to them via the CRI gRPC interface.
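To see these components on a running cluster, list the pods in kube-system. A quick sketch; on kubeadm-style clusters the control-plane components run there as static pods, while k3s embeds them in its single binary, so you may not see separate pods for them:
bash
# Control-plane and node components (names vary by distro)
kubectl get pods -n kube-system -o wide

# Node status as reported by each kubelet
kubectl get nodes -o wide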
The Reconciliation Loop

Every controller in K8s runs a watch → diff → act loop:

Desired State (YAML in etcd) → Controller (watches API) → Diff (desired vs actual) → Reconcile (create/delete/update)
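You can watch a reconcile loop in action by deleting a pod that a Deployment owns; the ReplicaSet controller notices the difference and recreates it. A sketch, assuming pods labeled app=web-app (such as the Deployment created in Module 05):
bash
# Terminal 1: watch the pods continuously
kubectl get pods -l app=web-app -w

# Terminal 2: delete one pod; a replacement appears within seconds
kubectl delete "$(kubectl get pods -l app=web-app -o name | head -n 1)"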
⚠️
Exam tip (CKA): The API server is the only component that talks directly to etcd. Everything else (scheduler, controllers, kubelet) communicates exclusively through the API server.
⚡ Quick Check
Which component is responsible for assigning a newly created pod to a node?
A
kube-controller-manager
B
kubelet
C
kube-scheduler
D
kube-proxy
Module 03 · Foundations
Lab Setup
Hands-On
~25 min

You have several options for a local Kubernetes lab. We'll use k3s — a lightweight, CNCF-certified K8s distro ideal for learning. It runs on a single Linux VM or bare metal host.

Option Comparison
Tool | Best For | Overhead | Notes
k3s | Linux VM / bare metal lab | Low (~512MB RAM) | Single binary, full K8s API
minikube | Mac/Windows dev laptop | Medium | VM or Docker driver
kind | CI pipelines, Docker hosts | Low | Nodes as Docker containers
k3d | k3s inside Docker | Low | Multi-node on one host
Install k3s (Single-Node)
1
Install k3s
Run the official install script. It will set up the server and a systemd service automatically.
bash
# As root or with sudo
curl -sfL https://get.k3s.io | sh -

# Verify the service is running
systemctl status k3s
2
Configure kubectl access
k3s writes its kubeconfig to /etc/rancher/k3s/k3s.yaml. Copy it to your user's config directory.
bash
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $(id -u):$(id -g) ~/.kube/config

# Or use k3s's embedded kubectl
sudo k3s kubectl get nodes
3
Verify the cluster
Confirm your node is Ready and system pods are running.
bash
kubectl get nodes
# NAME         STATUS   ROLES                  AGE   VERSION
# k3s-node     Ready    control-plane,master   1m    v1.30.x

kubectl get pods -n kube-system
4
Install kubectl bash completion (optional but useful)
bash
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc
💡
Multi-node with k3d: k3d cluster create lab --servers 1 --agents 2 spins up a 3-node cluster (1 control-plane + 2 workers) as Docker containers in under 30 seconds.
Module 04 · Core Resources
Working With Pods
Hands-On
~25 min

A Pod is the smallest deployable unit in Kubernetes — a wrapper around one or more containers that share a network namespace (same IP, same localhost) and storage volumes. In practice, most pods run a single container.

Pods are ephemeral: they are created, run, and die. They do not reschedule themselves. That's what Deployments are for.

Your First Pod
imperative (bash)
# Create a pod imperatively (useful for quick debugging)
kubectl run nginx-pod --image=nginx:alpine --port=80

# Watch it come up
kubectl get pod nginx-pod -w

# Inspect it
kubectl describe pod nginx-pod

# Get shell access
kubectl exec -it nginx-pod -- sh
Pod Manifest (YAML)

The declarative approach — what you'll use in production:

pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-nginx
  labels:
    app: nginx
    env: dev
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"      # 0.1 CPU cores
        memory: "64Mi"
      limits:
        cpu: "250m"
        memory: "128Mi"
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /
        port: 80
      failureThreshold: 3
Health Probes
readinessProbe
Gates traffic. Pod won't receive Service traffic until probe passes. Use this to prevent requests hitting a container that hasn't finished starting up.
livenessProbe
Detects deadlocks. If it fails, kubelet restarts the container. Use for apps that can get stuck in a broken state without crashing.
startupProbe
For slow-starting apps. Disables liveness/readiness until it passes. Prevents premature restarts on slow JVM or DB init.
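A hedged sketch of a startupProbe for a slow-starting container; the endpoint, port, and thresholds are illustrative and should be tuned to your app's worst-case startup time:
yaml
    startupProbe:
      httpGet:
        path: /healthz      # assumes the app serves a health endpoint here
        port: 8080
      failureThreshold: 30  # 30 x 10s = up to 5 minutes allowed to start
      periodSeconds: 10
    # liveness/readiness probes are held off until the startupProbe succeeds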
🚨
Don't run naked pods in production. If a pod is deleted or its node fails, nothing recreates it. Always use a Deployment (or StatefulSet/DaemonSet) to manage pod lifecycle.
⚡ Quick Check
Two containers in the same Pod want to communicate. What address should they use?
A
The pod's cluster IP address
B
localhost (they share a network namespace)
C
The node's IP address
D
A Kubernetes Service ClusterIP
Module 05 · Core Resources
Deployments & Rollouts
Hands-On
~30 min

A Deployment manages a ReplicaSet, which manages pods. It's the primary way to run stateless applications. Deployments handle rolling updates, rollbacks, and scaling, with zero downtime by default.
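
You can see the ownership chain (Deployment → ReplicaSet → Pods) directly; a quick sketch using the web-app example from this module:
bash
# The Deployment, its current ReplicaSet, and the pods it owns
kubectl get deployment web-app
kubectl get rs,pods -l app=web-app

# ReplicaSet and pod names embed a pod-template-hash tying them to a revision
kubectl get pods -l app=web-app --show-labels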

Deployment Manifest
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app           # must match template labels
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # extra pods during update
      maxUnavailable: 0  # zero downtime
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests: { cpu: "100m", memory: "64Mi" }
          limits:   { cpu: "500m", memory: "256Mi" }
Rolling Update Workflow
bash
# Trigger a rolling update (new image)
kubectl set image deployment/web-app web=nginx:1.26

# Watch the rollout
kubectl rollout status deployment/web-app

# Inspect revision history
kubectl rollout history deployment/web-app

# Roll back to previous revision
kubectl rollout undo deployment/web-app

# Roll back to specific revision
kubectl rollout undo deployment/web-app --to-revision=2

# Scale manually
kubectl scale deployment/web-app --replicas=5
Horizontal Pod Autoscaler
bash + yaml
# Imperative
kubectl autoscale deployment/web-app --cpu-percent=70 --min=2 --max=10

# Declarative (HPA v2); requires metrics-server for resource metrics
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa        # name is illustrative
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Module 06 · Core Resources
Services & Networking
Hands-On
~30 min

Pods are ephemeral — their IPs change on every restart. A Service provides a stable virtual IP (ClusterIP) and DNS name that load-balances traffic to the pods matching its selector.

Service Types
ClusterIP (default)
Accessible only within the cluster. Used for internal service-to-service communication. Every service gets a DNS entry: <svc>.<namespace>.svc.cluster.local
NodePort
Exposes a static port (30000–32767) on every node's IP. Useful for dev/testing. Not recommended for production — use a LoadBalancer or Ingress instead.
LoadBalancer
Provisions a cloud provider load balancer (AWS ELB, GCP LB, etc.) with a public IP. The standard way to expose services externally on managed K8s.
ExternalName
Maps a service to a DNS name outside the cluster (CNAME). Useful for migrating external services into the cluster namespace gradually.
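A quick way to try a Service without writing YAML is kubectl expose; a sketch, assuming the web-app Deployment from the previous module:
bash
# Create a ClusterIP service in front of the deployment's pods
kubectl expose deployment web-app --name=web-svc --port=80 --target-port=80

# Hit it from inside the cluster with a throwaway pod (removed on exit)
kubectl run svc-test --image=busybox -it --rm --restart=Never -- \
  wget -qO- http://web-svc.default.svc.cluster.local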
Service + Deployment Example
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web-app     # matches Deployment pod labels
  ports:
  - port: 80         # Service port (ClusterIP)
    targetPort: 80  # container port
    protocol: TCP
  type: ClusterIP
Ingress

An Ingress routes HTTP/HTTPS traffic by hostname or path to backend services — like a reverse proxy. You need an Ingress Controller installed (nginx-ingress, Traefik, etc.) for Ingress resources to work.

ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-svc
            port: { number: 80 }
💡
DNS: Within a cluster, services are reachable at <svc>.<ns>.svc.cluster.local. Within the same namespace, just <svc> works. This is served by CoreDNS running in kube-system.
Module 07 · Configuration
ConfigMaps & Secrets
Hands-On
~25 min

Kubernetes separates configuration from container images. ConfigMaps store non-sensitive configuration (env vars, config files). Secrets store sensitive data (passwords, tokens, TLS certs) — base64-encoded at rest by default, though you should enable encryption at rest in production.

ConfigMap
configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_ENV: "production"
  LOG_LEVEL: "info"
  app.conf: |
    server.port=8080
    db.pool.size=10
Secret
secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  # values must be base64-encoded: echo -n 'mypassword' | base64
  DB_PASSWORD: bXlwYXNzd29yZA==
  DB_USER: YWRtaW4=
stringData:          # alternative: plain text (K8s encodes it)
  API_KEY: "my-api-key-plain"
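Both objects can also be created imperatively, which is handy because kubectl handles the base64 encoding for you; the values below are illustrative:
bash
# ConfigMap from literals (use --from-file for config files)
kubectl create configmap app-config \
  --from-literal=APP_ENV=production \
  --from-literal=LOG_LEVEL=info

# Secret from literals; kubectl base64-encodes the values
kubectl create secret generic db-secret \
  --from-literal=DB_USER=admin \
  --from-literal=DB_PASSWORD=mypassword

# Add --dry-run=client -o yaml to either command to emit a manifest instead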
Consuming in a Pod
pod consuming config (yaml)
spec:
  containers:
  - name: app
    image: myapp:v1
    env:
    - name: LOG_LEVEL             # single key from ConfigMap
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: LOG_LEVEL
    - name: DB_PASSWORD           # single key from Secret
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: DB_PASSWORD
    envFrom:                       # ALL keys as env vars
    - configMapRef:
        name: app-config
    volumeMounts:
    - name: config-vol
      mountPath: /etc/app        # mount as files
  volumes:
  - name: config-vol
    configMap:
      name: app-config
⚠️
Secrets are not encrypted by default — only base64-encoded in etcd. For production, enable Encryption at Rest in the API server config and consider external secret managers like HashiCorp Vault, AWS Secrets Manager, or the External Secrets Operator.
Module 08 · Configuration
Persistent Storage
Intermediate
~25 min

Container filesystems are ephemeral. Volumes persist data. The storage workflow in K8s has three layers: StorageClass (how to provision), PersistentVolume (the actual storage), and PersistentVolumeClaim (a pod's request for storage).

Storage Binding Flow
StorageClass
provisioner + params
PersistentVolume
actual storage resource
PVC
claim (size + accessMode)
Pod
mounts the volume
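To watch the binding flow on a real cluster, list all three layers and check the PVC's STATUS column (Bound vs. Pending); the claim name below matches the example that follows:
bash
kubectl get storageclass
kubectl get pv
kubectl get pvc data-pvc

# If the PVC stays Pending, the Events in describe output usually explain why
kubectl describe pvc data-pvc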
PersistentVolumeClaim
pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce      # RWO: one node; RWX: many nodes (NFS)
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi

---  # Mount in a pod (spec fragment, not a complete manifest)
spec:
  containers:
  - name: db
    image: postgres:16
    volumeMounts:
    - name: pgdata
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: pgdata
    persistentVolumeClaim:
      claimName: data-pvc
Access Modes
Mode | Short | Meaning
ReadWriteOnce | RWO | One node can mount read/write. Most block storage (EBS, local disk).
ReadOnlyMany | ROX | Many nodes can mount read-only.
ReadWriteMany | RWX | Many nodes can mount read/write. Requires shared storage: NFS, CephFS, Azure Files.
ReadWriteOncePod | RWOP | Only one pod cluster-wide. Strictest isolation (K8s 1.22+).
Module 09 · Operations
RBAC & Security
Intermediate
~30 min

Role-Based Access Control (RBAC) governs who can do what to which resources. It is the primary authorization mechanism in K8s and should be enabled on every cluster.

RBAC Objects
Object | Scope | Purpose
Role | Namespace | Grants permissions within one namespace
ClusterRole | Cluster | Grants permissions cluster-wide or for non-namespaced resources (nodes, PVs)
RoleBinding | Namespace | Binds a Role (or ClusterRole) to a user/group/ServiceAccount in a namespace
ClusterRoleBinding | Cluster | Binds a ClusterRole to a subject cluster-wide
Role + Binding Example
rbac.yaml
# Role: read-only access to pods in 'dev' namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
- apiGroups: [""]            # "" = core API group
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: User
  name: alice
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
  name: ci-runner
  namespace: dev
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
Verifying Permissions
bash
# Can alice delete pods in dev?
kubectl auth can-i delete pods -n dev --as=alice
# no

# Can ci-runner serviceaccount list pods?
kubectl auth can-i list pods -n dev --as=system:serviceaccount:dev:ci-runner
# yes

# What can I do? (verbose)
kubectl auth can-i --list -n dev
💡
Principle of Least Privilege: The default ServiceAccount in each namespace has only minimal permissions. Always create dedicated ServiceAccounts for workloads that need API access, and bind only what they need, as in the sketch below.
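A minimal sketch of that pattern with imperative commands, reusing the pod-reader Role and ci-runner ServiceAccount from the example above:
bash
# Dedicated identity for the workload
kubectl create serviceaccount ci-runner -n dev

# Bind only the pod-reader Role to that ServiceAccount
kubectl create rolebinding ci-runner-read-pods \
  --role=pod-reader \
  --serviceaccount=dev:ci-runner \
  -n dev

# Reference it in the pod spec via spec.serviceAccountName: ci-runner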
Module 10 · Operations
Observability & Debugging
Intermediate
~30 min

When something breaks in K8s, there's a logical triage path. Start broad (cluster, nodes), narrow to workload (deployment, replicaset, pods), then drill into the container itself (logs, exec, events).

Triage Workflow
1
Check Pod Status
Start by listing pods and looking for non-Running status.
bash
kubectl get pods -n mynamespace
# Look for: Pending, CrashLoopBackOff, ImagePullBackOff, OOMKilled, Error
2
Describe the Pod
Events at the bottom of describe output are the most useful signal.
bash
kubectl describe pod mypod -n mynamespace
# Check: Events section, resource limits, node assignments, image name
3
Read Logs
Current logs, plus the previous container run if it crashed.
bash
kubectl logs mypod
kubectl logs --previous mypod     # crashed container
kubectl logs -f mypod             # follow live
kubectl logs -l app=myapp --all-containers  # all pods by label
4
Exec into the Container
For live inspection when logs aren't enough.
bash
kubectl exec -it mypod -- bash
# or sh if bash not available:
kubectl exec -it mypod -- sh
5
Deploy a Debug Pod
For network troubleshooting — run a temporary pod with networking tools.
bash
# Ephemeral debug pod (deleted on exit)
kubectl run netdbg --image=nicolaka/netshoot -it --rm -- bash

# From inside: test DNS and connectivity
nslookup web-svc.default.svc.cluster.local
curl http://web-svc/healthz
Common Pod Failure States
Status | Cause | Fix
Pending | No node with sufficient resources, or node selector mismatch | describe pod → check Events; check node capacity
ImagePullBackOff | Wrong image name/tag, or missing registry credentials | Check image name; create imagePullSecret
CrashLoopBackOff | Container exits repeatedly | logs --previous; fix app crash or wrong command
OOMKilled | Container exceeded memory limit | Increase resources.limits.memory
Terminating (stuck) | Finalizers blocking deletion, or node unreachable | Remove the finalizer (kubectl patch), or kubectl delete pod --force --grace-period=0
Evicted | Node disk/memory pressure | Free node resources; check eviction thresholds
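To confirm an OOMKill (or any container termination reason) from the CLI, check the last terminated state in the pod status; the pod name is illustrative:
bash
kubectl get pod mypod \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# e.g. OOMKilled

# The same detail appears under "Last State" in describe output
kubectl describe pod mypod | grep -A 3 "Last State"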
Resource Monitoring
bash
# Requires metrics-server installed
kubectl top nodes
kubectl top pods -A --sort-by=memory

# Events across all resources (sorted)
kubectl get events -A --sort-by=.lastTimestamp

# Check API server health
kubectl get componentstatuses  # deprecated but sometimes useful
kubectl get --raw=/healthz
💡
For production observability, deploy the kube-prometheus-stack Helm chart (Prometheus + Grafana + Alertmanager + node-exporter + kube-state-metrics). It gives you cluster, node, and workload dashboards out of the box — familiar territory if you're running Prometheus/Grafana already.
⚡ Final Check
A pod is in CrashLoopBackOff. What's the best first step?
A
Delete and recreate the deployment
B
Drain the node it's running on
C
Run kubectl logs --previous to read the crash output
D
Increase the memory limit