Kubernetes (K8s) is an open-source container orchestration platform originally developed by Google, open-sourced in 2014, and later donated to the CNCF. It automates the deployment, scaling, and management of containerized workloads across a cluster of machines.
Before K8s, running containers at scale meant writing custom scripts to restart crashed containers, balance load, and roll out updates. Kubernetes solves all of that declaratively — you describe what you want, and the control plane makes it happen.
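In practice, "describe what you want" means writing a manifest and applying it; the control plane then works continuously to make the live cluster match it. A minimal sketch of that workflow (app.yaml stands in for any manifest, such as the ones later in this guide):

```bash
kubectl apply -f app.yaml     # declare the desired state
kubectl diff -f app.yaml      # preview drift between the manifest and the live cluster
kubectl delete -f app.yaml    # remove everything the manifest declared
```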
| Concept | Analogy | What it does |
|---|---|---|
| Pod | A shipping container | Smallest deployable unit; wraps one or more containers |
| Deployment | A fleet manager | Manages replicas, rolling updates, rollbacks |
| Service | A DNS name + load balancer | Stable network endpoint in front of pods |
| Namespace | A department | Logical isolation boundary within a cluster |
| Node | A server | Physical/virtual machine that runs pods |
| Cluster | A data center | The full set of control-plane + worker nodes |
Docker builds and runs containers on a single host. Kubernetes orchestrates containers across many hosts. They are complementary: Docker (or containerd/CRI-O) provides the container runtime, and Kubernetes uses it to schedule and manage workloads cluster-wide.
A Kubernetes cluster has two logical planes: the Control Plane (formerly "master"), which makes global decisions about the cluster, and the Worker Nodes, which run your application workloads.
At the center is the kube-apiserver: kubectl commands, controllers, and nodes all communicate through the REST API it exposes, and it validates and persists objects to etcd. On each node, kube-proxy implements Service virtual IPs and load balancing across pod endpoints. Every controller in K8s runs a watch → diff → act loop: watch the current state of the cluster, diff it against the desired state, and act to close the gap.
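You can see that loop in action by deleting a pod that a controller owns and watching it get replaced. A quick sketch, assuming the `web-app` Deployment defined later in this guide (any Deployment-managed pod works):

```bash
# Pick one pod owned by the Deployment and delete it
POD=$(kubectl get pods -l app=web-app -o jsonpath='{.items[0].metadata.name}')
kubectl delete pod "$POD"

# The ReplicaSet controller notices actual replicas != desired replicas
# and immediately creates a replacement
kubectl get pods -l app=web-app -w
```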
You have several options for a local Kubernetes lab. We'll use k3s — a lightweight, CNCF-certified K8s distro ideal for learning. It runs on a single Linux VM or bare metal host.
| Tool | Best For | Overhead | Notes |
|---|---|---|---|
| k3s | Linux VM / bare metal lab | Low (~512MB RAM) | Single binary, full K8s API |
| minikube | Mac/Windows dev laptop | Medium | VM or Docker driver |
| kind | CI pipelines, Docker hosts | Low | Nodes as Docker containers |
| k3d | k3s inside Docker | Low | Multi-node on one host |
```bash
# As root or with sudo
curl -sfL https://get.k3s.io | sh -

# Verify the service is running
systemctl status k3s
```
k3s writes its kubeconfig to /etc/rancher/k3s/k3s.yaml. Copy it to your user's config directory:

```bash
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $(id -u):$(id -g) ~/.kube/config

# Or use k3s's built-in kubectl alias
sudo k3s kubectl get nodes
```
```bash
kubectl get nodes
# NAME       STATUS   ROLES                  AGE   VERSION
# k3s-node   Ready    control-plane,master   1m    v1.30.x

kubectl get pods -n kube-system
```
```bash
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc
```
`k3d cluster create lab --servers 1 --agents 2` spins up a 3-node cluster (1 control-plane + 2 workers) as Docker containers in under 30 seconds.

A Pod is the smallest deployable unit in Kubernetes — a wrapper around one or more containers that share a network namespace (same IP, same localhost) and storage volumes. In practice, most pods run a single container.
Pods are ephemeral: they are created, run, and die. They do not reschedule themselves. That's what Deployments are for.
```bash
# Create a pod imperatively (useful for quick debugging)
kubectl run nginx-pod --image=nginx:alpine --port=80

# Watch it come up
kubectl get pod nginx-pod -w

# Inspect it
kubectl describe pod nginx-pod

# Get shell access
kubectl exec -it nginx-pod -- sh
```
The declarative approach — what you'll use in production:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-nginx
  labels:
    app: nginx
    env: dev
spec:
  containers:
    - name: nginx
      image: nginx:alpine
      ports:
        - containerPort: 80
      resources:
        requests:
          cpu: "100m"      # 0.1 CPU cores
          memory: "64Mi"
        limits:
          cpu: "250m"
          memory: "128Mi"
      readinessProbe:
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /
          port: 80
        failureThreshold: 3
```
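Apply the manifest and watch the readiness probe gate the pod's READY state (the filename is just an example):

```bash
# Save the manifest as my-nginx-pod.yaml, then:
kubectl apply -f my-nginx-pod.yaml

# READY stays 0/1 until the readinessProbe succeeds
kubectl get pod my-nginx -w
```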
A Deployment manages a ReplicaSet, which manages pods. It's the primary way to run stateless applications. Deployments handle rolling updates, rollbacks, and scaling, and with readiness probes plus the rolling-update settings shown below they can roll out new versions with zero downtime.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app          # must match template labels
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # extra pods during update
      maxUnavailable: 0     # zero downtime
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
          resources:
            requests: { cpu: "100m", memory: "64Mi" }
            limits: { cpu: "500m", memory: "256Mi" }
```
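Applying it creates the whole chain: Deployment → ReplicaSet → Pods (the filename is a placeholder):

```bash
kubectl apply -f web-app-deployment.yaml

# One Deployment, one ReplicaSet, three Pods
kubectl get deploy,rs,pods -l app=web-app
```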
```bash
# Trigger a rolling update (new image)
kubectl set image deployment/web-app web=nginx:1.26

# Watch the rollout
kubectl rollout status deployment/web-app

# Inspect revision history
kubectl rollout history deployment/web-app

# Roll back to previous revision
kubectl rollout undo deployment/web-app

# Roll back to specific revision
kubectl rollout undo deployment/web-app --to-revision=2

# Scale manually
kubectl scale deployment/web-app --replicas=5
```
```bash
# Imperative
kubectl autoscale deployment/web-app --cpu-percent=70 --min=2 --max=10
```

```yaml
# Declarative (HPA v2)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app          # a name is required for a valid manifest
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
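The HPA needs resource metrics to act on, which come from metrics-server (k3s ships it by default; other distros may need it installed separately). Check that the autoscaler is seeing metrics:

```bash
kubectl get hpa web-app
# TARGETS should show a live value such as 12%/70%, not <unknown>
```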
Pods are ephemeral — their IPs change on every restart. A Service provides a stable virtual IP (ClusterIP) and DNS name that load-balances traffic to the pods matching its selector.
The DNS name takes the form `svc.namespace.svc.cluster.local`.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web-app       # matches Deployment pod labels
  ports:
    - port: 80         # Service port (ClusterIP)
      targetPort: 80   # container port
      protocol: TCP
  type: ClusterIP
```
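To verify the Service from inside the cluster, a throwaway pod with curl works well (a quick sketch; the image is just a convenient choice):

```bash
kubectl run curl-test --rm -it --image=curlimages/curl --restart=Never -- \
  curl -s http://web-svc.default.svc.cluster.local
```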
An Ingress routes HTTP/HTTPS traffic by hostname or path to backend services — like a reverse proxy. You need an Ingress Controller installed (nginx-ingress, Traefik, etc.) for Ingress resources to work.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc
                port: { number: 80 }
```
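Note that k3s ships Traefik as its bundled ingress controller, so on the lab cluster you would either install ingress-nginx or change `ingressClassName` (typically to `traefik`). Without public DNS you can still test the routing by overriding the Host header (sketch; `<node-ip>` is a placeholder for any node's address):

```bash
curl -H "Host: app.example.com" http://<node-ip>/
```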
Every Service is resolvable at `<svc>.<ns>.svc.cluster.local`. Within the same namespace, just `<svc>` works. This is served by CoreDNS running in kube-system.

Kubernetes separates configuration from container images. ConfigMaps store non-sensitive configuration (env vars, config files). Secrets store sensitive data (passwords, tokens, TLS certs) — base64-encoded at rest by default, though you should enable encryption at rest in production.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_ENV: "production"
  LOG_LEVEL: "info"
  app.conf: |
    server.port=8080
    db.pool.size=10
```
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  # values must be base64-encoded: echo -n 'mypassword' | base64
  DB_PASSWORD: bXlwYXNzd29yZA==
  DB_USER: YWRtaW4=
stringData:
  # alternative: plain text (K8s encodes it)
  API_KEY: "my-api-key-plain"
```
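Both can also be created imperatively, which is handy for quick tests (the literal values are examples; kubectl does the base64 encoding for you):

```bash
kubectl create configmap app-config \
  --from-literal=APP_ENV=production --from-literal=LOG_LEVEL=info

kubectl create secret generic db-secret \
  --from-literal=DB_USER=admin --from-literal=DB_PASSWORD=mypassword
```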
```yaml
# Consuming ConfigMap and Secret values in a pod spec (fragment)
spec:
  containers:
    - name: app
      image: myapp:v1
      env:
        - name: LOG_LEVEL              # single key from ConfigMap
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: LOG_LEVEL
        - name: DB_PASSWORD            # single key from Secret
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: DB_PASSWORD
      envFrom:                         # ALL keys as env vars
        - configMapRef:
            name: app-config
      volumeMounts:
        - name: config-vol
          mountPath: /etc/app          # mount as files
  volumes:
    - name: config-vol
      configMap:
        name: app-config
```
Container filesystems are ephemeral. Volumes persist data. The storage workflow in K8s has three layers: StorageClass (how to provision), PersistentVolume (the actual storage), and PersistentVolumeClaim (a pod's request for storage).
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce            # RWO: one node; RWX: many nodes (NFS)
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi
---
# Mount in a pod
spec:
  containers:
    - name: db
      image: postgres:16
      volumeMounts:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: pgdata
      persistentVolumeClaim:
        claimName: data-pvc
```
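Check which StorageClasses your cluster offers and whether the claim binds (on k3s the default class is `local-path` rather than `standard`, so adjust `storageClassName` to match your cluster):

```bash
kubectl get storageclass
kubectl get pvc data-pvc
# STATUS should move from Pending to Bound once a volume is provisioned
```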
| Mode | Short | Meaning |
|---|---|---|
| ReadWriteOnce | RWO | One node can mount read/write. Most block storage (EBS, local disk). |
| ReadOnlyMany | ROX | Many nodes can mount read-only. |
| ReadWriteMany | RWX | Many nodes can mount read/write. Requires shared storage: NFS, CephFS, Azure Files. |
| ReadWriteOncePod | RWOP | Only one pod cluster-wide. Strictest isolation (K8s 1.22+). |
Role-Based Access Control (RBAC) governs who can do what to which resources. It is the primary authorization mechanism in K8s and should be enabled on every cluster.
| Object | Scope | Purpose |
|---|---|---|
| Role | Namespace | Grants permissions within one namespace |
| ClusterRole | Cluster | Grants permissions cluster-wide or for non-namespaced resources (nodes, PVs) |
| RoleBinding | Namespace | Binds a Role (or ClusterRole) to a user/group/serviceaccount in a namespace |
| ClusterRoleBinding | Cluster | Binds a ClusterRole to a subject cluster-wide |
```yaml
# Role: read-only access to pods in 'dev' namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
  - apiGroups: [""]                    # "" = core API group
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
  - kind: User
    name: alice
    apiGroup: rbac.authorization.k8s.io
  - kind: ServiceAccount
    name: ci-runner
    namespace: dev
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```
```bash
# Can alice delete pods in dev?
kubectl auth can-i delete pods -n dev --as=alice                                # no

# Can the ci-runner serviceaccount list pods?
kubectl auth can-i list pods -n dev --as=system:serviceaccount:dev:ci-runner   # yes

# What can I do? (verbose)
kubectl auth can-i --list -n dev
```
When something breaks in K8s, there's a logical triage path. Start broad (cluster, nodes), narrow to workload (deployment, replicaset, pods), then drill into the container itself (logs, exec, events).
```bash
kubectl get pods -n mynamespace
# Look for: Pending, CrashLoopBackOff, ImagePullBackOff, OOMKilled, Error
```
The Events at the bottom of the describe output are the most useful signal.

```bash
kubectl describe pod mypod -n mynamespace
# Check: Events section, resource limits, node assignments, image name
```
```bash
kubectl logs mypod
kubectl logs --previous mypod                 # crashed container
kubectl logs -f mypod                         # follow live
kubectl logs -l app=myapp --all-containers    # all pods by label
```
```bash
kubectl exec -it mypod -- bash
# or sh if bash is not available:
kubectl exec -it mypod -- sh
```
```bash
# Ephemeral debug pod (deleted on exit)
kubectl run netdbg --image=nicolaka/netshoot -it --rm -- bash

# From inside: test DNS and connectivity
nslookup web-svc.default.svc.cluster.local
curl http://web-svc/healthz
```
| Status | Cause | Fix |
|---|---|---|
| Pending | No node with sufficient resources, or node selector mismatch | describe pod → check Events; check node capacity |
| ImagePullBackOff | Wrong image name/tag, or missing registry credentials | Check image name; create imagePullSecret |
| CrashLoopBackOff | Container exits repeatedly | logs --previous; fix app crash or wrong command |
| OOMKilled | Container exceeded memory limit | Increase resources.limits.memory |
| Terminating (stuck) | Finalizers blocking deletion | kubectl delete pod --force --grace-period=0 |
| Evicted | Node disk/memory pressure | Free node resources; check eviction thresholds |
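For the ImagePullBackOff case, a registry pull secret looks roughly like this (a sketch with placeholder registry name and credentials; reference it from the pod spec via `imagePullSecrets`):

```bash
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=myuser \
  --docker-password=mypassword

# Then reference it in the pod/deployment spec:
#   spec:
#     imagePullSecrets:
#       - name: regcred
```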
```bash
# Requires metrics-server installed
kubectl top nodes
kubectl top pods -A --sort-by=memory

# Events across all resources (sorted)
kubectl get events -A --sort-by=.lastTimestamp

# Check API server health
kubectl get componentstatuses    # deprecated but sometimes useful
kubectl get --raw=/healthz
```