Kubernetes Best Practices
1. Cluster Architecture
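Design your cluster for high availability: run multiple control plane nodes (formerly called master nodes) and spread worker nodes across zones or regions, so that losing a single failure domain does not take the cluster down.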
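# Example: Creating a multi-zone cluster on GKE
# --region creates a regional cluster whose control plane is replicated across zones;
# --num-nodes is the node count per zone (3 zones x 3 nodes = 9 nodes in total).
gcloud container clusters create prod-cluster \
  --region us-central1 \
  --node-locations us-central1-a,us-central1-b,us-central1-c \
  --num-nodes 3 \
  --enable-ip-alias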
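Spreading nodes across zones only helps if the pods are spread too. The following is a minimal sketch of a Deployment that asks the scheduler to balance replicas across zones with a topology spread constraint. The name `web` and the image reference are placeholders, not values from this guide, and the zone label is assumed to be set on the nodes (managed cloud providers normally set it).

```yaml
# Sketch only: "web" and the image reference are placeholder names.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        # Keep the per-zone replica count within 1 of every other zone.
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          # Prefer an even spread, but still schedule if a zone is unavailable.
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: registry.example.com/web:1.0.0
```

Using `ScheduleAnyway` keeps the workload schedulable during a zone outage; `DoNotSchedule` would enforce the spread strictly at the cost of pending pods.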
2. Security
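Apply the principle of least privilege. Enforce the Pod Security Standards through Pod Security Admission (PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25, so it applies only to older clusters), and use RBAC to restrict what each user and service account can do.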
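# Example: Restrict privileged containers
# Legacy: the PodSecurityPolicy API is only available on clusters older than v1.25.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  privileged: false
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  # supplementalGroups and fsGroup are required fields in a PSP spec.
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny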
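On current clusters, the same intent can be expressed with Pod Security Admission labels on a namespace, combined with a narrowly scoped RBAC Role. The sketch below assumes a `production` namespace; the Role, RoleBinding, and ServiceAccount names (`pod-reader`, `pod-reader-binding`, `ci-reader`) are hypothetical and chosen only for illustration.

```yaml
# Sketch only: the Role, RoleBinding, and ServiceAccount names are hypothetical.
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Reject pods that violate the "restricted" Pod Security Standard.
    pod-security.kubernetes.io/enforce: restricted
    # Also return warnings to clients that submit violating pods.
    pod-security.kubernetes.io/warn: restricted
---
# Read-only access to pods in one namespace, nothing else.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
# Grant the Role to a single service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: ci-reader
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

A namespaced Role is used rather than a ClusterRole so that the grant cannot leak beyond the namespace it was written for.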
3. Networking
Use network policies to isolate workloads. A NetworkPolicy only takes effect when the CNI plugin enforces it, so prefer a plugin that supports policy enforcement, such as Calico.
# Example: Simple deny-all policy with an allow exception
# Deny all ingress and egress for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Exception: allow ingress to frontend pods from backend pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: frontend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
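A deny-all policy that covers egress also blocks DNS lookups, which tends to break workloads in non-obvious ways. The following sketch re-allows DNS traffic only. It assumes the cluster DNS pods run in `kube-system` with the label `k8s-app: kube-dns` (the usual labelling for CoreDNS, but worth verifying on your cluster) and that namespaces carry the automatic `kubernetes.io/metadata.name` label found on recent Kubernetes versions.

```yaml
# Sketch only: assumes DNS pods are labelled k8s-app=kube-dns in kube-system.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        # Both selectors sit in one list item, so they must match together:
        # pods labelled kube-dns that live in the kube-system namespace.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```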
4. Observability
Instrument your workloads with Prometheus metrics and use Loki for logs. Set up alerts for critical SLOs.
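# Example: PrometheusRule to alert on high CPU usage
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cpu-alert
spec:
  groups:
    - name: cpu.rules
      rules:
        - alert: HighCpuUsage
          # The value is in CPU cores, not percent: fires above 0.9 cores per pod.
          # container!="" skips the pod-level cAdvisor series to avoid double counting.
          expr: sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (namespace, pod) > 0.9
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: "CPU usage high for pod {{ $labels.namespace }}/{{ $labels.pod }}"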
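Alert rules need metrics to evaluate. With the Prometheus Operator, which also provides the PrometheusRule kind, scraping is typically declared with a ServiceMonitor. This is a minimal sketch: the `my-app` Service label, the `metrics` port name, and the `release: prometheus` label are assumptions, and the last one must match whatever `serviceMonitorSelector` your Prometheus instance is configured with.

```yaml
# Sketch only: the app label, port name, and release label are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  labels:
    # Must match the serviceMonitorSelector of the Prometheus resource.
    release: prometheus
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    # "metrics" refers to a named port on the selected Service.
    - port: metrics
      path: /metrics
      interval: 30s
```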
5. CI/CD Integration
Use GitOps tools such as Argo CD or Flux so that Git is the single source of truth for cluster state and any drift is reconciled automatically.
# Example: Argo CD Application manifest
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd  # assumes Argo CD is installed in its default namespace
spec:
  project: default
  source:
    repoURL: https://github.com/example/my-app.git
    targetRevision: HEAD
    path: manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true     # delete resources that were removed from Git
      selfHeal: true  # revert manual changes made in the cluster
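For teams that choose Flux instead, a roughly equivalent setup pairs a GitRepository source with a Kustomization that applies it. This sketch reuses the repository URL, path, and target namespace from the Argo CD example; the `main` branch, the intervals, and the `flux-system` namespace (the default for a Flux bootstrap) are assumptions.

```yaml
# Sketch only: the branch name, intervals, and flux-system namespace are assumptions.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/my-app.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: my-app
  path: ./manifests
  # Delete cluster objects that were removed from Git.
  prune: true
  targetNamespace: production
```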
6. Resource Management
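Define resource requests and limits for every container. Requests tell the scheduler how much capacity to reserve on a node, and limits cap what a container can consume, which together prevent a single workload from exhausting a node.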
# Example: Setting requests and limits
resources:
  requests:
    cpu: "250m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"
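Relying on every manifest author to remember these fields is fragile. One way to back the rule up is a LimitRange, which applies default requests and limits to containers that omit them. The sketch below reuses the values from the example above; the object name and the `production` namespace are illustrative assumptions.

```yaml
# Sketch only: the name, namespace, and values are illustrative.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: production
spec:
  limits:
    - type: Container
      # Applied as the request when a container specifies none.
      defaultRequest:
        cpu: "250m"
        memory: "128Mi"
      # Applied as the limit when a container specifies none.
      default:
        cpu: "500m"
        memory: "256Mi"
```

A LimitRange affects only pods created after it exists, so existing workloads keep their current settings until they are recreated.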