Tags
arc-runners
argocd
awx
buildkit
calico
- Calico CNI Unauthorized — Stale or Expired calico-kubeconfig Token
- Post Incident Review: AWX Automation Pod Stuck Pending — Calico RBAC Gap + dqlite Write Storm
- Post Incident Review: dqlite Snapshot Bloat → kube-apiserver Instability → Controller Crash-Loop Cascade and Watch Stream Failure
- Post Incident Review: k8s01 Calico CNI Unauthorized — Stale Pod-Bound Token After Calico Upgrade
- Post Incident Review: k8s03 Extended Recovery — kine Watch Corruption, VXLAN Route Corruption, and Kubelet Watch Stream Stall
- Post Incident Review: k8s03 PLEG Deadlock — Stale Calico IPAM Blocks + Generic PLEG Serial-Poll Vulnerability
- k8s Upgrade Post-Upgrade Validation
cascade
cgroup-v2
cni
- Calico CNI Unauthorized — Stale or Expired calico-kubeconfig Token
- Post Incident Review: k8s03 PLEG Deadlock — Stale Calico IPAM Blocks + Generic PLEG Serial-Poll Vulnerability
containerd
- Post Incident Review: Radarr Outage — OpenEBS Jiva Replica Divergence (Second Occurrence)
- Post Incident Review: k8s01 Calico CNI Unauthorized — Stale Pod-Bound Token After Calico Upgrade
- Post Incident Review: k8s03 PLEG Deadlock — Stale Calico IPAM Blocks + Generic PLEG Serial-Poll Vulnerability
- Post Incident Review: microk8s 1.34 → 1.35 Rolling Upgrade — cgroup v2, containerd Shim, Disk Pressure, and Kubelet Stall
crash-loop
disk-pressure
- Post Incident Review: ARC GitHub Actions Runner Pods Stuck Pending — Kubelet Sync Loop Stall and Multi-Node Degradation
- Post Incident Review: Cascading Kubernetes Cluster Failures
- Post Incident Review: microk8s 1.34 → 1.35 Rolling Upgrade — cgroup v2, containerd Shim, Disk Pressure, and Kubelet Stall
- Post Incident Review: pvek8s Complete Cluster Outage — dqlite Quorum Loss and Ansible-Injected Invalid Flags
dqlite
- Post Incident Review: AWX Automation Pod Stuck Pending — Calico RBAC Gap + dqlite Write Storm
- Post Incident Review: Cascading Kubernetes Cluster Failures
- Post Incident Review: dqlite Snapshot Bloat → kube-apiserver Instability → Controller Crash-Loop Cascade and Watch Stream Failure
- Post Incident Review: k8s03 Extended Recovery — kine Watch Corruption, VXLAN Route Corruption, and Kubelet Watch Stream Stall
- Post Incident Review: k8s03 PLEG Deadlock — Stale Calico IPAM Blocks + Generic PLEG Serial-Poll Vulnerability
- Post Incident Review: microk8s 1.34 → 1.35 Rolling Upgrade — cgroup v2, containerd Shim, Disk Pressure, and Kubelet Stall
- Post Incident Review: pvek8s Complete Cluster Outage — dqlite Quorum Loss and Ansible-Injected Invalid Flags
- dqlite Write Contention
endpointslice
generic-pleg
hairpin-nat
ingress-nginx
ipam
iscsi
- Post Incident Review: Radarr Outage Due to OpenEBS Jiva Replica Divergence
- Post Incident Review: Radarr Outage — OpenEBS Jiva Replica Divergence (Second Occurrence)
- Post Incident Review: Sonarr Outage Due to iSCSI Hairpin NAT Failure on k8s03
k8s01
- Post Incident Review: Cascading Kubernetes Cluster Failures
- Post Incident Review: k8s01 Calico CNI Unauthorized — Stale Pod-Bound Token After Calico Upgrade
k8s02
k8s03
- Post Incident Review: Cascading Kubernetes Cluster Failures
- Post Incident Review: Sonarr Outage Due to iSCSI Hairpin NAT Failure on k8s03
- Post Incident Review: k8s03 Extended Recovery — kine Watch Corruption, VXLAN Route Corruption, and Kubelet Watch Stream Stall
- Post Incident Review: k8s03 PLEG Deadlock — Stale Calico IPAM Blocks + Generic PLEG Serial-Poll Vulnerability
- Post Incident Review: microk8s 1.34 → 1.35 Rolling Upgrade — cgroup v2, containerd Shim, Disk Pressure, and Kubelet Stall
kine
- Post Incident Review: k8s03 Extended Recovery — kine Watch Corruption, VXLAN Route Corruption, and Kubelet Watch Stream Stall
- Post Incident Review: k8s03 PLEG Deadlock — Stale Calico IPAM Blocks + Generic PLEG Serial-Poll Vulnerability
kube-apiserver
kubelet
- Post Incident Review: ARC GitHub Actions Runner Pods Stuck Pending — Kubelet Sync Loop Stall and Multi-Node Degradation
- Post Incident Review: Cascading Kubernetes Cluster Failures
- Post Incident Review: k8s01 Calico CNI Unauthorized — Stale Pod-Bound Token After Calico Upgrade
- Post Incident Review: k8s03 Extended Recovery — kine Watch Corruption, VXLAN Route Corruption, and Kubelet Watch Stream Stall
- Post Incident Review: microk8s 1.34 → 1.35 Rolling Upgrade — cgroup v2, containerd Shim, Disk Pressure, and Kubelet Stall
microk8s
- Jiva CSI Mount Proliferation
- Kubelet Silent Stall — Node Ready, Pods Never Schedule
- dqlite Write Contention
- k8s Upgrade Post-Upgrade Validation
network
networking
- Calico CNI Unauthorized — Stale or Expired calico-kubeconfig Token
- Post Incident Review: k8s01 Calico CNI Unauthorized — Stale Pod-Bound Token After Calico Upgrade
openebs
- Jiva CSI Mount Proliferation
- Post Incident Review: Cascading Kubernetes Cluster Failures
- Post Incident Review: Radarr Outage Due to OpenEBS Jiva Replica Divergence
- Post Incident Review: Radarr Outage — OpenEBS Jiva Replica Divergence (Second Occurrence)
- Post Incident Review: k8s01 Calico CNI Unauthorized — Stale Pod-Bound Token After Calico Upgrade
outage
pleg
pleg-stall
quorum-loss
radarr
- Post Incident Review: Radarr Outage Due to OpenEBS Jiva Replica Divergence
- Post Incident Review: Radarr Outage — OpenEBS Jiva Replica Divergence (Second Occurrence)
rbac
replica-divergence
- Post Incident Review: Radarr Outage Due to OpenEBS Jiva Replica Divergence
- Post Incident Review: Radarr Outage — OpenEBS Jiva Replica Divergence (Second Occurrence)
runbook
- Calico CNI Unauthorized — Stale or Expired calico-kubeconfig Token
- Jiva CSI Mount Proliferation
- Kubelet Silent Stall — Node Ready, Pods Never Schedule
- Runbooks
- dqlite Write Contention
- k8s Upgrade Post-Upgrade Validation
snapshot-bloat
sonarr
storage
- Jiva CSI Mount Proliferation
- Post Incident Review: Radarr Outage Due to OpenEBS Jiva Replica Divergence
- Post Incident Review: k8s01 Calico CNI Unauthorized — Stale Pod-Bound Token After Calico Upgrade
upgrade
- Post Incident Review: microk8s 1.34 → 1.35 Rolling Upgrade — cgroup v2, containerd Shim, Disk Pressure, and Kubelet Stall
- k8s Upgrade Post-Upgrade Validation
vxlan
watch-stream
- Post Incident Review: dqlite Snapshot Bloat → kube-apiserver Instability → Controller Crash-Loop Cascade and Watch Stream Failure
- Post Incident Review: k8s03 Extended Recovery — kine Watch Corruption, VXLAN Route Corruption, and Kubelet Watch Stream Stall