Incidents
Post-incident reviews documenting what went wrong, why, and how we fixed it.
| Date | Title | Severity | Duration |
|---|---|---|---|
| 2026-03-30 | Sonarr Outage — iSCSI Hairpin NAT Failure (ContainerCreating) | High | ~45m |
| 2026-03-28 | Radarr Outage — OpenEBS Jiva Replica Divergence (Second Occurrence) | High | ~30h |
| 2026-03-28 | ARC GitHub Actions Runner Pods Stuck Pending — Kubelet Sync Loop Stall | High | ~7h40m |
| 2026-02-22 | Radarr Outage Due to OpenEBS Jiva Replica Divergence | High | ~17h |
| 2026-01-06 | Cascading Kubernetes Cluster Failures | Critical | ~3 days |
- Post Incident Review: Sonarr Outage Due to iSCSI Hairpin NAT Failure on k8s03 — 2026-03-30
- Post Incident Review: Radarr Outage — OpenEBS Jiva Replica Divergence (Second Occurrence) — 2026-03-28
- Post Incident Review: ARC GitHub Actions Runner Pods Stuck Pending — Kubelet Sync Loop Stall and Multi-Node Degradation — 2026-03-28
- Post Incident Review: Radarr Outage Due to OpenEBS Jiva Replica Divergence — 2026-02-22
- Post Incident Review: Cascading Kubernetes Cluster Failures — 2026-01-06