Roadmap
v0.1.0 — First release ✅
- Core rule engine with 23 rules covering crash/runtime, scheduling, image pull, networking, rollout, and RBAC
- Three output formats: text (ANSI), JSON, Markdown
- Standalone binary (triage) and kubectl plugin (kubectl-kubediag)
- kubediag pod, kubediag deployment, kubediag namespace, kubediag cluster commands
- kubediag rules list and kubediag rules explain <id> for rule introspection
- Request-scoped ResourceCache to avoid redundant Kubernetes API calls
- GoReleaser distribution (linux/darwin/windows, amd64/arm64)
- Krew manifest for kubectl plugin installation
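
The request-scoped ResourceCache listed above could look roughly like the following. This is a minimal sketch, not kubediag's actual implementation: the `cacheKey` fields, the `Get` signature, and the `fetch` callback are all assumptions made for illustration, and the fetch function merely stands in for a real Kubernetes API call.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey identifies one object. The field names are hypothetical.
type cacheKey struct {
	Kind, Namespace, Name string
}

// ResourceCache memoizes fetches for the lifetime of one diagnosis run.
// It is created per request and discarded afterwards, so there is no expiry.
type ResourceCache struct {
	mu      sync.Mutex
	entries map[cacheKey]any
}

func NewResourceCache() *ResourceCache {
	return &ResourceCache{entries: make(map[cacheKey]any)}
}

// Get returns the cached object for key and calls fetch only on a miss.
// Errors are not cached, so a later rule can retry the same object.
func (c *ResourceCache) Get(key cacheKey, fetch func() (any, error)) (any, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.entries[key]; ok {
		return v, nil
	}
	v, err := fetch()
	if err != nil {
		return nil, err
	}
	c.entries[key] = v
	return v, nil
}

func main() {
	cache := NewResourceCache()
	calls := 0
	fetch := func() (any, error) {
		calls++ // stands in for a Kubernetes API request
		return "pod/web-0", nil
	}
	key := cacheKey{Kind: "Pod", Namespace: "default", Name: "web-0"}
	// Three rules asking for the same pod trigger a single fetch.
	for i := 0; i < 3; i++ {
		if _, err := cache.Get(key, fetch); err != nil {
			fmt.Println("fetch failed:", err)
			return
		}
	}
	fmt.Println("API calls:", calls)
}
```

Holding the mutex across `fetch` keeps the sketch short but serializes all lookups; a real cache would more likely use per-key locking or a single-flight mechanism.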
v0.2.0 — Rule set expansion ✅
- TRG-POD-EXIT-IMMEDIATE — container exits immediately: exec format error, missing binary, wrong architecture (exit 126/127)
- TRG-SVC-PORT-MISMATCH — service targetPort not exposed by selected pod's containerPorts
- TRG-POD-BAD-ENV-REF — configMapKeyRef/secretKeyRef pointing at a missing key in an existing ConfigMap or Secret
- TRG-CLUSTER-QUOTA-EXHAUSTED — namespace ResourceQuota at ≥95% (High) or 100% (Critical)
- TRG-CLUSTER-APISERVER-LATENCY — Warning events indicating API server / etcd latency
- kubediag report cluster fully implemented (was a stub)
- Markdown reports now include a table of contents with anchor links
- TRG-POD-READINESS-FAILING improvement: samples up to 3 distinct recent event messages per container
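
The TRG-CLUSTER-QUOTA-EXHAUSTED thresholds above (≥95% High, 100% Critical) can be sketched as a small classifier. The `Severity` type, the `quotaSeverity` name, and the treatment of a zero hard limit are assumptions for illustration; the actual rule may compare Kubernetes quantity values differently.

```go
package main

import "fmt"

// Severity is a hypothetical finding level; the empty string means no finding.
type Severity string

const (
	SeverityNone     Severity = ""
	SeverityHigh     Severity = "High"
	SeverityCritical Severity = "Critical"
)

// quotaSeverity classifies usage against a hard limit.
// Integer arithmetic avoids float rounding at the 95% boundary.
// Assumption: a hard limit of zero or less is skipped rather than reported.
// Note: used*100 can overflow for values near the int64 maximum.
func quotaSeverity(used, hard int64) Severity {
	if hard <= 0 {
		return SeverityNone
	}
	switch {
	case used >= hard:
		return SeverityCritical
	case used*100 >= hard*95:
		return SeverityHigh
	default:
		return SeverityNone
	}
}

func main() {
	for _, used := range []int64{94, 95, 100} {
		fmt.Printf("used=%d hard=100 severity=%q\n", used, quotaSeverity(used, 100))
	}
}
```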
v0.3.0 — YAML rule packs (external rules)
- Rule pack format: YAML-defined rules with CEL expressions for field matching
- kubediag rules load ./my-rules.yaml for custom/org-specific rules
- Rule versioning and conflict resolution
- Rule pack repository (community-contributed rules)
v0.4.0 — Interactive mode and watch
- kubediag watch pod <name> — re-run every N seconds, diff findings
- --since flag: filter events newer than a duration
- Interactive selector for kubediag namespace (fzf-style picker)
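
The "diff findings" step of the planned watch mode could be approached as a set difference over rule IDs between two consecutive runs. The sketch below is one possible shape, not a committed design: `diffFindings` is a hypothetical helper, and it assumes a finding is identified by its rule ID alone.

```go
package main

import (
	"fmt"
	"sort"
)

// diffFindings compares the rule IDs reported by two consecutive runs.
// added holds IDs present only in curr; resolved holds IDs present only in prev.
// Both results are sorted so the output is stable between runs.
func diffFindings(prev, curr []string) (added, resolved []string) {
	prevSet := make(map[string]bool, len(prev))
	for _, id := range prev {
		prevSet[id] = true
	}
	currSet := make(map[string]bool, len(curr))
	for _, id := range curr {
		currSet[id] = true
	}
	for id := range currSet {
		if !prevSet[id] {
			added = append(added, id)
		}
	}
	for id := range prevSet {
		if !currSet[id] {
			resolved = append(resolved, id)
		}
	}
	sort.Strings(added)
	sort.Strings(resolved)
	return added, resolved
}

func main() {
	prev := []string{"TRG-POD-READINESS-FAILING", "TRG-SVC-PORT-MISMATCH"}
	curr := []string{"TRG-POD-READINESS-FAILING", "TRG-POD-BAD-ENV-REF"}
	added, resolved := diffFindings(prev, curr)
	fmt.Println("added:", added)
	fmt.Println("resolved:", resolved)
}
```

A real implementation would probably key on rule ID plus the affected object, since one rule can fire for several containers or pods.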
v1.0.0 — Stable API
- Rule ID and finding schema stability guarantee
- pkg/ promotion for embedding kubediag as a library
- Deprecation policy enforcement (old IDs kept as aliases for one minor release)
- Comprehensive e2e test suite against kind clusters
- Homebrew tap and container image
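
The alias-based deprecation policy above could be enforced with a lookup table consulted before rule dispatch. This is only a sketch under assumptions: `resolveRuleID` and the `deprecatedAliases` table are hypothetical, and the old ID `TRG-POD-OLD-NAME` is invented purely for the example.

```go
package main

import "fmt"

// deprecatedAliases maps a retired rule ID to its current ID.
// The entry below is made up for illustration. Under the stated policy,
// entries would be removed one minor release after the rename.
var deprecatedAliases = map[string]string{
	"TRG-POD-OLD-NAME": "TRG-POD-EXIT-IMMEDIATE",
}

// resolveRuleID returns the canonical ID and whether the input was a
// deprecated alias, so callers can print a warning.
func resolveRuleID(id string) (canonical string, deprecated bool) {
	if current, ok := deprecatedAliases[id]; ok {
		return current, true
	}
	return id, false
}

func main() {
	for _, id := range []string{"TRG-POD-OLD-NAME", "TRG-SVC-PORT-MISMATCH"} {
		canonical, deprecated := resolveRuleID(id)
		fmt.Printf("%s -> %s (deprecated alias: %t)\n", id, canonical, deprecated)
	}
}
```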
Out of scope (v1)
- Mutating operations (kubediag is read-only)
- Operator/CRD-specific rules (Crossplane, Istio, cert-manager) — v2 roadmap
- LLM-powered explanation — architecture leaves a pluggable explainer hook
- Log aggregation / indexing
- Persistent state or alerting