Quickstart
First run
Learn the incident loop, not just the command syntax.
The fastest way to understand kubediag is to run it against one broken workload, inspect the top-ranked finding, and follow the next commands it suggests. That loop is the product: symptom to evidence to action without manual archaeology.
Run against one pod first
A single-pod diagnosis makes the ranking model and evidence format obvious before you move into deployment or namespace scope.
Use severity and confidence as the entrypoint
The first finding should tell you what kubediag believes is driving the incident and how certain it is.
Paste the suggested next commands
kubediag shortens the gap between diagnosis and confirmation by embedding the most useful kubectl follow-ups directly in the output.
Pod diagnosis
Start with the smallest useful scope.
For a broken workload, the pod view is the quickest way to see how kubediag structures a diagnosis.
kubediag pod my-pod -n default
If the problem is rollout- or service-shaped rather than pod-shaped, move up to deployment or namespace scope after this first run.
▶ Pod default/my-api-7f9b-xk2m2
  Phase: Running   Ready: 0/1
ⓧ CRITICAL [high confidence] TRG-POD-CRASHLOOPBACKOFF
  Container "api" is in CrashLoopBackOff (5 restarts in the last 3m)
  Evidence:
    • pod.status.containerStatuses[0].lastState.terminated.reason = "Error"
    • pod.status.containerStatuses[0].lastState.terminated.exitCode = 1
    • Event (Warning, BackOff, 30s ago): "Back-off restarting failed container"
  Next commands:
    $ kubectl logs -n default my-api-7f9b-xk2m2 -c api --previous
    $ kubectl describe pod -n default my-api-7f9b-xk2m2
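If you want to script the "paste the suggested next commands" step, the follow-ups can be pulled out of the terminal output. The sketch below assumes the text layout matches the sample output above (a "Next commands:" line followed by lines that start with "$ "); that layout is inferred from the example, not from a documented contract, and the helper name extract_next_commands is hypothetical.

```python
from typing import List

# Sample taken from the pod diagnosis output shown in this quickstart.
SAMPLE_OUTPUT = """\
Next commands:
  $ kubectl logs -n default my-api-7f9b-xk2m2 -c api --previous
  $ kubectl describe pod -n default my-api-7f9b-xk2m2
"""


def extract_next_commands(output: str) -> List[str]:
    """Return the commands listed under 'Next commands:' in text output.

    Assumption: each suggested command sits on its own line and is
    prefixed with '$ ', as in the sample output.
    """
    commands: List[str] = []
    in_section = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line == "Next commands:":
            in_section = True
            continue
        if not in_section:
            continue
        if line.startswith("$ "):
            # Drop the prompt marker and keep the command itself.
            commands.append(line[2:].strip())
        elif line:
            # Any other non-empty line ends the section.
            in_section = False
    return commands
```

Review the extracted commands before running them; for anything beyond quick scripting, the JSON renderer is likely a sturdier input than scraped terminal text.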
Expand scope when the incident shape changes
Deployment scope
Use deployment diagnosis when the failure is about rollout progress.
This view combines deployment-level findings and the pod-level signals underneath them.
kubediag deployment web -n prod
Namespace and cluster scope
Use wider scopes for fleet-wide health signals.
Namespace and cluster modes surface warning events, service issues, and node pressure patterns that are easy to miss when starting from one pod.
kubediag namespace prod
kubediag cluster
Switch renderer based on audience
Machine-readable
Choose the renderer that matches the next consumer.
Use JSON for automation, markdown for reports, and terminal text for live incident response.
kubediag pod my-pod -o json
kubediag namespace prod -o markdown
kubediag report namespace prod > triage-report.md
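For automation, a consumer of the JSON renderer usually wants the single highest-ranked finding. The sketch below shows one way to pick it. The JSON schema is not documented in this quickstart, so every field name here (findings, severity, confidence, rule_id) is an assumption, as are the severity and confidence labels other than "critical" and "high"; check them against real output from kubediag pod my-pod -o json before relying on this.

```python
import json
from typing import Optional

# Hypothetical label orderings; lower rank means more urgent or more certain.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}
CONFIDENCE_RANK = {"high": 0, "medium": 1, "low": 2}

# Hypothetical report shape, for illustration only.
SAMPLE_REPORT = json.dumps({
    "findings": [
        {"rule_id": "EXAMPLE-HYPOTHETICAL-RULE", "severity": "warning", "confidence": "high"},
        {"rule_id": "TRG-POD-CRASHLOOPBACKOFF", "severity": "critical", "confidence": "high"},
    ]
})


def top_finding(report_json: str) -> Optional[dict]:
    """Return the finding with the highest severity, then highest confidence.

    Returns None when the report contains no findings.
    """
    report = json.loads(report_json)
    findings = report.get("findings", [])
    if not findings:
        return None

    def rank(finding: dict) -> tuple:
        severity = str(finding.get("severity", "")).lower()
        confidence = str(finding.get("confidence", "")).lower()
        # Unknown labels sort last instead of raising.
        return (SEVERITY_RANK.get(severity, 99), CONFIDENCE_RANK.get(confidence, 99))

    return min(findings, key=rank)
```

If kubediag already emits findings in ranked order, taking the first element is enough; sorting explicitly only guards against that assumption being wrong.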
Inspect and explain rules
Rule introspection
Use stable rule IDs as a reference surface.
Rule IDs are public identifiers, which makes them useful in alerts, runbooks, and postmortems.
kubediag rules list
kubediag rules explain TRG-POD-CRASHLOOPBACKOFF
Keep the feedback loop tight
- Run triage on the narrowest scope that still contains the incident.
- Read the highest-ranked finding and the evidence it cites.
- Paste the suggested next commands to confirm or falsify the diagnosis.
- Apply the fix and rerun the same command to see whether the finding clears.
If you want a guided browser version of that flow, use the interactive sandbox.
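The last step of the loop above (apply the fix, rerun, check whether the finding clears) can be scripted. This is a minimal sketch, assuming the rule ID appears in the terminal output while the finding is active, as it does in the sample pod diagnosis. The helper names wait_until_cleared and run_pod_diagnosis are hypothetical; the command itself is the pod invocation from this quickstart.

```python
import subprocess
import time
from typing import Callable


def run_pod_diagnosis(pod: str, namespace: str) -> str:
    """Run `kubediag pod <pod> -n <namespace>` and return its text output."""
    result = subprocess.run(
        ["kubediag", "pod", pod, "-n", namespace],
        capture_output=True,
        text=True,
        check=False,  # inspect the output even if the exit code is non-zero
    )
    return result.stdout


def wait_until_cleared(
    run: Callable[[], str],
    rule_id: str,
    attempts: int = 5,
    delay_s: float = 10.0,
) -> bool:
    """Rerun a diagnosis until rule_id no longer appears in its output.

    Returns True as soon as the rule ID is absent, or False if it is
    still present after the given number of attempts.
    """
    for attempt in range(attempts):
        if rule_id not in run():
            return True
        if attempt < attempts - 1:
            # Give the workload time to settle before the next check.
            time.sleep(delay_s)
    return False
```

A call might look like wait_until_cleared(lambda: run_pod_diagnosis("my-pod", "default"), "TRG-POD-CRASHLOOPBACKOFF"). Passing the runner as a callable keeps the loop easy to test without a cluster.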