
Ankit Mehta
Founder, Vigiles
Ankit Mehta has spent more than twenty years building and running cloud infrastructure, working across DevOps and enterprise architecture for systems that could not afford to go down quietly. He founded Vigiles to build the incident management platform he wanted as an engineer, one that covers the whole lifecycle from the first failed check to the final postmortem.
He is an AWS Community Builder and an Alibaba Cloud MVP, and he is FinOps certified. In 2026 he spoke at AWS Summit Singapore on agentic incident response, covering how AI agents can search, read, and act during an incident.
He writes here about monitoring, incident response, on-call, and the day-to-day reality of keeping production reliable. The posts come from doing the work rather than from a content calendar, which is why most of them start with something that actually went wrong.
Posts by Ankit Mehta
- The fishbone diagram for incident root cause analysis (June 16, 2026)
- After an incident, MTTR is the wrong thing to brag about (June 14, 2026)
- Alert fatigue starts with the alert you should not have sent (June 12, 2026)
- How to communicate during an outage without making it worse (June 10, 2026)
- The restart reflex that wrecks a security investigation (June 8, 2026)
- A security incident is not just another outage (June 8, 2026)
- Your first on-call shift, and how to get through it (June 6, 2026)
- Your incident response has a bus factor of one (June 4, 2026)
- The engineer who knows everything is a risk, not a hero (June 4, 2026)
- What a status page is actually for (May 29, 2026)
- Runbooks people actually use (May 25, 2026)
- Severity levels stop working when everything is a P1 (May 23, 2026)
- The on-call handoff is where incidents get dropped (May 21, 2026)
- Blameless does not mean consequence-free (May 17, 2026)
- The incident report I never wanted to write again (February 18, 2026)