I get a lot of questions from new service teams about what they should do to prevent downtime, but very few people ask for advice on how to handle an incident. This is a bit like asking a boxer for the best way to avoid getting in the ring. It’s not a question of “if” you’re going to be in the ring but “when”. There’s an old saying: the more you bleed in the gym, the less you bleed in the ring. That definitely applies to incident management as well.
Having sat in on more war rooms than I’d like to remember, I thought it might be handy to write down some of the things that my team has found useful over the years. I think every service organization should have a standard approach towards three specific activities:
1. Tips for Handling Service Incidents (just one service)
2. Tips for Handling Service Outages (multiple services affected)
3. Tips for Handling System Maintenance
I hope these posts help you handle incidents, outages, and maintenance. Success here is mostly about being prepared, staying calm, communicating well, and practice, practice, practice. If you think your service is bullet-proof and you won’t need the practice, you’re wrong :-)