Running an online service can be challenging, especially with larger teams. Over the last decade, I have had the pleasure of working with amazing people on large-scale online services and learning what it takes to operate them successfully.
For many, operating services means firefighting. But with the following simple but powerful practices, it is possible to move from a reactive to a proactive approach.
- Test software before it goes into production
- Understand the differences between the old configuration and the new configuration
- Have a working rollback plan in place before starting
- Know at least two verification methods to confirm a successful deployment
- If you make multiple changes, make them one at a time, step by step
- Make changes during low-usage periods
- Automate as many steps as possible
These simple points make it possible to achieve high uptime and reduce firefighting.
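To make a few of these points concrete, here is a minimal Python sketch of how they could fit together: comparing the old and new configuration, applying changes one at a time, verifying each step with at least two checks, and rolling back on failure. The names `config_diff`, `Change`, and `deploy` are hypothetical and only serve the illustration; a real deployment would plug in its own apply, rollback, and verification steps.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, List, Sequence


def config_diff(old: str, new: str) -> List[str]:
    """Return a unified diff so the change can be reviewed before it is applied."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


@dataclass
class Change:
    """One deployment step together with its rollback (hypothetical structure)."""

    name: str
    apply: Callable[[], None]
    rollback: Callable[[], None]


def deploy(
    changes: Sequence[Change],
    checks: Sequence[Callable[[], bool]],
) -> List[str]:
    """Apply changes one at a time and verify after each step.

    If any check fails, only the change that was just applied is rolled
    back and the deployment stops. Returns the names of the changes that
    remain applied.
    """
    if len(checks) < 2:
        # Assumption taken from the list above: at least two
        # independent verification methods are required.
        raise ValueError("provide at least two verification checks")

    applied: List[str] = []
    for change in changes:
        change.apply()
        if all(check() for check in checks):
            applied.append(change.name)
        else:
            change.rollback()
            break
    return applied
```

Because every change is verified on its own, a failed check points directly at the change that caused it, and only that change has to be rolled back. Automating the loop also removes manual steps, which is the last point on the list.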
Do you have questions about this post? Would you like me to write more about a similar topic? Do you think something is missing in this post? Leave a comment or email me at Blog@Steven-Geller.com