Don't Just Fix It, Learn From It - The Importance of Incident Management when Something Breaks
1:55 - 2:55 Great Hall 1 & 2
Panic messages saying the system is having issues. Your phone buzzing as your alert system texts you that the system is down. Intuition kicks in and tells you to solve the issue as fast as possible and get back to your day. While you have solved the issue at hand, you have not set yourself up for future success or kept yourself from doing the same thing next time around. In this session, we will discuss the importance of not just solving the issue at hand but also learning from it and improving your processes. We'll review topics such as documenting outcomes as the incident unfolds, the importance of playbooks, and leading a successful post mortem to make sure this isn't a fix-and-forget situation. We'll go through a mock incident to see how we can incorporate each of these and other processes, ensuring that we learn from our mistakes and prevent similar scenarios from happening in the future. While getting your system usable for your end users should be goal number 1, the very next goal is not falling into a similar state in the future. With this process in place, you will have the tools in your belt to prevent this from happening again.
Rick Clymer
Rick is currently a QA lead for RocketReach, working on developing processes and tools that provide the highest quality product and data for our customers. He started his career in QA 9 years ago with just a hunch of where he would end up. He has found a true passion in test automation and in ensuring the product his customers use has as much quality baked in as possible. His beliefs and processes have shifted over the years, away from owning the entire process and toward advocating for the entire team to bake quality into the product from the beginning.