Uptime Labs and the Multi-Party Dilemma (Part II)

About this content

Watch on YouTube

In Part II of the Multi-Party Dilemma (MPD) drill retrospective, we reconvene to dig deeper into the implications and nuances of the simulated incident exercise hosted on the Uptime Labs platform. Eric Dobbs (incident analyst), Alex Elman (deputy IC), and Sarah Butt (incident commander) continue their debrief with Courtney, reflecting on how team behavior evolved under stress, the importance of expertise in managing non-technical aspects of an incident like saturation, and how deeply held assumptions often go unspoken until tested under pressure.

This episode emphasizes the complex social and cognitive dimensions of incident response, such as how people coordinate, communicate, and construct shared understanding. It highlights the value of analyzing drills not for failure points, but for what they reveal about real work, adaptation, and human coordination.

Key Highlights

  • Incident Analysis as a Practice:
    • Eric Dobbs emphasized understanding how people make sense of unfolding events, rather than judging decisions in hindsight.
    • The goal is to study why a decision made sense at the time, not whether it was “right” or “wrong.”
  • Drills Expose Hidden Assumptions:
    • Even experienced responders bring unspoken mental models into incidents.
    • The drill revealed assumptions about communication flows, authority boundaries, and vendor interactions that were not made explicit in planning.
  • The Value of Human Expertise:
    • Everyone involved in this incident brought an unparalleled level of expertise to the work.
    • This kind of expertise often goes unnoticed or is taken for granted, yet it is precisely what makes for smoother, better-coordinated, and sometimes faster incident response.
  • Importance of Framing:
    • The way questions are asked in retrospectives can shape what is revealed—e.g., “What made that hard?” is more productive than “What did you miss?”
    • Reframing incidents around constraints and tradeoffs leads to deeper insight.
  • Team Learning and Culture:
    • Safe, high-trust environments enable better learning during drills.
    • Psychological safety allows team members to admit confusion or raise alternate interpretations during real incidents.

Resources and References

  • Episode I
  • Model of Overload/Saturation as part of the Theory of Graceful Extensibility
  • Lorin's Law