In the summer of 1940, Winston Churchill descended into a basement beneath Whitehall and declared it a war room. Maps covered the walls. Officers shouted updates. Decisions happened fast. Lives hung in the balance.

In Q3 of last year, your SRE team descended into a Google Meet and declared it a war room. Dashboards covered the screen. Engineers shouted updates. A decision was made to add more people to the call. Nothing hung in the balance except the career of whoever last touched the config.

One of these is a war room. The other is a very expensive group therapy session.

A Brief, Accurate History of the War Room

The term "war room" entered corporate vocabulary sometime in the 1970s, when American executives, fresh off a decade of watching NASA pull off impossible things with men in short-sleeved shirts, decided their quarterly planning sessions deserved more gravitas. If NASA could put men on the moon from a room full of stressed people at consoles, surely McKinsey could handle a product launch the same way.

By the 1990s, the war room had migrated to politics (James Carville's famous Clinton campaign nerve center), then to product launches, then to tech. Silicon Valley adopted the aesthetic with particular enthusiasm. Nothing communicates urgency like gathering all your most expensive people in one room and giving it a military name.

Then came SRE. And the war room found its true calling.

Honest definition: A war room is a place where the people most likely to fix the incident are joined by the people most likely to prevent them from fixing the incident, all in real time.

The Anatomy of a Modern SRE War Room

The contemporary incident response war room has evolved into a precise ritual with recognizable phases. SRE practitioners who have survived more than three P0s will recognize each one:

Phase 1: The Summons (Minutes 0–5)

PagerDuty fires. The on-call engineer, who was in the middle of either a meal, a shower, or REM sleep (always one of these three), acknowledges the alert. Within minutes, a Slack message announces a war room. Fourteen people join the Zoom before anyone knows what's wrong.

Phase 2: The Status Theater (Minutes 5–20)

Someone shares their screen. Everyone watches them navigate dashboards. Four people ask "can you make that bigger?" The on-call engineer narrates what they're seeing, like a nature documentarian describing a wildebeest migration. Hypotheses are offered freely. None are tested.

Phase 3: The Management Arrival (Minutes 15–25)

A director joins. Then a VP. The energy in the call shifts. The on-call engineer, who was previously troubleshooting with focus and efficiency, is now being asked to provide a status update every 90 seconds. The incident, which may have been self-resolving, is now definitely not self-resolving.

more people on the call than needed
73% of comments are "what's the ETA?"
1 person actually fixing anything

Phase 4: The Blame Archaeology (Minutes 25–45)

Someone mentions a recent deploy. The git history is pulled up. The person who made that deploy is now the subject of nineteen stares (or their name appearing in the chat). They are invited to explain themselves. This is not technically part of incident response. It is happening anyway.

Phase 5: Resolution and Mythology

The incident resolves — sometimes because of the war room, more often despite it. Within minutes, the narrative begins to congeal. In the retelling, the war room will have been crucial. The fifteen observers will remember contributing meaningfully. The one engineer who fixed it will say "it was a team effort" and mean something complicated by that.

Why War Room Culture Persists in SRE

Here's the thing nobody says in the postmortem: war rooms feel good. They feel like doing something. In a domain defined by uncertainty and invisible systems, gathering humans in a room (or call) creates the sensation of control. The map on the wall — now a Grafana dashboard on a shared screen — implies mastery. The assembled faces imply capability.

The incident response war room also serves an important organizational function that has nothing to do with resolving incidents: it creates witnesses. When something goes wrong in a distributed system, there is always a question of who knew what and when. The war room answers this question with a 45-person attendance record.

This is not cynicism. This is archaeology. Understanding why war room culture evolved the way it did helps SRE teams design incident response processes that serve both functions — actual resolution and organizational accountability — without letting the second one destroy the first.

# war_room_participants.sh
echo "Engineers who can fix this: $(pagerduty get oncall | wc -l)"
echo "People currently on the Zoom: 47"
# ratio: 46:1 (observers to fixers)
# historical note: Churchill's actual war room had a better ratio

What Good Incident Response War Room Culture Actually Looks Like

The organizations that handle incidents well have figured out something counterintuitive: the war room should be small. Ruthlessly, deliberately small. The incident commander has one job — remove obstacles from the person fixing the thing. Everyone else has one job — stay off the call unless asked.

Good war room culture in SRE means separating the technical channel (where the fix happens) from the stakeholder channel (where people get updates). This isn't a new idea. Churchill did it. The people in the basement were not interrupted by MPs demanding status updates every two minutes. They had a separate briefing process.
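For the skeptics who need to see it in a terminal, here is a minimal sketch of that split, written as a routing rule. Everything in it is an assumption made up for illustration: the channel names, the role labels, and the `route_for_role` helper are hypothetical, and no real chat or paging tool is being called. It only shows the shape of the rule: people doing the fix go to one place, everyone else goes to the other.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide which channel a person belongs in during an incident.
# Channel names and role labels below are illustrative assumptions, not a real tool's API.

TECH_CHANNEL="#inc-technical"      # where the fix happens
STAKEHOLDER_CHANNEL="#inc-updates" # where everyone else reads status updates

# route_for_role ROLE
# Prints the channel for the given role. Only the people doing the fix
# (and the incident commander clearing obstacles for them) get the technical channel.
route_for_role() {
  case "$1" in
    responder|incident_commander) echo "$TECH_CHANNEL" ;;
    *) echo "$STAKEHOLDER_CHANNEL" ;;
  esac
}

# Demonstration: show where each hypothetical role would be sent.
for role in responder incident_commander director vp; do
  printf '%s -> %s\n' "$role" "$(route_for_role "$role")"
done
```

The default branch is the design choice that matters: anyone not explicitly named as a fixer lands in the stakeholder channel, so the director and the VP get their updates without joining the call.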

The companies still conflating these two channels are optimizing for the appearance of incident response, not incident response itself. They've built elaborate processes around a misunderstanding of what a war room is actually for.

And the metric they're optimizing for — whether they know it or not — isn't MTTR. It's MTTB. Mean Time to Blame. Because in the absence of good incident response systems, the war room's primary output is a name.


If this resonates

Ciroos Actually Solves This

AI SRE teammates that handle incident response before it becomes a war room. Proactive reliability, not reactive blame theater. Your engineers stay asleep.
