
Our Journey to Zero-Downtime Deploys

Rolling deploys, health checks that actually check health, and the database migration rule that saved us.

Raj Patel, Tom Becker · April 27, 2026 · 1 min read


"Zero downtime" is mostly about ordering. Ship code that tolerates both the old and new database shape, then migrate.

The expand/contract rule

Expand: add the new column as nullable, then deploy code that writes to both columns.
Migrate: backfill the new column for rows written before that deploy.
Contract: once nothing reads the old column, stop writing it, then drop it in a later deploy.

// During expand, read prefers new but falls back to old
const name = row.full_name ?? row.name;
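
To make the three phases concrete, here is a minimal in-memory sketch of the dual write, the backfill, and the fallback read. The `UserRow` shape and the helper names `writeName`, `backfill`, and `readName` are hypothetical, chosen to match the `name` and `full_name` columns in the snippet above. A real backfill would run against the database in batches rather than over an array.

```typescript
// Hypothetical row shape during the expand phase:
// `name` is the old column, `full_name` is the new nullable column.
interface UserRow {
  id: number;
  name: string | null;
  full_name: string | null;
}

// Expand: every write populates both columns, so instances running
// either the old or the new code can read the row.
function writeName(row: UserRow, value: string): UserRow {
  return { ...row, name: value, full_name: value };
}

// Migrate: copy the old value into rows written before the dual-write deploy.
// Rows that already have full_name are left alone, so re-running is safe.
function backfill(rows: UserRow[]): UserRow[] {
  return rows.map((row) =>
    row.full_name === null ? { ...row, full_name: row.name } : row
  );
}

// Read path from the post: prefer the new column, fall back to the old one.
function readName(row: UserRow): string | null {
  return row.full_name ?? row.name;
}
```

Because the backfill skips rows that already have `full_name`, it can be interrupted and restarted without overwriting values produced by the dual write.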

Health checks

Our old /health returned 200 if the process was up. It lied. The new one checks the database and the queue before reporting ready, so the load balancer never routes to a pod that can't serve.
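
A readiness check along these lines might look like the sketch below. It is not our production handler: the `Check` type, the `readiness` function, and the one-second default timeout are assumptions for illustration, and each real check would wrap something like a trivial database query or a queue ping.

```typescript
// A dependency check resolves if the dependency is usable and rejects otherwise.
type Check = () => Promise<void>;

interface ReadinessResult {
  status: 200 | 503;
  failed: string[];
}

// Reject if a check takes longer than `ms`, so a hung dependency counts as down.
function withTimeout(check: Check, ms: number): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), ms);
    check().then(
      () => { clearTimeout(timer); resolve(); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}

// Run every check in parallel; report 200 only when all of them pass.
async function readiness(
  checks: Record<string, Check>,
  timeoutMs = 1000
): Promise<ReadinessResult> {
  const names = Object.keys(checks);
  const passed = await Promise.all(
    names.map((name) =>
      withTimeout(checks[name], timeoutMs).then(() => true, () => false)
    )
  );
  const failed = names.filter((_, i) => !passed[i]);
  return { status: failed.length === 0 ? 200 : 503, failed };
}
```

The /health handler would call something like `readiness({ database, queue })` and respond with the returned `status`. Listing the failed dependencies in the body makes a 503 easier to diagnose, and the timeout keeps a stalled connection from holding the probe open.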

A health check that always returns 200 is just an uptime decoration.
