Cutting Our P99 Latency in Half with Connection Pooling
We were opening a fresh Postgres connection per request. Here is how a pooler took our P99 from 340ms to 160ms.
Author
Staff Software Engineer focused on databases and backend performance. Ex-platform team. Coffee-driven.
7 posts by Maya Chen.
We were opening a fresh Postgres connection per request. Here is how a pooler took our P99 from 340ms to 160ms.
Service-to-service JSON over HTTP was costing us in latency and schema drift. gRPC fixed two problems and created one.
Shared tables, schema-per-tenant, or database-per-tenant? We picked one on purpose. Here is the reasoning.
A migration without a tested down path is a one-way door. We treat reversibility as a requirement, not a nicety.
At-least-once delivery means your jobs will run twice. Design for it instead of hoping it won't happen.
Sharding bought us headroom and a pile of new problems. Pick your shard key like your job depends on it — because it does.
Reviews should catch bugs and spread knowledge without becoming a bottleneck. Here is the bar we hold.