Sudhir Kumar Tiwari

Back to Portfolio

Designing Distributed Systems at Scale

June 2024  |  8 min read  |  Distributed Systems Architecture Cloud

Building distributed systems that can handle millions of requests per second while maintaining high availability is one of the most challenging problems in software engineering. Over the past 8 years, working at companies like Oracle Cloud Infrastructure and Walmart Global Tech, I've had the opportunity to design and operate such systems firsthand.

The Core Challenge

At its heart, distributed systems must deal with three fundamental constraints — known as the CAP theorem — which states that any distributed data store can only provide two of the following three guarantees simultaneously:

Lessons from OCI Load Balancer

While working on the Oracle Cloud Infrastructure Load Balancer team, we maintained a system that distributed web requests across fleets of servers across multiple fault domains and availability domains. Here are key design decisions that made it resilient:

Data Consistency Patterns

At Walmart's Order Management System, we handled Canada market orders where eventual consistency was acceptable for some read operations, but order state transitions required strong consistency. We used a combination of:

Observability is Non-Negotiable

You cannot operate a distributed system without comprehensive observability. At OCI, I owned the creation of Grafana dashboards, alarms, and runbooks. The three pillars of observability — metrics, logs, and traces — are essential. I'd add a fourth: runbooks. When an alarm fires at 2am, you want step-by-step guidance available immediately.

Conclusion

Designing distributed systems is as much an art as it is a science. There's no one-size-fits-all solution. Understanding the specific requirements of your system — the SLAs, the consistency requirements, the scale — is the foundation of any good distributed system design. Start simple, measure everything, and evolve the architecture as you learn from production.


Have thoughts or questions? Reach out at tiwarisudhir059@gmail.com or connect on LinkedIn.