The best option here is A: Shard the search database horizontally.
Why Each Looks Tempting at First
- B (Vertical scaling): Bigger box = instant performance win, no code changes.
- C (Caching): Redis = sub-millisecond responses for popular queries.
- D (Microservice): Separation = dedicated scaling for search, cleaner boundaries.
- A (Sharding): Splits data across nodes, spreads the load.
The Failure Modes
- B: You’ll quickly hit the ceiling of the biggest available server, and costs rise sharply well before you get there.
- C: Caching helps for repeated queries, but search workloads are dominated by a long tail of unique queries, so hit rates stay low and misses dominate.
- D: Moving to a microservice solves team ownership, not scaling. Under the hood, it still needs distributed data.
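The long-tail point about C can be made concrete with a small simulation. This is an illustrative sketch (all names hypothetical, stdlib only): an LRU cache serving queries drawn from a Zipf-like distribution, where the skew parameter `alpha` controls how repetitive the workload is. Flat, long-tail traffic (low `alpha`) yields far lower hit rates than skewed, repeat-heavy traffic:

```python
import random
from collections import OrderedDict

def simulate_hit_rate(num_queries=20_000, vocab=50_000,
                      cache_size=500, alpha=1.0, seed=42):
    """Return the hit rate of an LRU cache under a Zipf-like workload.

    Higher alpha = more repeats of popular queries (cache-friendly);
    lower alpha = flatter long tail (cache-hostile).
    """
    rng = random.Random(seed)
    # Zipf-like sampling: query at rank r drawn with weight 1/r^alpha.
    weights = [1 / (r ** alpha) for r in range(1, vocab + 1)]
    queries = rng.choices(range(vocab), weights=weights, k=num_queries)

    cache = OrderedDict()  # insertion-ordered dict doubles as an LRU
    hits = 0
    for q in queries:
        if q in cache:
            hits += 1
            cache.move_to_end(q)       # mark as most recently used
        else:
            cache[q] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / num_queries
```

Running `simulate_hit_rate(alpha=1.2)` versus `simulate_hit_rate(alpha=0.3)` shows the skewed workload hitting the cache far more often than the long-tail one, which is the failure mode option C runs into.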
Why Horizontal Scaling Wins
- Distributes queries across multiple nodes → no single bottleneck.
- Scales near-linearly with traffic growth as nodes are added.
- Handles unpredictable spikes better than vertical scaling.
- Proven design: Elasticsearch, Solr, and most large search systems rely on sharding.
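The mechanics above can be sketched in miniature. This is a toy model, not any library's API (the `ShardedSearch` class and its methods are hypothetical): each document is hashed to a shard at index time, and a query scatters to every shard and gathers the merged hits, which is the scatter-gather pattern systems like Elasticsearch use:

```python
import hashlib

class ShardedSearch:
    """Toy hash-sharded search index (illustrative names, in-memory)."""

    def __init__(self, num_shards=4):
        # Each shard is just a doc_id -> text map here.
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, doc_id):
        # Stable hash so the same doc always routes to the same shard.
        digest = hashlib.md5(doc_id.encode()).hexdigest()
        return int(digest, 16) % len(self.shards)

    def index(self, doc_id, text):
        self.shards[self._shard_for(doc_id)][doc_id] = text

    def search(self, term):
        # Scatter: each shard matches locally. Gather: merge the hits.
        hits = []
        for shard in self.shards:
            hits.extend(d for d, text in shard.items() if term in text)
        return sorted(hits)
```

Because indexing and matching both happen per shard, adding a node adds capacity for both writes and queries, which is where the near-linear scaling comes from.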
Trade-Offs to Keep in Mind
- Rebalancing shards is complex and can be expensive.
- Poor partitioning = hotspots that kill performance.
- More nodes = more monitoring, automation, and alerting overhead.
- Query routing logic adds latency if not optimized.
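The rebalancing trade-off has a standard mitigation: consistent hashing. With a naive `hash(key) % N` scheme, adding one node remaps almost every key; on a hash ring, adding a node moves only roughly 1/N of them. A minimal sketch, assuming a hypothetical `HashRing` class (not a production router):

```python
import bisect
import hashlib

def _h(key: str) -> int:
    """Stable integer hash for placing keys and vnodes on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    """Toy consistent-hash ring with virtual nodes for smoother balance."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._keys = []    # sorted vnode hash positions on the ring
        self._owner = {}   # vnode hash -> physical node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            pos = _h(f"{node}#{i}")
            bisect.insort(self._keys, pos)
            self._owner[pos] = node

    def node_for(self, key):
        # A key belongs to the first vnode clockwise from its hash.
        idx = bisect.bisect(self._keys, _h(key)) % len(self._keys)
        return self._owner[self._keys[idx]]
```

Mapping, say, 1000 keys across three nodes and then adding a fourth moves only about a quarter of the keys, keeping the rebalancing bill proportional to the capacity actually added.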
Sharding isn’t the easy choice. But it’s the only choice that scales search reliably past the first traffic spike.