Also refer to https://timilearning.com/posts/mit-6.824/lecture-16-memcache-at-facebook/ and https://blog.bytebytego.com/p/how-facebook-served-billions-of-requests


Pre-memcache


Adding memcache


What is memcache

memcached is a high-performance, distributed memory object caching system, generic in nature, but originally intended for use in speeding up dynamic web applications by alleviating database load.


Ref: https://memcached.org/about

Ref: https://engineering.fb.com/2013/04/15/core-infra/scaling-memcache-at-facebook/


How Facebook Served Billions of Requests per Second Using Memcached


Memcached is a well-known, simple, in-memory caching solution. It was originally developed by Brad Fitzpatrick for LiveJournal in 2003. First written in Perl, it was later rewritten in C by Anatoly Vorobey.


Ref: https://www.linuxjournal.com/article/7451


Facebook took up the open-source version of Memcached and enhanced it to build a distributed key-value store. This enhanced version was known as Memcache.


The following properties greatly influenced their design.

Major design goals:


How requests are served


They chose to delete cached data instead of updating it because deletes are idempotent: a delete can safely be retried or reordered without leaving a stale value in the cache.
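The look-aside read and write paths described above can be sketched as follows (a toy model with in-process dicts standing in for memcache and the database; the `read`/`write` helper names are illustrative):

```python
# Toy sketch of memcache as a look-aside cache.
# `cache` and `db` are illustrative stand-ins for memcached and MySQL.

cache, db = {}, {"user:1": "alice"}

def read(key):
    # Read path: try the cache first; on a miss, fetch from the
    # database and populate the cache for the next reader.
    if key in cache:
        return cache[key]
    value = db[key]
    cache[key] = value
    return value

def write(key, value):
    # Write path: update the database, then DELETE the cached copy
    # rather than updating it -- deletes are idempotent, so they
    # can be retried safely.
    db[key] = value
    cache.pop(key, None)

read("user:1")            # miss -> fills the cache
write("user:1", "bob")    # invalidates the cached copy
assert read("user:1") == "bob"
```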

Consistent Hashing

Items are distributed across the memcached servers through consistent hashing.

Consistent Hashing is a technique that allows the distribution of a set of keys across multiple nodes in a way that minimizes the impact of node failures or additions.

For example, each key is assigned to the node whose position on the hash ring is closest to the key's hash in the clockwise direction.


Clients maintain a map of all available servers, which is updated through an auxiliary configuration system.
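The clockwise assignment can be sketched with a hash ring (an illustrative sketch, not Facebook's actual client; the `ConsistentHashRing` name and vnode count are assumptions). Each server is hashed onto the ring at several virtual points, and a key belongs to the first server point at or after the key's own hash, wrapping around:

```python
import hashlib
from bisect import bisect

class ConsistentHashRing:
    """Minimal consistent-hash ring sketch (illustrative, not
    Facebook's implementation)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server smooth the distribution
        self.ring = []         # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.vnodes):
            self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    def remove(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def get_node(self, key):
        # The first point clockwise from the key's hash owns the key.
        h = self._hash(key)
        idx = bisect(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["mc1", "mc2", "mc3"])
owner = ring.get_node("user:42")
# Removing one server only remaps the keys that lived on it.
ring.remove(owner)
assert ring.get_node("user:42") != owner
```

The point of the scheme: when a server is added or removed, only the keys adjacent to its ring positions move, instead of nearly every key remapping as with naive `hash(key) % num_servers`.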

Client-server communication:


Reducing latency

At Facebook’s scale, a single web request can trigger hundreds of fetch requests to retrieve data from Memcached servers. Consider a scenario where a user loads a popular page containing numerous posts and comments.


Parallel requests and batching: the web servers construct a DAG of the dependencies between data items and use it to maximize the number of independent items that can be fetched concurrently in a single batched request.

Using UDP

Facebook employed a clever strategy to optimize network communication between the web servers and the memcache servers: clients issue get requests over UDP directly to the memcached servers, avoiding per-connection state and reducing latency, while set and delete operations go over TCP for reliability. A dropped or out-of-order UDP response is simply treated as a cache miss.

Problems with Caching


Leases: a lease addresses two problems, stale sets (a web server writing an outdated value into the cache) and thundering herds (many web servers missing at once and all querying the database).

How a lease token solves this:


All servers see a cache miss and every one of them reaches out to the database, increasing the load on the database (a thundering herd).


The cache arbitrates access to the database:

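The lease mechanism can be sketched as a toy model (not the real memcached protocol; the `LeasingCache` class and `LEASE_RATE` constant are illustrative). On a miss, the cache hands the first client a token; a set is accepted only with a valid token; a delete invalidates outstanding tokens so stale sets are rejected; and clients that miss shortly after a token was issued are told to wait, which throttles the thundering herd:

```python
import time
import uuid

class LeasingCache:
    """Toy sketch of memcache leases."""
    LEASE_RATE = 0.1  # hand out at most one lease per key per 100 ms

    def __init__(self):
        self.data = {}    # key -> value
        self.leases = {}  # key -> (token, issued_at)

    def get(self, key):
        if key in self.data:
            return ("hit", self.data[key])
        token, issued = self.leases.get(key, (None, 0.0))
        now = time.monotonic()
        if token is None or now - issued >= self.LEASE_RATE:
            token = uuid.uuid4().hex
            self.leases[key] = (token, now)
            return ("miss", token)  # caller should fetch from DB, then set
        return ("wait", None)       # another client is already filling it

    def set(self, key, value, token):
        current = self.leases.get(key)
        if current is None or current[0] != token:
            return False            # stale set: the lease was invalidated
        self.data[key] = value
        del self.leases[key]
        return True

    def delete(self, key):
        # A write to the DB deletes the cached value AND invalidates
        # outstanding leases, so an in-flight set of the old value
        # cannot be applied.
        self.data.pop(key, None)
        self.leases.pop(key, None)
```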

Many memcache servers in one cluster

As more web servers are added, more memcache servers are needed to handle the load.


This leads to every web server communicating with every memcache server: all-to-all communication.


One problem: when a web server wants many values, it issues a wide parallel fetch.

When the memcache servers all return their responses at nearly the same time, packets get dropped on the client side because of network congestion (incast congestion).


Memcache clients implement flow-control mechanisms, using a sliding window similar to TCP's congestion window, to limit incast congestion.
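The client-side flow control can be sketched as a TCP-like sliding window that bounds outstanding requests, growing additively on success and shrinking multiplicatively when responses are lost (a toy sketch; the `SlidingWindow` name and its parameters are illustrative, not Facebook's actual values):

```python
class SlidingWindow:
    """Toy sketch of client-side flow control for memcache fetches."""

    def __init__(self, initial=4, max_size=64):
        self.size = initial        # current window: max requests in flight
        self.max_size = max_size
        self.in_flight = 0

    def can_send(self):
        return self.in_flight < self.size

    def on_send(self):
        assert self.can_send()
        self.in_flight += 1

    def on_success(self):
        self.in_flight -= 1
        # Grow additively on success, like TCP congestion avoidance.
        self.size = min(self.size + 1, self.max_size)

    def on_timeout(self):
        self.in_flight -= 1
        # Shrink multiplicatively when responses are dropped.
        self.size = max(self.size // 2, 1)
```

A window that is too small serializes fetches and hurts latency; one that is too large recreates the incast problem, so the window size adapts between the two extremes.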

Multiple clusters

The all-to-all communication limits horizontal scalability.


With multiple clusters, the caches in each cluster now need to be kept consistent.


Inter-cluster bandwidth is lower than intra-cluster bandwidth.

Geographically distributed clusters


Facebook accepts writes crossing regions to the master database because Facebook is a read-heavy system: reads outnumber writes by roughly two orders of magnitude.


Memory allocation


One of the improvements Facebook made to memcached was moving to a smaller growth factor so there is less waste when storing values in fixed-size chunks. Instead of 2^n slab sizes, later versions of memcached use a much smaller growth factor, 1.25^n, so you will see slabs with sizes 1 KB, 1.25 KB, 1.56 KB, etc. This means that instead of 25% waste on average, you should see closer to 10%. Effectively you regain 15% of your memcached memory just by installing the latest version!
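The arithmetic above can be checked with a short sketch (the helper names, the 1 KB base size, and the expected-waste formula are mine, for illustration; memcached's actual slab classes start smaller and are tunable via its `-f` growth-factor option):

```python
def slab_sizes(base=1024, factor=1.25, max_size=16 * 1024):
    """Slab class chunk sizes growing geometrically by `factor`."""
    sizes, size = [], base
    while size <= max_size:
        sizes.append(size)
        size = int(size * factor)
    return sizes

def average_waste(factor):
    # An item stored in a chunk of size s has a size somewhere in
    # (s/factor, s]; assuming sizes are uniform over that range, the
    # expected wasted fraction is (factor - 1) / (2 * factor).
    return (factor - 1) / (2 * factor)

print(slab_sizes())        # [1024, 1280, 1600, 2000, ...]
print(average_waste(2.0))  # ~25% waste with 2^n slabs
print(average_waste(1.25)) # ~10% waste with 1.25^n slabs
```

The two `average_waste` values reproduce the 25% vs. 10% figures quoted above, and their difference is the ~15% of memory recovered by the smaller growth factor.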