Caching: The magician's layer in software development


    Handling data is one of the most critical concerns when designing a software system. Making sure that data flows from the user to the persistence layer, and back to the user when needed, seamlessly and as fast as possible is critical for almost every system. Caching is a vital concept in this regard. In computing, a cache is a high-speed data storage layer that stores a subset of data, typically transient in nature, so that future requests for that data are served faster than is possible by accessing the primary storage location. Usually, this is possible because a cache keeps its data in fast-access hardware such as RAM, often on the serving host itself.

    For our article, we will not need to know exactly how a cache works. At a high level, it is enough to know that it stores data in-house, in high-speed memory, and that it has capacity limitations and cannot store everything the way a database can. For ease of understanding, out of the various types of caches, we will continue with Redis, which is a key-value cache that stores data as key-value pairs in memory. Now, to understand the magic of a cache layer, let us try to design a simple URL shortener system. The system would do the following:

  1. Take a long URL and return a shortened URL.
  2. Whenever someone passes the shortened URL, it redirects to the longer URL.
As this system is just for demonstration purposes, we will assume a few things which make life very easy for us:
  1. No collision of URLs will happen and we already have a mechanism to convert a long URL into a tiny URL.
  2. Our system would be read-heavy, that is, we will have more requests to redirect to a longer URL than to shorten the URL.
Based on these assumptions, we can come up with a very basic design for such a system (with a lot of assumptions and abstractions) as follows:
Figure 1: Very high-level view of the URL shortener and retrieval mechanism
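
    To make the flow concrete, here is a minimal sketch of the design above in Python. The in-memory dict standing in for the database and the `to_short_code` helper are hypothetical placeholders (assumption 1 says the shortening mechanism already exists), a sketch rather than a definitive implementation:

```python
import hashlib

# Hypothetical stand-in for the persistence layer; a real system
# would use a database table mapping short code -> long URL.
url_table = {}

def to_short_code(long_url: str) -> str:
    # Placeholder for the "already existing" shortening mechanism
    # from assumption 1; a truncated hash is enough for a demo.
    return hashlib.sha256(long_url.encode()).hexdigest()[:7]

def shorten(long_url: str) -> str:
    short_code = to_short_code(long_url)
    url_table[short_code] = long_url  # persist the mapping
    return "https://sho.rt/" + short_code

def resolve(short_code: str) -> str:
    # Every lookup hits the "database", no matter how hot the URL is.
    return url_table[short_code]
```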
    
    The above system looks good at a high level, until we consider the case where a particular shortened URL points to a longer URL that is used very frequently. In that case, we would end up fetching that longer URL from the database a lot of times. We can most certainly enhance the experience: if we know that a certain URL is used a lot, we can store it somewhere from where it can be accessed in the fastest possible manner. This is where the magic of the cache comes into the picture. Let's modify the retrieval part of the above design a bit by introducing a cache layer in between:
Figure 2: Very high-level view of the URL shortener and retrieval mechanism with a cache

    With this new flow, each time a new URL is shortened, you can also keep it in a cache. For our assumption, let's keep it in Redis with key = shortened URL and value = original URL. Now, when someone tries to access the original link via the shortened link, the system can first look for the key based on the shortened URL and, if found, simply return the value for that key and redirect to it. If not found, it can fall back to the database, which is many times slower.
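
    This read path is often called the cache-aside pattern. Below is a minimal sketch of it using the redis-py client, assuming a Redis server on localhost; `fetch_from_database` is a hypothetical helper standing in for the slow persistence-layer lookup:

```python
import redis

# decode_responses=True makes GET return str instead of bytes.
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Hypothetical database stand-in, as in the earlier sketch.
url_table = {"abc1234": "https://example.com/some/very/long/path"}

def fetch_from_database(short_code: str) -> str:
    return url_table[short_code]  # placeholder for a real DB query

def resolve(short_code: str) -> str:
    long_url = cache.get(short_code)            # fast path: in-memory lookup
    if long_url is not None:
        return long_url                         # cache hit
    long_url = fetch_from_database(short_code)  # cache miss: slow path
    cache.set(short_code, long_url)             # populate for the next read
    return long_url
```

    On a hit, the database is never touched; on a miss, the value is written back so subsequent reads of the same short URL are fast.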
    Now, this is a very high-level view. There are a lot of things that we need to consider, like:
  1. How long will we keep a particular key-value pair in the cache? Cache memory is limited and very expensive. Mostly, we set a TTL (time to live) which denotes how long to keep an entry in the cache before deleting it (we assume that a URL is used extensively only for a certain time post creation; post that, we can rely on the database). A small sketch of this appears after the list.
  2. How will we rotate the key-value pairs? Say a particular URL starts being used a lot again after a long quiet period. How will we get it back into the cache without compromising on cost or on the other pairs in the cache?
  3. What will we store in the cache? Do we only need to keep the key as the short URL and the value as the long URL? Or do we need some more metadata so that we can emit metrics which will help us optimize our system better?
  4. How will this design scale in the case of large-scale distributed systems?
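
    For the TTL question above, here is a minimal sketch, again with redis-py and the `cache` client set up as before; the one-hour TTL is an arbitrary illustrative value:

```python
import redis

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
ONE_HOUR = 3600  # arbitrary TTL in seconds, chosen for illustration

def cache_mapping(short_code: str, long_url: str) -> None:
    # SETEX stores the pair and schedules its deletion after the TTL,
    # so stale entries free up cache memory automatically.
    cache.setex(short_code, ONE_HOUR, long_url)

# TTL reports the remaining lifetime of a key in seconds
# (-1 means no expiry, -2 means the key does not exist).
remaining = cache.ttl("abc1234")
```

    On the rotation question, one common approach is server-side configuration rather than application code: cap Redis memory with `maxmemory` and set an eviction policy such as `maxmemory-policy allkeys-lru`, so the least recently used keys are evicted first when memory fills up, and a key that becomes hot again is simply re-populated on the next cache miss.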
    There are a lot more questions that need to be answered, but if we can answer them all well in our detailed system design, this little cache layer can improve the performance of our system, and the turnaround time for our customers, manyfold. It is also magical in that this layer can be made a lot more modular with recent cloud-based caches like AWS ElastiCache, which give us more flexibility and provide off-the-shelf caching solutions. A good design, in most cases, uses this magical layer to best serve the customers. We must be careful, though, that it does not backfire on turnaround time, for example by piling up a lot of asynchronous calls where a simple synchronous path would do. Hence, before designing a system around this cache layer, we must understand the whole layer in detail. For now, let me put down my pen and let the readers explore and improvise!

Amrit Raj
