Zerodha (Real time stock trading app) system design: An attempt!

 


        Real-time trading apps have recently given many retail investors access to stock markets around the world. Such an app might look like any other application, but it has quietly led to a revolution in the financial sector. Today, let's try to build a similar system ourselves! A real stock trading application has many features, so let us first scope out what ours must do to count as a minimum lovable product. The functional requirements of the application are:
  1. The user should be able to log in to their account.
  2. The user should be able to see the data related to various listed companies.
  3. The user should be able to see metrics and dashboards at various granular levels.
  4. The user should be able to create a wishlist of stocks.
  5. The user should be able to buy and sell stocks and make payments towards them through a wallet.
  6. The user should be able to see their portfolio and statistics around it.
  7. The operations need to be real-time.
    Apart from these functional requirements, the system also has to meet several non-functional requirements:
  1. The system should be horizontally scalable and should be able to handle a huge amount of traffic.
  2. The system should be highly available.
  3. The system has some components which need to be strongly consistent while a few others may be eventually consistent. 
    With these requirements in mind, let's start working towards the unknowns and try to freeze some actors for the first draft of the system.
  1. Client: The user of the app, an individual who wishes to invest in the stock market or follow its trends. The client needs to sign up and hold an account with the application, with the relevant document verification.
  2. Client bank: The bank which the user uses to make and receive payments from the wallet. We can assume that the bank exposes APIs using which we can transfer the amount from the bank account to our wallets and deposit money from the wallet back to the bank.
  3. Stock repository: Any stocks bought or sold are kept in digital form in a central repository managed by the government. We may assume that the repository provides APIs to add or remove stock documents based on client consent in the form of an OTP.
  4. Stock exchange: The system that actually performs the trade and settles the transactions. It is usually the central trading institution on which companies get listed and where their prices and metadata are stored and updated, for example Nasdaq or BSE.
    The above four are going to be the actors who interact with our application in some way or the other. The rest of the components would be internal to our application. Based on the above requirements and actors, let me put forward the proposed high-level component diagram. We can then discuss the thought behind each component and why we choose them.


Figure 1: A proposed HLD for a real-time stock trading application

    Let us walk through Figure 1 from left to right. The first entity to interact with our application, via the UI, is the user. After signing up, the user logs in through the login and account management component. Login-related data can be stored in a NoSQL database, since we may want to collect additional information about the user, such as the browser used and similar metadata. A traditional SQL database might not be the best fit here because the set of captured fields can change over time. Such data may also need to be distributed according to the governing policies of each country, which NoSQL databases tend to suit better. Still, in some cases a traditional SQL database would do just fine.

    The application UI keeps most of the aggregated data and metrics in its own cache to render graphs and metrics; graphs built on cold data are refreshed through a pull mechanism. Where does the pulled data come from? It originates from the stock exchange system, which constantly pushes events to our trading and monitoring sub-systems via SSE (Server-Sent Events). The monitoring sub-system retains the data for a while and then moves it to a cold store according to its purge policy.
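
To make the push feed and the purge policy a little more concrete, here is a minimal sketch in Python. It parses the SSE wire format (`data:` lines terminated by a blank line) and keeps ticks in a hot store until they age past a retention window. The `Tick` fields, the JSON payload shape, the `MonitoringStore` class, and the 60-second window are all assumptions made for illustration; a real exchange feed and cold store would look different.

```python
import json
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Tick:
    symbol: str
    price: float
    ts: float  # epoch seconds, as reported by the (hypothetical) feed


def parse_sse(lines: Iterable[str]) -> Iterator[Tick]:
    """Yield one Tick per SSE event; a blank line terminates an event."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format strips a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_parts:
            payload = json.loads("\n".join(data_parts))
            data_parts = []
            yield Tick(payload["symbol"], float(payload["price"]), float(payload["ts"]))
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.


@dataclass
class MonitoringStore:
    retention_seconds: float
    hot: List[Tick] = field(default_factory=list)
    cold: List[Tick] = field(default_factory=list)  # stand-in for a cold store

    def ingest(self, tick: Tick) -> None:
        self.hot.append(tick)

    def purge(self, now: float) -> int:
        """Move ticks older than the retention window to the cold store."""
        cutoff = now - self.retention_seconds
        expired = [t for t in self.hot if t.ts < cutoff]
        self.hot = [t for t in self.hot if t.ts >= cutoff]
        self.cold.extend(expired)
        return len(expired)


# A hand-written feed standing in for the exchange's event stream.
feed = [
    'data: {"symbol": "INFY", "price": 1500.0, "ts": 100}\n',
    "\n",
    ": keep-alive comment\n",
    'data: {"symbol": "TCS", "price": 3500.5, "ts": 190}\n',
    "\n",
]
store = MonitoringStore(retention_seconds=60)
for tick in parse_sse(feed):
    store.ingest(tick)
moved = store.purge(now=200)  # the tick at ts=100 is older than the window
```

In a deployed system the purge would likely run on a schedule and write to object storage or a data warehouse rather than to an in-memory list.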

    We have separate services for portfolio management, payments (made via the wallet and client banks), trading (which stores and retrieves data from the repository actor), and wishlisting, along with basic infrastructure such as load balancers and queues. The lower-level implementation could be at least partially event-driven.
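
As a rough illustration of the event-driven idea, the sketch below wires a payment service and a portfolio service together through an in-process event bus. Everything here is hypothetical: the topic names (`order.placed`, `order.paid`, `order.rejected`), the event fields, and the `EventBus` class stand in for a real message queue, and only the buy path is shown, without the actual trade on the exchange.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], None]


class EventBus:
    """In-process stand-in for a message queue."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in list(self._handlers[topic]):
            handler(event)


class PaymentService:
    """Debits the wallet; the balance check must see the latest balance."""

    def __init__(self, bus: EventBus, balances: Dict[str, float]) -> None:
        self.bus = bus
        self.balances = balances
        bus.subscribe("order.placed", self.on_order_placed)

    def on_order_placed(self, event: dict) -> None:
        cost = event["qty"] * event["price"]
        user = event["user"]
        if self.balances.get(user, 0.0) < cost:
            self.bus.publish("order.rejected", {**event, "reason": "insufficient funds"})
            return
        self.balances[user] -= cost
        self.bus.publish("order.paid", event)


class PortfolioService:
    """Updates holdings once an order has been paid for."""

    def __init__(self, bus: EventBus) -> None:
        self.holdings: Dict[Tuple[str, str], int] = defaultdict(int)
        bus.subscribe("order.paid", self.on_order_paid)

    def on_order_paid(self, event: dict) -> None:
        self.holdings[(event["user"], event["symbol"])] += event["qty"]


bus = EventBus()
payments = PaymentService(bus, {"alice": 10_000.0})
portfolio = PortfolioService(bus)
rejected: List[dict] = []
bus.subscribe("order.rejected", rejected.append)

# First order costs 7500 and succeeds; the second costs 3000 and is rejected.
bus.publish("order.placed", {"user": "alice", "symbol": "INFY", "qty": 5, "price": 1500.0})
bus.publish("order.placed", {"user": "alice", "symbol": "INFY", "qty": 2, "price": 1500.0})
```

The wallet debit is the kind of component that needs strong consistency, while the portfolio view could tolerate being updated a moment later.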

    For real-time client-facing metrics and dashboarding, we may use a combination of Apache Pinot as the database (it is built for low-latency analytical queries and can scale to handle heavy load) and D3 or Apache Superset for the dashboarding UI.
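
To show what "metrics at various granular levels" could mean in practice, here is a small sketch of the kind of aggregation a dashboard query would compute: grouping price ticks into fixed-width OHLC (open, high, low, close) candles. This is plain Python rather than Pinot's API, and the function name, tick shape, and bucket size are assumptions for illustration only.

```python
from typing import Dict, Iterable, List, Tuple


def ohlc_candles(
    ticks: Iterable[Tuple[float, float]], bucket_seconds: int
) -> List[Dict[str, float]]:
    """Group (timestamp, price) ticks into fixed-width OHLC candles."""
    buckets: Dict[int, Dict[str, float]] = {}
    # Sort by timestamp only, so ticks sharing a timestamp keep arrival order.
    for ts, price in sorted(ticks, key=lambda t: t[0]):
        start = int(ts // bucket_seconds) * bucket_seconds
        candle = buckets.get(start)
        if candle is None:
            buckets[start] = {
                "start": start,
                "open": price,
                "high": price,
                "low": price,
                "close": price,
            }
        else:
            candle["high"] = max(candle["high"], price)
            candle["low"] = min(candle["low"], price)
            candle["close"] = price
    return [buckets[k] for k in sorted(buckets)]


# Five ticks over two minutes, aggregated into one-minute candles.
ticks = [(0, 100.0), (20, 104.0), (45, 99.0), (61, 101.0), (119, 103.0)]
candles = ohlc_candles(ticks, bucket_seconds=60)
```

Changing `bucket_seconds` gives the coarser or finer views a user would switch between on a chart.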

This is my high-level attempt at designing a real-time stock trading application. The lower-level implementation is another layer of complexity that I leave to the reader, or perhaps to another day. This is certainly not the best possible model, and any improvements would be appreciated in the comments below, but it is a good one to begin with and get things started. Happy designing!

Thanks!
-Amrit Raj
