HLD attempt to design our own music player app!



     This morning I woke up and opened my design editor! The technical artist within me was excited and inspired to design something nice, and I chose a simple music player app. I thought of sharing it with a larger audience for their views, feedback and ideas. Let's begin with the thought process behind the whole app: what could the requirements be for what I call the MLP (minimum lovable product!) version of the app? Let's list the functional requirements:

  1. Users should be able to see the list of songs available. 
  2. Users should be able to search for their songs.
  3. Users should be able to play a song and, while it is playing, see its details like the artist, time duration etc.
  4. Users should be able to perform basic operations while the song is playing, like play, pause, seek to a particular part of the song, loop etc.
  5. Users should be recommended the next songs they might want to listen to based on the way they have interacted with our app in the past.
Future requirements (which could perhaps be covered in milestone 2 or so) may include a premium model, rate limiters and maybe offline song downloads. Apart from these functional requirements, we must agree on some non-functional requirements as well:
  1. The app should have low latency while playing the song.
  2. The app should be highly available. Consistency is not as important; the latest song update may take a few minutes to propagate to all users.
    Based on these very high-level requirements, let's try and intuitively come up with some components we might need! I can see that we would surely need these:
  1. A database to store the song metadata (artist info, length of the song etc.) so we can show it while the song plays. The database should support good search capabilities and horizontal scaling for high availability. It might not need complex relational queries, so we may skip an RDBMS and go for a NoSQL solution. My bet would be on Amazon DynamoDB, which scales very well and needs minimal maintenance from our side. Besides, it is easy to develop with quickly and integrates easily with other AWS services. Let's pick that up!
  2. A data store to store the songs in file/object formats. I am immediately reminded of Amazon S3, which is secure, scalable, and offers great storage capabilities and very handy APIs. Let's go with that for this case.
  3. A data lake to store data for training our recommendation ML models. Redshift may be a great store for this, so let's go with that. We will use SageMaker for annotating and generating datasets with human workers.
  4. Caches: We would need two different types of caches: one for caching the metadata, which will be a simple key-value cache, and the other to store file-related data for the songs, for which we would need a file-based cache solution. Let's consider Redis for the former and Amazon File Cache for the latter (a rough sketch of the cached metadata read path follows this list). Apart from this, we may also use some caching in the browser/mobile app as well.
  5. For the services that will interact with these components and serve as the backend of the various systems, we may use a Spring Boot-based application in Java. We may need a Python-based backend for training the ML models and returning recommendation results.
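
    To make the metadata read path concrete, here is a minimal sketch of a cache-aside lookup: check Redis first, fall back to DynamoDB on a miss, and then populate the cache. The table name, key names and the SongMetadataService class are hypothetical placeholders, and the snippet assumes Spring Data Redis plus the AWS SDK v2 DynamoDB client; the real schema and wiring would come later.

```java
import java.time.Duration;
import java.util.Map;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.GetItemRequest;

// Hypothetical sketch of the cache-aside read path for song metadata.
@Service
public class SongMetadataService {

    private static final String TABLE = "song_metadata";     // assumed table name
    private static final Duration TTL = Duration.ofHours(6); // assumed cache TTL

    private final StringRedisTemplate redis;
    private final DynamoDbClient dynamoDb;

    public SongMetadataService(StringRedisTemplate redis, DynamoDbClient dynamoDb) {
        this.redis = redis;
        this.dynamoDb = dynamoDb;
    }

    public String getMetadataJson(String songId) {
        String cacheKey = "song:meta:" + songId;

        // 1. Try the Redis cache first (low latency; most reads should end here).
        String cached = redis.opsForValue().get(cacheKey);
        if (cached != null) {
            return cached;
        }

        // 2. Cache miss: read the item from DynamoDB by its partition key.
        GetItemRequest request = GetItemRequest.builder()
                .tableName(TABLE)
                .key(Map.of("songId", AttributeValue.builder().s(songId).build()))
                .build();
        Map<String, AttributeValue> item = dynamoDb.getItem(request).item();
        if (item == null || item.isEmpty()) {
            return null; // the caller maps this to a 404
        }

        // 3. Serialize and populate the cache so the next read is fast.
        String json = toJson(item);
        redis.opsForValue().set(cacheKey, json, TTL);
        return json;
    }

    private String toJson(Map<String, AttributeValue> item) {
        // Placeholder: a real implementation would map attributes to a DTO / JSON.
        return item.toString();
    }
}
```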
    Let's dive deeper now and see what APIs we would need to create our MLP! Looking at the requirements, we would need at least the following:
  1. getSongMetadata: The API searches for the metadata based on songId or artist name. It will call a minor API to map the song name to a songId. We are assuming no autocorrect exists as of now and the user must type the exact name of the song; it is a pain and may be corrected in a later milestone using an Elasticsearch-based suggestion system. (A minimal controller sketch follows this API list.)
    1. Endpoint: GET /v1/getSongMetadata
    2. Payload: {
           "songId": String (Optional),
           "artist": String (Optional)
       }
    3. Return: {Song metadata JSON} or 404/500 HTTP error code
  2. getPlayableSong: A paginated API that fetches the song data as a byte buffer or data string used to play the song. The API can be called every 'n' seconds to fetch data for the next 'm' seconds, where m > n. (A sketch of serving such a chunk from S3 follows this API list.)
    1. Endpoint: GET /v1/getSong
    2. Payload: {
          "songId": String,
          "limit": Number,
          "offset": Number
      }
    3. Return {"buffer": Buffer of song data} or 404/500 HTTP error code
  3. getSongRecommendation: The API returns a list of recommended song names and song IDs based on user data and the last few song IDs.
    1. Endpoint: GET /v1/getSongRecommendation
    2. Payload: {
           "username": String,
           "songId": List of String of songIds
      }
    3. Return: {List of song names and IDs} or 404/500 HTTP error code
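
    Here is a minimal, hypothetical sketch of how the getSongMetadata endpoint might look in the Spring Boot backend. For simplicity the optional songId/artist inputs are shown as query parameters rather than a GET body, and SongMetadataService is the assumed service from the earlier sketch; none of this is a final contract.

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical controller sketch for the metadata API.
@RestController
public class SongMetadataController {

    private final SongMetadataService metadataService; // assumed service from the earlier sketch

    public SongMetadataController(SongMetadataService metadataService) {
        this.metadataService = metadataService;
    }

    // GET /v1/getSongMetadata?songId=...  or  ?artist=...
    @GetMapping("/v1/getSongMetadata")
    public ResponseEntity<String> getSongMetadata(
            @RequestParam(required = false) String songId,
            @RequestParam(required = false) String artist) {

        if (songId == null && artist == null) {
            return ResponseEntity.badRequest().build(); // need at least one filter
        }

        // Assumed: artist/song-name lookups resolve to a songId via the minor mapping API.
        String resolvedId = (songId != null) ? songId : resolveSongIdByArtist(artist);
        if (resolvedId == null) {
            return ResponseEntity.notFound().build();
        }

        String metadataJson = metadataService.getMetadataJson(resolvedId);
        return (metadataJson == null)
                ? ResponseEntity.notFound().build() // 404 when nothing matches
                : ResponseEntity.ok(metadataJson);
    }

    private String resolveSongIdByArtist(String artist) {
        // Placeholder for the song-name/artist -> songId mapping call.
        return null;
    }
}
```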
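
    And a rough sketch of how getPlayableSong could serve one chunk: translate the (offset, limit) pagination into an S3 byte-range GET on the stored song object. The bucket name, the object-key convention and the chunk sizing are all assumptions for illustration; a production player would more likely stream via a CDN or an adaptive format like HLS.

```java
import software.amazon.awssdk.core.ResponseBytes;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;

// Hypothetical sketch: serve one paginated chunk of a song from S3.
public class SongChunkFetcher {

    private static final String BUCKET = "music-player-songs"; // assumed bucket name

    private final S3Client s3;

    public SongChunkFetcher(S3Client s3) {
        this.s3 = s3;
    }

    /**
     * Fetches bytes [offset, offset + limit) of the song object. The client can
     * buffer this chunk while the previously fetched chunk is still playing (m > n).
     */
    public byte[] fetchChunk(String songId, long offset, long limit) {
        GetObjectRequest request = GetObjectRequest.builder()
                .bucket(BUCKET)
                .key("songs/" + songId + ".mp3")                       // assumed key convention
                .range("bytes=" + offset + "-" + (offset + limit - 1)) // HTTP Range header
                .build();

        ResponseBytes<GetObjectResponse> bytes = s3.getObjectAsBytes(request);
        return bytes.asByteArray();
    }
}
```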
   Based on these APIs, I believe we should be able to support all the operations needed for our MLP. I am not diving deep into the table structures and am leaving them to the creativity of the reader, based on the APIs, the requirements and the HLD component diagram. Let's now try to come up with an HLD component-level diagram for the whole flow.


     The above is one possible way to design a component-level HLD for such a system using the components we have decided on. The system would easily be able to support our APIs at scale, and it would be highly available and eventually consistent. This is just one of the approaches I could come up with; we could have better versions of things, better choices of components and maybe more components (like alarms, dashboards etc.). I am still a learner, and readers' feedback and comments are welcome. Till the next post, happy designing. Bye!!

Thanks,
-Amrit Raj
