from Guide to the Job Hunt on Jul 30, 2023
How to succeed at system design interviews
'System design interview' sounds daunting, no doubt, and the actual experience is no less intimidating: just a few words and a vague prompt, like "Design Spotify"… and you're expected to pick up the ball immediately. It turns out these questions are a little less open-ended than you might think, and there's a basic formula that the best candidates use and that you can use too.
Tips for the coding interview are always the same: practice LeetCode. However, tips for the system design interview are neither as common nor as helpful: Study databases? Memorize a few definitions for terms such as "load balancer"?
In fact, what even is a system design interview? Word-of-mouth information is sparse, but as it turns out, there's just as much structure to a system design interview as there is to a coding interview.
What is system design?
Generally speaking, the goal is to test your ability to break down an ambitious goal into bite-sized chunks. Your response should describe how to architect and organize databases, servers, and client-side applications into a cohesive whole. The target application could be a hotel management tool, travel booking website, or movie streaming service — each application has its own unique set of challenges.
Don't just ramble. Practice thinking and drawing.
Regardless of the task, the application can become gnarly and complex. Unfortunately, it's too much to keep in your head all at once, and if you can't keep track, your interviewer definitely can't either.
As a result, above all else, your goal is to learn how to visualize on-the-fly, to stay organized for your own sanity and to keep your interviewer up-to-date on your design. To do this, tackle the following todos:
- Right now, pick a tool to diagram with. There are certainly purpose-built tools for the job, translating some form of plaintext into a beautiful diagram [1]. These tools are nifty for diagramming on the job, but you shouldn't use them for interviews, as you'll waste time fiddling with syntax. Use a basic drawing utility. Even Google Slides works. Get familiar with the tool, and practice drawing a few flow charts. Pick your favorite icon that represents a database.
- Right now, schedule practice for diagramming and talking. Even if you don't do both at the same moment, practice the two together in a single sitting. Ideally, find a friend to play a rubber duck. To practice, you don't need someone seasoned at either giving or answering system design interviews.
To get a rough idea of how to use your diagram effectively — and by extension, how to produce an effective diagram — try the following exercise.
- Diagram-less baseline: Pick a flowchart or a technology tree of some kind. Once you've found a flowchart, try communicating the position and path to a node that is 3 levels deep.
  - For example, we'll use Minecraft's technology tree: In this game, you can mine stone, chop wood, etc. and use those raw materials to craft tools. You can then use those tools to craft or build more and more items. A diagram that shows you which items are required for building or unlocking other items is called the "technology tree". See this example diagram from Reddit.
  - In the Minecraft example, you could try to explain how to obtain a "torch". To do so, you need wood and coal. Coal can be obtained by smelting wood. However, to smelt, you need a furnace, which you can build after mining cobblestone. In short, this verbal description gets fairly complicated, fairly quickly. If you're already familiar with Minecraft, try picking a deeper node, such as a minecart.
- Re-explain using a diagram: This time, re-explain the position and path to a node that is 3 levels deep, but draw a diagram to support your explanation as you go. Now that you've attempted an explanation once, you should have a rough idea of what your diagram needs to include.
  - In the Minecraft example, we can explain from right to left, starting with the individual ingredients needed for a torch (wood and coal) and then how to obtain each ingredient. We would also update the diagram accordingly as we go, using a drawing like the one below.
  - For an added bonus, repeat this exercise but record yourself. This may seem cringeworthy… and I bet you it will be. But think of it this way: Would you rather see and fix the cringeworthy interview yourself first? Or would you rather your interviewer see the cringeworthy interview?
graph LR;
  Wood --> Torch
  Coal --> Torch
  Wood --> Coal
  Furnace --> Coal
  Cobblestone --> Furnace
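If it helps to see why the spoken version gets tangled, the torch diagram can also be written down as data. The following is only a sketch: the `REQUIRES` mapping and the `depth` and `explain` helpers are hypothetical names, and the recipes are the simplified ones from the example above rather than exact game rules.

```python
# Hypothetical encoding of the torch diagram: each item maps to what it is made from.
REQUIRES = {
    "Torch": ["Wood", "Coal"],
    "Coal": ["Wood", "Furnace"],
    "Furnace": ["Cobblestone"],
    "Wood": [],
    "Cobblestone": [],
}


def depth(item: str) -> int:
    """Number of crafting levels below `item` (raw materials have depth 0)."""
    parts = REQUIRES[item]
    return 0 if not parts else 1 + max(depth(part) for part in parts)


def explain(item: str, indent: int = 0) -> list[str]:
    """Walk the tree top-down, one indented line per node, like a spoken walkthrough."""
    lines = ["  " * indent + item]
    for part in REQUIRES[item]:
        lines.extend(explain(part, indent + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(explain("Torch")))
    print("levels deep:", depth("Torch"))
```

The indented printout is roughly what you are forced to hold in your head without a drawing, which is the point of the exercise.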
Once you've done the above, you should now have a rough idea of how diagramming can help make your explanations more understandable. Now, practice just this singular skill of drawing and talking at almost the same time.
Practicing both drawing and talking is important for two reasons, both of which are critical to clear communication in your system design interview:
- Ensure you can context switch between the two, from drawing to talking and vice versa. It's all too easy to lose your train of thought. You don't need to draw and talk at the same time, but you at a minimum need to switch between the two.
- Practice drawing effective diagrams that help clarify your thoughts — not just random doodles. Also, it's easy to evaluate the effectiveness of your diagrams on your own: A day later, revisit the diagram, and see if you can understand it. Use the exercise above — diagram-less explanation vs. explanation with diagram — and see if your second recorded talk was actually clearer.
With the above practice, you should be more familiar with diagramming. This way, you won't be caught off guard, and you'll be able to bring much needed clarity to an otherwise confusing interview.
Prepare for the unspoken agenda: Scope, capacity, high-level, low-level.
The unspoken agenda is a tried-and-true response template, accounting for at least half of the interview itself. Even for senior engineering candidates, this structure holds true for at least the start of the interview.
- Interviewer poses question. The interviewer provides a very simple prompt, such as "Design Google".
- Define the scope. You narrow down the scope for the application by defining core use cases to focus on.
- Estimate capacity. Estimate the type and amount of load that your application will need to support.
- High-level approach. You cover the general structure for your application, along with your database schema.
- Low-level details. Poke at different pieces of your design for improvements or potential issues.
Let's now break down each of these steps in detail.
Step 1. Interviewer poses question. Just like in How to succeed at AI design interviews, this is a simple short prompt.
- The interviewer may provide a few requirements out of the gate, such as the number of concurrent users you should expect to serve. If not, you should be prepared to estimate these statistics yourself, especially if you're being asked to rebuild an existing application.
- The design interview at this point is completely open-ended. You're expected to structure your response and the interview more broadly. As a result, knowing these steps is especially important.
Step 2. Define the scope. Make a list of features and use cases for the application. Then, narrow down the list to a minimal set of core features.
- List features and use cases. At this point, your goal is to cover all major features of the application; the key challenge in the question is likely one of the application's key features. With that said, don't waste time on being too thorough. You'll need to narrow down this list in a second anyway.
  - Say you're asked to design Spotify. You would list different features such as playing music, browsing recommended songs, and organizing playlists. There are certainly other bonus features, but they aren't necessary to discuss: shared playlists that multiple users can add to, fine-grained search options, and social features such as friends and following artists. We'll explain in the next part how to distinguish between features worth mentioning and those that aren't.
  - Say you're asked to design YouTube. You would list different features such as playing videos, browsing recommended videos, and organizing playlists… This sounds suspiciously similar to Spotify. In fact, the rest of the features above are all very similar. There are two key differences: (1) YouTube streams much more data, and (2) YouTube Shorts presents a different set of challenges.
- Narrow down the list. Focus on just the core features and use cases for the application. This focus should preserve a key and unique challenge that the application faces.
  - Practice identifying a challenge unique to the application. This unique challenge is possibly the focus of the question. Otherwise, every system design interview would be identical. Your first task is to know what the response common to all of these applications is; then, your task is to adapt to these application-specific challenges.
  - For Spotify, one key challenge could be the massive imbalance in reads vs. writes. In short, very few writes occur, but the service needs to support a large number of reads for a few disproportionately popular songs.
  - Say you're asked to design YouTube. When compared with Spotify, serving users requires streaming a lot more data. You're looking at a 10-30x increase in bandwidth, from a 320 kbps song to an 8 Mbps video. Writes are also (likely) much more frequent than in Spotify's case.
  - Say you're designing TikTok. You're serving much shorter content, meaning you need lower-latency recommendations, but since users can't seek in a video, streaming content is at least simpler. If you're designing Google, you now need infrastructure for offline crawling and ranking. If you're designing Messenger, you now need to consider issues with the real-time nature of the app: offline users, race conditions, and more.
At this point, you should have 3-4 core features to focus on.
Step 3. Estimate capacity. Understand the scale of the data and traffic that you're handling for the application.
- Usage estimate: Estimate how many users and how much data the application holds as of the present day.
  - For example, for YouTube, estimate the number of videos and users. We may ballpark this at 1 billion videos. Google boasts 4 billion users, and we can approximately say that half of those users use YouTube, making 2 billion users.
  - To get a better sense of these numbers, check out online resources for the number of users per platform, for a number of popular websites. For example, Twitter has 200 million unique visitors.
- Storage estimates: Calculate how much data is stored for the application.
  - For example, for any video service, one frame from a 720p video would be 1280 x 720 or about 1 million pixels. An RGB video with 3 color channels would bring the total to about 3 million values, where each value lies in the range [0, 256), representable by a byte. Each frame is thus about 3 MB, and a 24 fps video requires 72 MB per second. If we assume MPEG compresses by about 40x, each video requires 1.8 MB per second (about 14 Mbps).
  - If the average video is 10 minutes long, each video takes about 1 GB to store. Combine this with usage estimates, to get 1 GB per video multiplied by 1 billion videos, obtaining a storage estimate of 1,000 PB or 1 exabyte. We can assume that user data storage is negligible compared to this.
  - Now that we've calculated rough estimates here, you can simply reuse these statistics in interviews moving forward: namely, that a 720p video requires about 1.8 MB per second to stream or 1 GB to store, assuming a 10-minute average length. This level of computational detail is a bit excessive to repeat in every system design interview.
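The storage arithmetic above can be checked in a few lines. This is a rough sketch, not a definitive model: the function name `estimate_storage` is made up, every default is one of the assumptions stated in the text (720p, RGB at one byte per channel value, 24 fps, 40x compression, 10-minute videos, 1 billion videos), and the units are bytes.

```python
def estimate_storage(
    width: int = 1280,
    height: int = 720,
    channels: int = 3,          # RGB, one byte per channel value
    fps: int = 24,
    compression: float = 40.0,  # assumed compression ratio
    avg_seconds: int = 10 * 60,
    num_videos: int = 1_000_000_000,
) -> dict:
    """Back-of-the-envelope storage estimate; all results are in bytes."""
    frame_bytes = width * height * channels
    stream_bytes_per_sec = frame_bytes * fps / compression
    video_bytes = stream_bytes_per_sec * avg_seconds
    return {
        "frame_bytes": frame_bytes,
        "stream_bytes_per_sec": stream_bytes_per_sec,
        "video_bytes": video_bytes,
        "total_bytes": video_bytes * num_videos,
    }


if __name__ == "__main__":
    est = estimate_storage()
    print(f"frame:  {est['frame_bytes'] / 1e6:.1f} MB")
    print(f"stream: {est['stream_bytes_per_sec'] / 1e6:.2f} MB/s")
    print(f"video:  {est['video_bytes'] / 1e9:.2f} GB")
    print(f"total:  {est['total_bytes'] / 1e18:.2f} EB")
```

The exact figures land slightly under the rounded ones in the text (a frame is closer to 2.8 MB than 3 MB), which is fine: in an interview, the order of magnitude is what matters.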
- Bandwidth estimate: Calculate how much data is streamed to serve a single user for a particular use case.
  - Above, we deduced that a video can be streamed at about 1.8 MB per second. Since that is the approximate minimum, add some buffer and plan to stream at a slightly higher rate of maybe 2.5 MB per second (about 20 Mbps).
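As a small sketch of the bandwidth step, the snippet below pads the minimum rate and converts units. It reads the per-second figure from the storage estimate as 1.8 megabytes per second, since each channel value was counted as one byte. The helper names, the 40% headroom, and the 1 million concurrent streams are all assumptions chosen for illustration, not figures from a real service.

```python
def to_mbps(megabytes_per_sec: float) -> float:
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return megabytes_per_sec * 8


def with_headroom(rate: float, headroom: float = 0.4) -> float:
    """Pad a minimum rate by a fractional buffer; 0.4 is an assumed value."""
    return rate * (1 + headroom)


def aggregate_gbps(per_stream_mbps: float, concurrent_streams: int) -> float:
    """Total egress in gigabits per second for a number of simultaneous streams."""
    return per_stream_mbps * concurrent_streams / 1000


if __name__ == "__main__":
    min_rate = 1.8                     # MB/s, from the storage estimate
    target = with_headroom(min_rate)   # about 2.5 MB/s
    print(f"per stream: {target:.2f} MB/s = {to_mbps(target):.1f} Mbps")
    # 1 million concurrent streams is a made-up load, only to show the scaling.
    total = aggregate_gbps(to_mbps(target), 1_000_000)
    print(f"1M streams: {total:,.0f} Gbps")
```

Keeping bytes and bits in separate helpers is deliberate: mixing up MB/s and Mbps is an easy factor-of-8 mistake to make at the whiteboard.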
- Reads vs. writes: Other estimates may also influence your design, such as the ratio between reads and writes, as well as latency requirements for reads and writes.
  - For example, we may assume that for every 100 hours of videos watched, 1 hour of video is uploaded. Given this ratio, we can then say that video reads are more important than video uploads, if our goal is to maximize service for the majority of users.
Step 4. High-level approach. This is where you outline the several tiers of your application: the client, logic tier, and database. Then, outline the database schema.
- The overall architecture is standard: Set up a three-tier architecture with the client, a logic tier, then a database. This first version of the architecture will look highly similar if not identical across many of your system design interviews. This will eventually morph depending on the question, but this is a safe template for the first pass. Optionally go through the flow for a read and a write.
- Outline the entities and their relationships in your database, starting off with an entity-relationship diagram. If you recall from your databases course, assign identifier columns according to the appropriate relationship: one-to-one, one-to-many, or many-to-many. You can review database design in SQL 101: Databases for Beginners.
- For example, for YouTube, we would have users who can upload videos. To facilitate recommendations, we could also store a user's rating of a video.
erDiagram
  USER ||--o{ RATING : submits
  USER {
    int userId
    string name
  }
  USER ||--o{ VIDEO : uploads
  VIDEO {
    int videoId
    string title
    string resourceUrl
  }
  RATING }o--|| VIDEO : for
  RATING {
    int isPositive
  }
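To make the three tiers and the schema concrete, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the database tier. The table names, the `uploaderId` foreign key (implied by the one-to-many "uploads" relationship but not drawn in the diagram), the composite key on ratings, and the `upload_video` and `get_video` helpers are all assumptions made for illustration.

```python
import sqlite3

# Database tier: in-memory SQLite stands in for the real database (an assumption).
SCHEMA = """
CREATE TABLE users   (userId INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE videos  (videoId INTEGER PRIMARY KEY, title TEXT NOT NULL,
                      resourceUrl TEXT NOT NULL,
                      uploaderId INTEGER NOT NULL REFERENCES users(userId));
CREATE TABLE ratings (userId INTEGER NOT NULL REFERENCES users(userId),
                      videoId INTEGER NOT NULL REFERENCES videos(videoId),
                      isPositive INTEGER NOT NULL,
                      PRIMARY KEY (userId, videoId));
"""


def connect() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


# Logic tier: the only code that talks to the database.
def upload_video(db: sqlite3.Connection, uploader_id: int, title: str, url: str) -> int:
    """Write path: store the video's metadata and return its new id."""
    cur = db.execute(
        "INSERT INTO videos (title, resourceUrl, uploaderId) VALUES (?, ?, ?)",
        (title, url, uploader_id),
    )
    db.commit()
    return cur.lastrowid


def get_video(db: sqlite3.Connection, video_id: int):
    """Read path: look up the metadata a client needs to start playback."""
    row = db.execute(
        "SELECT title, resourceUrl FROM videos WHERE videoId = ?", (video_id,)
    ).fetchone()
    return None if row is None else {"title": row[0], "resourceUrl": row[1]}


# Client tier: calls the logic tier and never issues SQL itself.
if __name__ == "__main__":
    db = connect()
    user_id = db.execute("INSERT INTO users (name) VALUES (?)", ("alice",)).lastrowid
    video_id = upload_video(db, user_id, "My first video", "https://example.com/v/1")
    print(get_video(db, video_id))
```

Walking through one write (`upload_video`) and one read (`get_video`) like this is a quick way to sanity-check that every arrow in your architecture diagram has a purpose.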
Step 5. Low-level details. This is where you begin making improvements and anticipating issues, adjusting the design as needed. If you haven't already, tailor the design to your application's key challenge that we identified previously.
- This is where the interview can become more open-ended, as the interviewer can push you towards different aspects of the design. With that said, if you anticipate issues or potential improvements in your design, you can steer the interview towards a topic you're more familiar with.
- At this point, improvements center on one of three pillars for any application. To practice, pick a random tier in your application, and pick one of the pillars below; brainstorm how you would make that tier more robust in one of the following ways:
  - Reliability: Fault tolerance for issues such as user error or server failure, improved with resource redundancy and built-in failure mitigation.
  - Scalability: Sustaining performance with changing demands, scaling up your system to meet higher loads and scaling it down to avoid wasting resources.
  - Availability: Amount of time that the service is accessible by users without failure, ensuring users can use your application anytime and anywhere.
- For example, say your goal is to maximize availability of your application, and your prompt is to design Google. Crawling the web is a compute-intensive process, and unfortunately, running the crawler in the same logic tier that serves search queries can hurt availability. Quite simply, the logic tier may be busy with crawling when a user performs a search query. As a result, split the architecture, so that a set of asynchronous workers crawls and indexes the web, independently of the search query logic tier.
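The split between crawling and serving queries can be sketched with a work queue. This is a toy illustration under stated assumptions: `FAKE_WEB` is a hypothetical stand-in for fetching real pages, the index is a plain in-memory dictionary, and threads stand in for what would more likely be separate worker processes or machines in a real deployment.

```python
import queue
import threading

# Hypothetical stand-in for the web: URL -> page text.
FAKE_WEB = {
    "a.example": "system design interview tips",
    "b.example": "minecraft torch recipe",
}

index: dict[str, set[str]] = {}   # word -> URLs containing that word
index_lock = threading.Lock()
crawl_queue: "queue.Queue[str | None]" = queue.Queue()


def crawl_worker() -> None:
    """Background worker: fetch and index pages, independent of the search path."""
    while True:
        url = crawl_queue.get()
        try:
            if url is None:       # sentinel: shut this worker down
                return
            for word in FAKE_WEB.get(url, "").split():
                with index_lock:
                    index.setdefault(word, set()).add(url)
        finally:
            crawl_queue.task_done()


def search(word: str) -> set[str]:
    """Query path: only reads the index and never triggers a crawl."""
    with index_lock:
        return set(index.get(word, set()))


def crawl_all(urls, num_workers: int = 2) -> None:
    """Enqueue URLs, let the workers drain the queue, then stop the workers."""
    workers = [threading.Thread(target=crawl_worker, daemon=True) for _ in range(num_workers)]
    for worker in workers:
        worker.start()
    for url in urls:
        crawl_queue.put(url)
    crawl_queue.join()            # wait until every queued URL is indexed
    for _ in workers:
        crawl_queue.put(None)
    for worker in workers:
        worker.join()


if __name__ == "__main__":
    crawl_all(FAKE_WEB)
    print(search("torch"))
```

The design point is that `search` never waits on the queue: a slow or failed crawl delays index freshness, not query availability.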
In summary, prepare for the above steps in your system design interview. Just by knowing these steps alone, you're taking a large step forward in your own system design abilities — both inside and outside of the interview.
Practice, practice, practice.
Just as with any other interview, the key here is to practice. Definitely cover the technical topics and know how to synthesize them into a cohesive application, but most important is to communicate that design clearly.
- Practice drawing and talking. As we discussed above, this is a lot harder than it sounds. In fact, you're really juggling three tasks at once: thinking, talking, and drawing. Practice interleaving these three tasks seamlessly.
- Practice identifying application-specific challenges. Anticipate the purpose of the question, by identifying the differences between the application and the "vanilla" design. Where would the "vanilla" design fail? Where would performance suffer?
- Practice translating challenges into design decisions. After identifying the challenge, use this knowledge to alter your high-level and low-level design decisions. Once you've determined the slots you need to fill — for example, the database type — determine what options are available and a quick summary of when to use which. For example, use NoSQL for large amounts of data; use MySQL for structured data with potentially complex relationships between different entities.
To practice, pick random applications and run through the above process. Outline the vanilla system design, then tailor your design using the options you've identified above.
Now, you know both what and how to practice. Record yourself responding to a system design prompt, or find a friend to practice together with. I do want to reiterate: You don't need an expert to practice with. A large majority of your mistakes, especially starting off, will be obvious ones that any of your colleagues — and especially you, being your own biggest critic — can pick up on. Use this to your advantage.
[1] There are a variety of effective diagramming tools for system design, such as dbdiagram.io for entity-relationship database diagrams or sequencediagram.org for sequence diagrams, flowcharts, and processes.