HARD
SYSTEM DESIGN

How to Solve Design LeetCode

Written By
Adam Bhula

What is the Design LeetCode Problem?

The Design LeetCode problem asks us to create a coding competition platform with a leaderboard and an execution environment. It is very similar to the "Design Online Judge" question, but it is more akin to a constantly running competition than a one-off. Users should be able to submit code and receive a pass/fail response, as well as a score that tells them how well they performed. This problem is very open-ended and has no one true solution. That said, it is commonly asked at top FAANG companies like Amazon, Google, and Meta, so here we'll showcase some key points to discuss if you come across it in an interview.

An Example of the Design LeetCode Problem

Design a coding competition platform with a leaderboard and execution environment.

How to Solve the Design LeetCode Problem

To tackle this problem, let’s identify our functional and non-functional requirements first. For this particular question, it’s very important to discern what your interviewer prioritizes. It’s also vital to outline the scope of the problem: there is a lot of overlap on the back-end between Design LeetCode and a similar problem, “Design Online Judge.” The distinction between them, a continuously running competition versus a one-off, is typically important to call out at the L4 level, but less so at the senior level when discussing the solution. Now let’s look at our requirements:

Functional requirements:

  • Users are able to submit and run code.
  • Users should be able to run their code against some manually provided inputs.
  • Problem server tests whether user code passes.
  • Users get some feedback on their code: pass/fail, as well as performance metrics on space/time efficiency (LeetCode itself offers nice “compare yourself to others” infographics, which are worth mentioning during the interview as well!).
  • User’s code is saved and can be pulled up again where they left off.
  • Session handling, users can navigate away from the site and return to their code later.
  • Safe code execution (containerize everything — protect yourself from rm -rf / stuff).
  • Horizontally scalable user code runners / workers (same as in the image filtering problem above!).
  • Leaderboard (we can inquire about the type of leaderboard, but this doesn’t change our design much).
  • Support various programming languages and libraries, including compiled languages.
  • Test input files are matched against expected output files, with support for custom checkers, since not every problem can be graded by simply comparing the actual output against the correct one.
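The last point about custom checkers can be made concrete with a pluggable checker interface. This is a minimal sketch under our own assumptions (the class and method names are illustrative, not from any real judge implementation): an exact-match checker covers the common case, while a tolerance-based checker shows why a simple diff is not always enough.

```python
from abc import ABC, abstractmethod


class Checker(ABC):
    """Decides whether a submission's output is acceptable for one test case."""

    @abstractmethod
    def check(self, test_input: str, expected: str, actual: str) -> bool: ...


class ExactMatchChecker(Checker):
    # The common case: compare trimmed output against the reference file.
    def check(self, test_input: str, expected: str, actual: str) -> bool:
        return expected.strip() == actual.strip()


class FloatToleranceChecker(Checker):
    # A custom checker: numeric answers within an epsilon pass, so problems
    # with floating-point answers don't fail on harmless rounding noise.
    def __init__(self, eps: float = 1e-6):
        self.eps = eps

    def check(self, test_input: str, expected: str, actual: str) -> bool:
        try:
            return abs(float(expected) - float(actual)) <= self.eps
        except ValueError:
            return False
```

Problems that need a bespoke grader (e.g., "any valid topological order passes") would simply register their own `Checker` subclass.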

Non-functional Requirements:

  • The in-browser IDE, code completion, syntax highlighting, jumping to errors, etc.
  • Profile creation.

Extra notes:

  • Consider rate limiting and how to best implement it.
  • Important to consider failure modes for the message queue and comparison of existing services. Refer to our discussion here (links to previous question)
  • Consider how to store user submissions.
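On the rate-limiting note above: a token bucket is one standard way to implement per-user submission throttling. Here is a minimal in-process sketch (a production system would keep the buckets in something shared like Redis; the class and parameter names are our own):

```python
import time


class TokenBucket:
    """Allow at most `rate` submissions per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

One bucket per user (or per IP) keeps a burst of rapid re-submissions from flooding the grading queue while still letting normal usage through.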

Main focus

A problem like this has many topics to cover, so candidates often struggle to demonstrate a clear separation of concerns. Ideally, a candidate should accurately identify the main concerns of the system and allocate most of their time to them. By doing this, a candidate can transform their proposed solution from a jumbled mess of priorities that is just passable at an L4 into a strong L5 hire.

With that said, the two pillars of this problem are sandboxing user code execution (containerization/VMs are keywords to mention!) and the task queue to efficiently scale execution across multiple workers.

Sandboxing:

To ensure safe code execution, sandboxing is a critical component. By using containerization technology like Docker, isolated execution environments can be created for running user code. Each user's code runs within its own container, separate from the host system, providing a sandboxed environment. Techniques such as resource limitations, network isolation, file system restrictions, timeouts, and execution limits help enforce security within the containers. This would scale independently and is orchestrated like a task queue.
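The resource limitations, network isolation, and timeouts mentioned above map directly onto `docker run` flags. The sketch below builds such an invocation without executing it; the flags shown are real Docker options, while the image name, mount paths, and entry script are placeholder assumptions (and it assumes coreutils `timeout` exists inside the image):

```python
def build_sandbox_cmd(image: str, code_dir: str, timeout_s: int = 5,
                      mem: str = "256m", cpus: str = "0.5") -> list[str]:
    """Build a `docker run` invocation that boxes in untrusted user code."""
    return [
        "docker", "run",
        "--rm",                              # discard the container afterwards
        "--network", "none",                 # no network access from user code
        "--memory", mem,                     # hard memory cap
        "--cpus", cpus,                      # CPU quota
        "--read-only",                       # immutable root filesystem
        "-v", f"{code_dir}:/workspace:ro",   # mount the submission read-only
        image,
        "timeout", str(timeout_s), "/workspace/run.sh",  # wall-clock limit
    ]
```

A worker would pass this list to `subprocess.run(...)` and treat a non-zero exit code (including `timeout`'s) as a failed run. Even a harmless-looking `rm -rf /` then only damages a throwaway container.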

Storage:

A reliable storage system, such as a SQL database, plays a crucial role in the coding competition platform. It enables users to revisit questions and easily access their code submissions. Since the user-submitted code snippets are far larger than the metadata, to keep the SQL DB small in volume, the user-submitted code is stored on S3 and referred to from the SQL tables by a SHA-256 hash (or any other hash with a very low probability of collision). Along with this hash, the SQL DB also stores the relevant metadata, such as problem IDs, user IDs, and timestamps. Since a second or two here and there does make a difference for this problem, it suffices to always deal in server-side timestamps. Additionally, the storage system supports the leaderboard functionality by storing user rankings, performance metrics, and historical records. This allows the leaderboard to accurately reflect user standings and provide percentile information. With this in place, users can conveniently access their code, review past submissions, and monitor their progress on the platform.
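The split between the blob store and the metadata table can be sketched as content-addressed storage. The dictionaries below stand in for S3 and the SQL table (the function names and row shape are illustrative assumptions), and hashing by content means identical re-submissions deduplicate for free:

```python
import hashlib
import time

blob_store: dict[str, bytes] = {}   # stand-in for S3: digest -> code bytes
metadata: list[dict] = []           # stand-in for the SQL submissions table


def save_submission(user_id: int, problem_id: int, code: str) -> str:
    """Store the code blob keyed by its SHA-256 digest; keep only metadata in 'SQL'."""
    digest = hashlib.sha256(code.encode()).hexdigest()
    blob_store[digest] = code.encode()       # "S3 PUT" (idempotent: same code, same key)
    metadata.append({
        "user_id": user_id,
        "problem_id": problem_id,
        "code_sha256": digest,
        "submitted_at": time.time(),         # server-side timestamp, never client-side
    })
    return digest


def load_submission(digest: str) -> str:
    return blob_store[digest].decode()       # "S3 GET" by content hash
```

The SQL rows stay tiny (a hash plus a few IDs and a timestamp), so even hundreds of millions of submissions remain a manageable relational dataset.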

Session IDs:

In addition to a reliable storage system, incorporating session IDs for virtual machines can improve our system. Session IDs can be assigned to each user's virtual machine instance, allowing individual sessions to be tracked and managed. This ensures that users can easily resume their work, access their code, and retain their session state, even if they navigate away from the platform.
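The session mechanics are simple enough to sketch directly. This toy version (names and state fields are our own assumptions; a real deployment would back it with Redis or a DB and attach the ID to a cookie) shows the resume behavior described above:

```python
import uuid

sessions: dict[str, dict] = {}   # session_id -> per-user editor state


def open_session(user_id: int) -> str:
    """Create a session the browser can present later to resume work."""
    sid = uuid.uuid4().hex
    sessions[sid] = {"user_id": user_id, "draft": "", "cursor": 0}
    return sid


def autosave(sid: str, draft: str, cursor: int) -> None:
    # Called periodically by the in-browser IDE so no keystrokes are lost.
    sessions[sid].update(draft=draft, cursor=cursor)


def resume(sid: str) -> dict:
    # Navigating back to the problem restores the saved editor state.
    return sessions[sid]
```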

Static Content:

Your interviewer might ask how you would retrieve and display static content such as problem statements. This gives you the opportunity to lead the discussion: clearly state that the retrieval of problems is trivial and doesn’t need much time spent on it; this can even be WordPress or static pages. You can mention CDNs (e.g., S3 + CloudFront), spend a few minutes explaining how they work, and note that they are cheap and low-latency. Problem statements and the leaderboard can be handled by a single server, with an estimate of most likely under 1,000 QPS.

Caching:

When it comes to caching content, let’s explore what would need to be cached. First of all, static content such as problem statements, test cases, and images related to problem statements: these don’t change often and are accessed quite frequently, so they are prime candidates for caching. For our leaderboard, we wouldn’t want to cache the whole leaderboard (every single user submission ranking) but only the top N submissions, where N is an arbitrary cutoff point like 10 or 100.
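The top-N cache can be kept with a bounded heap so memory stays O(N) no matter how many submissions flow through. A minimal sketch under our own naming (ranking here is by runtime only, for simplicity):

```python
import heapq


class TopNCache:
    """Keep only the N best (lowest-runtime) submissions in memory."""

    def __init__(self, n: int):
        self.n = n
        self.heap: list[tuple[int, str]] = []  # negated runtimes: heap[0] is the worst kept

    def offer(self, user: str, runtime_ms: int) -> None:
        entry = (-runtime_ms, user)
        if len(self.heap) < self.n:
            heapq.heappush(self.heap, entry)
        elif entry > self.heap[0]:             # faster than the current worst kept entry
            heapq.heapreplace(self.heap, entry)

    def top(self) -> list[tuple[str, int]]:
        # Fastest first; un-negate runtimes for display.
        return [(user, -neg) for neg, user in sorted(self.heap, reverse=True)]
```

Every graded submission is `offer`ed to the cache; reads of the leaderboard page then never touch the database for the common "show me the top 100" request.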

A note on DB type choice. Once user submissions, problem statements, and test data / test code are taken out of the picture, the total volume of data to work with is quite manageable. Besides, certain cross-key consistency guarantees are quite relevant: for instance, if two users, A and B, have submitted correct solutions to the same problem, and user A’s submission came in a couple of seconds earlier but took longer to be judged correct, it is plausible to expect user A to be displayed above user B in the leaderboard. Since we need to respect “data races” of this kind, a SQL DB handles them organically, while making this happen in a NoSQL store would require extra work on the developer side (and lead to more development time spent, extra bugs, etc.).
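The ordering guarantee is easy to demonstrate with SQLite standing in for the relational DB (table and column names are our own). Because rows carry the server-side submission timestamp, ranking is independent of which grading worker happened to finish first:

```python
import sqlite3

# In-memory stand-in for the relational submissions table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accepted (user TEXT, problem_id INTEGER, submitted_at REAL)")

# User A submitted earlier but was graded later; the INSERT order below
# mimics grading-completion order, not submission order.
db.execute("INSERT INTO accepted VALUES ('B', 7, 1002.0)")
db.execute("INSERT INTO accepted VALUES ('A', 7, 1000.0)")
db.commit()

# Ordering by submission time ranks A above B regardless of grading order.
rows = db.execute(
    "SELECT user FROM accepted WHERE problem_id = 7 ORDER BY submitted_at"
).fetchall()
```

With transactions, the insert of the verdict and any leaderboard bookkeeping commit atomically, which is exactly the "organic" handling the paragraph above refers to.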

Your interviews have a strict time limit, so do not waste time on low-priority and trivial topics. Let’s discuss the crux of the problem — the task queue and containerization — sooner!

Message Queue (Kafka)

To avoid putting a heavy load on our system, we’ll need a message queue to handle our user submission file processing asynchronously. One possible option is Kafka, a distributed streaming platform known for its high throughput and fault-tolerant design. It is commonly used for real-time data streaming and processing. Kafka's architecture allows for scalable and fault-tolerant message processing, making it suitable for high-volume and distributed systems. When it comes to grading user submissions, Kafka can be used as a job queue system by treating problem grading requests as events or messages. These events can be published to Kafka topics, and multiple worker instances can subscribe to these topics and process the messages concurrently. Kafka's partitioning and replication mechanisms ensure fault tolerance and scalability, so it’s suitable for handling large volumes of problem grading requests. Overall, Kafka offers robust job queue capabilities and allows for efficient distribution and parallel execution of problem grading tasks.
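Since a real Kafka broker can’t fit in a few lines, here is the consumer-group pattern sketched with the standard library: a shared queue of grading jobs drained concurrently by several workers. The sentinel-shutdown scheme and the placeholder "grader" are our own simplifications:

```python
import queue
import threading

grading_queue: queue.Queue = queue.Queue()   # stand-in for a Kafka topic of grading jobs
results: dict[int, str] = {}
results_lock = threading.Lock()


def worker() -> None:
    """One grading worker: pulls jobs until it sees the shutdown sentinel."""
    while True:
        job = grading_queue.get()
        if job is None:
            grading_queue.task_done()
            break
        submission_id, code = job
        # Placeholder grader; in reality this launches the sandboxed container.
        verdict = "PASS" if "return" in code else "FAIL"
        with results_lock:
            results[submission_id] = verdict
        grading_queue.task_done()


# Several workers drain the queue concurrently, like a Kafka consumer group.
threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

for sub_id, code in [(1, "def f(): return 1"), (2, "def f(): pass")]:
    grading_queue.put((sub_id, code))
for _ in threads:
    grading_queue.put(None)   # one shutdown sentinel per worker
for t in threads:
    t.join()
```

Scaling out is then a matter of adding worker processes (or partitions, in Kafka terms); the producer side never blocks on grading.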

Note that this logic is entering a very deep pocket of the SysDesign interview conversation, and it is highly unlikely that navigating it is a requirement for an L4/L5 SD interview. For L4/L5 it would suffice to say that, since occasional, very rare, double-processing is not a problem, a RabbitMQ-based solution is also a very feasible option, and, in practice, both Kafka and RabbitMQ (or even a PostgreSQL table with an “autoincrement” column that would store the set of tasks to complete!) are perfectly defensible here.

To find out more about the failure modes of Kafka and its limitations, read our post here

[Footnote from authors: While Kafka is not the perfect solution in the long run — ref. the “fast tasks stuck in the queue after the slow ones, with other workers idle” problem — we do recommend Kafka as an L4/L5 solution to this interview question. It is easier to justify for a candidate, and it’s still very much above the hiring bar. Eliminating the shortcomings of Kafka for this particular problem is an L6+ problem, and unless you are comfortable getting into the weeds with your interviewer, this is not the path we advise taking for an L4/L5 candidate.]

Leaderboard:

For the leaderboard presentation, there are several options to consider. Sortable tables can be used to display the leaderboard, allowing users to sort by criteria like runtime, space complexity, or overall performance. This would be a small-ish, low-traffic, high read-to-write-ratio service. This would be our “global” leaderboard (aka “best solutions”) listed as top 50/100 or so. This should be strongly consistent, of course.

Scalability/Reliability:

By leveraging cloud infrastructure services from major providers like AWS, we can make our system more reliable and resilient. Utilizing multiple availability zones or regions ensures high availability and fault tolerance. Implementing automated backup and disaster recovery mechanisms protects against data loss and enables quick recovery. We can use CDNs to distribute static assets, reducing latency and improving performance for users worldwide. Static assets in this case would refer to problem statements, images related to these problems, test cases, etc.

Non-functional:

In terms of syntax highlighting and module import, the platform should provide features that enhance the coding experience. Syntax highlighting helps visually distinguish code elements and improves readability. Supporting module import allows users to leverage existing libraries and functionalities within their code submissions.

Senior-level capacity planning:

Last but not least: if you feel like showing off L5/L6+ capacity planning skills, it might be worth mentioning cost optimization. Generally speaking, judging user-submitted code is very CPU-intensive, and the bill can easily run to thousands or tens of thousands of dollars in AWS EC2 or Lambda costs. The particular case of the LeetCode user-code-running backend is a strong candidate for purchasing custom hardware, as this could easily save the company hundreds of thousands of dollars per year, assuming the service we are building is popular enough.

About interviewing.io

interviewing.io is a mock interview practice platform. We've hosted over 100K mock interviews, conducted by senior engineers from FAANG & other top companies. We've drawn on data from these interviews to bring you the best interview prep resource on the web.

We know exactly what to do and say to get the company, title, and salary you want.

Interview prep and job hunting are chaos and pain. We can help. Really.