Hipster Samurai: Hello?
Deliberate Alligator: Hey, can you hear me?
Hipster Samurai: Yes. Can you hear me?
Deliberate Alligator: Yeah. Hey, sorry I'm a couple minutes late. I was having some audio issues connecting. How's your day going?
Hipster Samurai: Pretty good. Just practicing interviewing, but how about you?
Deliberate Alligator: Pretty good as well. Yeah, looking forward to practicing some interviewing with you. Awesome. So today it looks like you're up for system design, right. And this is specifically a mentorship session. Is this your first system design or mentorship session? Have you used interviewing.io before.
Hipster Samurai: Yes, I did a mock system design a couple of months ago, but I needed some practice, so I've been reading System Design Interview Volume Two. I implemented a stock exchange. That's one of the things I've had a lot of practice with since then.
Deliberate Alligator: Okay, sweet. So I think the mentorship sessions, I don't know if you intended to specifically do this one, but they're a little bit more kind of open ended. I like to ask kind of what you want to get out of the interview. We can kind of tailor it however you want. I can give you it looks like you're a pretty experienced person. Like ten plus years of experience is what's written down. Is that accurate?
Hipster Samurai: Yes.
Deliberate Alligator: Okay, so I'm assuming you're kind of going for staff plus roles, like staff or principal engineer roles or senior engineer roles right now.
Hipster Samurai: Senior Engineer. Yes. And eventually staff.
Deliberate Alligator: Cool. Sounds good. Yeah. So, I mean, we can work through kind of like just the system design question together. Do you have anything, I guess, in particular that you would like to get out of today's session before we jump in?
Hipster Samurai: Yes. My goal is to work with you through a system design question, so I would like it to be like a normal interview almost. So we just say, okay, do this. You start with do X, Y, and Z, and then I start and then we keep go for a bit and then just iterate together instead of just full presentation mode. It's like an interview where we're also collaborating together more.
Deliberate Alligator: Yeah, I don't do any of those kinds of interviews anyway, so I think that'll be fine. I always do that. Have you ever had someone kind of walk you through a generic strategy for system design interviews? Because one of the things that people struggle the most with, I find, is time and knowing what stuff to talk about and how to kind of make use of that bidirectional communication. Do you have kind of like a strategy when you're going into these things? If not, I think we should spend just a couple of minutes to talk briefly about that.
Hipster Samurai: Yes, I've been watching a lot of these videos, but I'm still working on that and I've been just doing some practice sessions with myself. But I would really like to go over that. That would be really helpful.
Deliberate Alligator: Okay. I'll send you a link to a blog post in the main chat here. Just so that you can take a look at it later on. You don't have to read this right now. It's pretty lengthy, but I think there's kind of four main steps and we can put them here on our whiteboard in Excalidraw so that we kind of do them together over time. But the first step is just doing a basic outline, right? So coming up with the requirements, asking lots of questions. The second step is getting the high level design and making sure that you get sign off on it and then you go into more detail. And then lastly, you make it scale and make it operational, things like that. You do follow up questions, right?
Hipster Samurai: Yeah.
Deliberate Alligator: The super important thing, I think where people get really lost is how much time to spend in each one. So we've got about an hour today and let's spend maybe like 45 minutes or something on this together and then we'll spend more time on other things. But you're probably looking at 15, 10, maybe like 10 and 10, something like this, maybe actually closer to this. Right. So this is roughly how you'll want to spend 45 minutes here, right? And then obviously there's more kind of specific things. That blog post will walk you through back of the envelope math and tips and tricks and stuff like that. But I think let's just maybe start here. And you can use this as kind of a cheat sheet as we go through the interview and then you can look back at this video and kind of time yourself and how things went. Does that sound good? I know that's a super high level overview. We can spend more time at the end in more detail.
Hipster Samurai: Okay, yeah, that sounds good to me.
Deliberate Alligator: Cool. Any questions, I guess before we jump in?
Hipster Samurai: Well, I think the questions I would have are mostly things we could answer by jumping in, like the nature of what we would discuss in each section. So I think let's jump in.
Deliberate Alligator: Okay. Yeah. So I want us to design a job scheduler for our cloud service. And it'll be used by any customer in the world, so it'll be a public facing service. And what I really want to have customers be able to do is they can specify a job. And a job is kind of any kind of code or algorithm that they want to upload. And I want them to be able to run that job either as a one off, so kind of like a manual trigger, or on a schedule. So like every Tuesday at 2:00 p.m. or something like that.
Hipster Samurai: Okay, I'm going to write that out on a text file. Job scheduler, public facing. I'm going to make these kind of like the functional requirements because my understanding of the first section is outline use cases and constraints to the system. And I know that the first part is often just like, what are we building? The overview: job scheduler, public facing, customers post a job that can be some kind of code. I will have a question about that in a second. And they can make it one off jobs or scheduled, maybe even repeat. Does that include everything that I mentioned? Okay. And then a question about posting a job. Does that mean they upload code?
Deliberate Alligator: Right? Yeah, I'd like us to discuss that a little bit. Ultimately, what I was envisioning is I want customers to be able to put in kind of any code of any language that they want. But I'm open minded as to how we should create a surface that allows them to do that in the best possible way. But, for example, maybe I'm writing in Python and my program is like, print hello, world. And then I go into C# and I write a program that uploads a file into Amazon S3, AWS S3, and then downloads that and then puts it into Azure Blob or something like that. So I want a lot of flexibility in terms of the computation that a customer is going to upload.
Hipster Samurai: Okay. And this should be pretty free flowing. And customers can upload anything. This should be fully automated. So I think we're going to need some kind of virtualization for this because we don't want customers to accidentally upload something that breaks other things. So I'm going to create a section called Non Functional Requirements for this. Because assuming you just want to be able to upload anything, that's pretty clear and there might be some limitations on that, but definitely isolation and security. Should there be a length of time? Like maybe one of the things we might want to build is customers can pay more to run code that runs longer.
Deliberate Alligator: Yeah, I love that idea. Let's start off really simple and just say that the maximum execution time for a job is 24 hours. And if we have time at the end, we can design more of these bells and whistles.
Hipster Samurai: Okay. And the size of the customer base now matters a lot. If a job can run 24 hours, it's a lot of resources. Could we say maybe 10,000 customers at most?
Deliberate Alligator: Yeah. I want us to have 100,000 customers.
Hipster Samurai: 100,000 customers. Okay. And each customer can run one job. Is that reasonable?
Deliberate Alligator: Yeah. So let me give you some kind of stats of our system. So we have 100 million jobs per customer and about ten jobs per second per customer ran in the system.
Hipster Samurai: Okay, cool. So it's 1 million jobs per customer and this is just the number of possible jobs.
Deliberate Alligator: Right.
Hipster Samurai: And ten jobs per second ran per customer. Okay. This is a lot of yeah.
Deliberate Alligator: And the other stat would be that we run 100 billion jobs a day. So we have a pretty beefy system here.
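The load figures quoted in this exchange can be cross-checked with a quick back-of-the-envelope calculation. The sketch below uses only the numbers stated in the conversation (100,000 customers, about ten jobs per second per customer); the variable names are illustrative. It suggests the "100 billion jobs a day" figure is a rounded-up version of roughly 86 billion.

```python
# Back-of-the-envelope check of the stated load figures.
CUSTOMERS = 100_000                 # stated by the interviewer
JOBS_PER_SECOND_PER_CUSTOMER = 10   # stated by the interviewer
SECONDS_PER_DAY = 24 * 60 * 60

jobs_per_second = CUSTOMERS * JOBS_PER_SECOND_PER_CUSTOMER
jobs_per_day = jobs_per_second * SECONDS_PER_DAY

print(f"{jobs_per_second:,} jobs/second")  # 1,000,000 jobs/second
print(f"{jobs_per_day:,} jobs/day")        # 86,400,000,000 jobs/day
```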
Hipster Samurai: Okay. Knowing that, I think controlling the cost of hardware might be one of the things we could definitely take a look at. And I'm assuming we don't provide any extra storage for the system, just code and network calls per job, is that right?
Deliberate Alligator: Let's say that they have access to a very ephemeral storage disk. Just enough to kind of write out a CSV file or something like that, but just a very nominal amount.
Hipster Samurai: Okay, so at least it's just code. Okay, that's pretty good. And then public facing, I'm assuming we're going to want a monitoring solution. So you submit a job and then you see how it is and get some kind of note. There's no notification system. I don't know, what do you want for the public facing? Like, do you want notifications or something?
Deliberate Alligator: Yeah, I love both those ideas. Let's not spend too much time on the UI. Let's say that maybe we design a very simple API layer and we can talk about what APIs we need, but the UI is just a wrap around that. And maybe it's like a console or whatever. Maybe we have a CLI or an SDK that call into our API. And in terms of APIs, we can figure out what we want to support for eventing. And I think in particular notifications. We don't have to support that off the start. We can just give people kind of like a status API or something like that, maybe for their jobs. That way they could know whether they're successful or failed or whatever else, so they could call into that.
Hipster Samurai: Okay, so we'll provide some kind of API, but we'll focus on that in a bit. The last question is, is there any memory limitation per job? And can jobs run slower if possible? Like, if we have too many jobs on server, can we tell customers that all the jobs are just going to run a little slower?
Deliberate Alligator: Yeah. So I think there's two questions there. One question is kind of what is our SLA around how quickly a job should be executing by? Right?
Hipster Samurai: Yeah.
Deliberate Alligator: And then one question is... sorry, before we handle the second one, let's take the first question. We have an interesting SLA for that. I want it to be ideally, at most a couple seconds to the point a job gets scheduled. Right. And I'll give us the stretch goal of ideally it being less than 100 milliseconds from the moment someone kind of creates a job or tells it to execute to the point of it actually executing. Or similarly, you can think of that as like Tuesday at 2:00 p.m. comes around, and 100 milliseconds after 2:00 p.m. comes around, I would like that job to be running, ideally, in the schedule scenario. The second question was around the memory profile of an individual job. Let's just say that at most it's like 1 GB or something like that to start. I think that one could be similar to what you brought up at the start, where we could allow people to pay more and increase it or something? Maybe. But let's just start really simple and say that 1 GB or something of memory is the limit, and if you go past that, then we OOM and you crash.
Hipster Samurai: Okay, got it. So I'm going to start off by saying this will be very expensive.
Deliberate Alligator: Yes.
Hipster Samurai: And I don't see a way around that. But I do think that I can create, design something that scales and allows people to upload as many jobs as they need and scales, but I may not be able to support the service level agreement. It's going to be really expensive if we actually have 100,000 customers at first.
Deliberate Alligator: Yeah, let's talk about that in the pros and cons later on a little bit. Once we have more of the high level design, we can talk about the trade offs between money and speed.
Hipster Samurai: Okay, cool. So then I guess my first question is, I'm already at the 15 minute mark, so I've probably gone over time. But I do feel like there was a lot to clarify with 100 billion jobs. I'm not sure. Am I doing okay so far?
Deliberate Alligator: No, you're doing great. You're doing great so far. And take what I wrote there with a grain of salt. Those numbers are going to fluctuate a little bit, depending on the question you get. This one has a lot of open ended questions, right? Because you got to define what does it even mean to upload a job, how many jobs are there, what is the profile of these things? Right? So you're asking all wonderful questions. Don't worry about being super strict.
Hipster Samurai: Okay.
Deliberate Alligator: The main point is eventually we should put a cut off time, right? So probably in about five minutes or so, right? We got to put a cut off and start getting to the design.
Hipster Samurai: Okay, well, I think I want to sketch out just... I'm going to take a stab in the dark, and I have no idea what I should say next, but my imagination tells me that if I just throw out some kind of system here with a couple of key parts, that maybe we can start to architect something. I'm going to try this. So I know we're going to need a load balancer because we have a lot of customers all uploading jobs at the same time. And we're going to need a worker farm that runs all these jobs. We're going to need a database to track those jobs. And this database is going to have a lot of customers on it. 100 billion jobs a day. If each job is a single row in a database, it's 100 billion rows in a day. So we have to think about sharding and maybe even range based queries based on day. Yeah, just like that. So a load balancer and a worker farm and also an API layer between the load balancer and the worker farm. So far I've got these four components. So job comes in, goes direct to load balancer, goes to say, a stateless API layer. A new job comes in, goes to the database and then something needs to tell the worker farm to pick that up. So let's say, I don't know, some kind of message queue, like a message broker, some kind of communications layer between the API layer and the worker farm. And whether that's a message broker or a poll based system, I'm going to write that in as the comms layer. And that's kind of our sort of four piece overview so far. Five piece: one, two, three, four, five.
Deliberate Alligator: Do you think we should switch over to the whiteboard at this point? It may be easier. I'm wondering to kind of shift around the boxes and stuff.
Hipster Samurai: Oh, yeah, let's do that. This will be load balancer. You just need a load balancer if you do this stuff. Stateless API gateway. Let's see. And since everything is just I'm going to assume everything is API based for now and clients will be notified in some way. But wow, this is really nice. Stateless API gateway, comms interface like message queue or whatever it might be. And then let's see, worker farm, this actually runs the jobs. This is going to have some kind of virtualization and then a database to track jobs. Cool. I'm going to put the database to track.
Deliberate Alligator: Maybe we could put that in the center of the picture because I think the gateway and everything will kind of touch this database in this diagram so far, right?
Hipster Samurai: Yes. The API gateway gets a new job from the load balancer. I'm just going into a little bit of detail right now to just try and define what is going to happen here in the general flow so we can flesh it out. Because I just want to nail down these boxes before I start to go in all these details and just make sure I've got the right boxes here. Like the gateway gets a new job from the load balancer and then goes to the database to track jobs. Whether it's a new job or checking up on a job, whatever, the customer can go and access the database. This way if it's a new job, we put it into the message queue or have the worker farm poll the database or whatever it might be. But there's some way of communicating and the worker farm does the job. And then as it's completed the job, it goes back to the database and tells the database, hey, I finished it. And then there's still a comms interface between the API gateway and the worker farm. So maybe the worker farm can go tell the gateway, hey, we're done, and then tell the client. So it's pretty basic so far. And there's a way to scale this: the database through sharding and partitioning, and maybe we even choose a different kind of database. You can scale the API gateway, the load balancer and the worker farm can be scaled too. And same with the communications interface, we can change how we want to do that. Does that make sense so far?
Deliberate Alligator: Yeah, this all sounds pretty reasonable. I think we are missing a potential component here. Maybe we could walk through the two different scenarios that we have. One is to schedule a job, right, at say, like Tuesday at 3:00 p.m. And the other one is to run it as a one off. So I could imagine the run as a one off kind of being like execute job or something. Right, and then maybe that API hits our load balancer and then it inserts into the queue and then gets picked up by a worker or something. How about our scheduler path? That one I could picture maybe as like create schedule, we hit our API gateway, but then the rest isn't quite clear to me.
Hipster Samurai: Right. We need some kind of scheduler. So in Linux, cron is just a process that sleeps until the next thing. It's kind of almost like a while loop, but not quite. But I'm thinking I just need some kind of long running service that's just called a scheduler. And the scheduler just repeatedly checks this database of jobs that need to be run and the database is also tracking scheduled jobs too. So this database? Yeah, I'm going to clarify this. Jobs in progress, jobs scheduled or just progress and scheduled. Okay, so yeah, the scheduler is a loop almost that just repeatedly checks to see if certain jobs are scheduled. And there may be some issues with that considering how many jobs we have, but ultimately it's going to be something that's pulling jobs that are scheduled from this database and telling the worker farm through this communications interface to run a job.
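The "loop that repeatedly checks which jobs are due" can be sketched in a few lines. This is only an illustration, not the design settled on in the session: it assumes schedules fit in memory and are kept in a min-heap ordered by next run time, and the names `Schedule`, `Scheduler`, and `pop_due` are hypothetical. A real scheduler would load its slice of schedules from the database and publish due jobs to the comms layer instead of returning them.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Schedule:
    next_run: float                           # epoch seconds of the next execution
    schedule_id: str = field(compare=False)   # not used for heap ordering
    interval_s: float = field(compare=False)  # repeat interval; assumed > 0


class Scheduler:
    def __init__(self):
        self._heap = []  # min-heap keyed on next_run

    def add(self, schedule):
        heapq.heappush(self._heap, schedule)

    def pop_due(self, now):
        """Return IDs of schedules due at `now`, re-arming each for its next run."""
        due = []
        while self._heap and self._heap[0].next_run <= now:
            due.append(heapq.heappop(self._heap))
        for s in due:
            s.next_run += s.interval_s
            heapq.heappush(self._heap, s)
        return [s.schedule_id for s in due]


scheduler = Scheduler()
scheduler.add(Schedule(next_run=60, schedule_id="every-minute", interval_s=60))
scheduler.add(Schedule(next_run=300, schedule_id="every-5-minutes", interval_s=300))

nothing_due = scheduler.pop_due(now=59)
first_tick = scheduler.pop_due(now=60)
later_tick = scheduler.pop_due(now=300)
```

Because the heap keeps the earliest schedule at the top, each tick only touches the schedules that are actually due rather than scanning every row.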
Deliberate Alligator: Okay, yeah. Let's really quickly walk through the lifecycle here. So this is kind of like create. One of the boxes disappeared. Oh, you moved it up. Okay, cool. So yeah, we have kind of like create schedule and then maybe that inserts like a schedule record. Is that how you're thinking about this?
Hipster Samurai: Yeah, there's two types of records in the database. There's a schedule record and jobs that are in progress already. And then the jobs that are in progress, we track their state in that table. So two main tables. There's also maybe a customer table too, but for now I just want to focus on tracking that.
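The two record types described here, plus the status lookup mentioned earlier, could look roughly like the following. Every field name and the `JobStatus` values are assumptions made for illustration; the conversation only establishes that there is a schedule record and a record tracking the state of jobs in progress.

```python
from dataclasses import dataclass
from enum import Enum


class JobStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class ScheduleRecord:
    """One row per schedule: what to run and when to repeat it."""
    schedule_id: str
    customer_id: str
    job_id: str
    cron_expression: str  # hypothetical: e.g. "0 14 * * 2" for Tuesdays at 2 p.m.


@dataclass
class JobRunRecord:
    """One row per execution, updated by workers as the run progresses."""
    run_id: str
    customer_id: str
    job_id: str
    status: JobStatus = JobStatus.QUEUED


def get_status(runs, run_id):
    """Minimal stand-in for a status API backed by a table of runs."""
    return runs[run_id].status


runs = {"run-1": JobRunRecord("run-1", "customer-1", "job-1")}
runs["run-1"].status = JobStatus.RUNNING
current = get_status(runs, "run-1")
```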
Deliberate Alligator: Cool. Yeah, let's spend a little bit of time to design this in more detail.
Hipster Samurai: Okay.
Deliberate Alligator: I could picture this as kind of like create schedule, then we insert a schedule record like you said and then I guess the scheduler is polling the database to get this is going to like get schedules. Is that how you're thinking about this? Okay.
Hipster Samurai: Yeah, because if you have 100 billion jobs a day and who knows how many of them are scheduled, there's no restrictions on that. It could be an issue. I'm thinking that maybe even instead of a scheduler, we might need something kind of like almost like a newsfeed generator for the worker farm. So there's something that's continuously generating what new jobs need to be ran and then something else just picks that up repeatedly and sends that to the worker farm. Right, it's two part process.
Deliberate Alligator: Yeah, exactly. So I want to talk about both of those. Yeah, definitely. There's a scalability issue. Maybe we could say that, I guess, looking back to our requirements here, maybe we have like 10 million schedules or something like that. Sounds kind of reasonable for 100 billion jobs a day. Maybe like 10 million schedules. But yeah, it's a lot of schedules and we have this goal of 100 milliseconds and yeah. How should we design our scheduler to keep up with that?
Hipster Samurai: Yeah, I think to do that, I would really like to touch on what that database is going to look like because there are so many jobs that if I just design the schedule without knowing what the data looks like, I could easily design the wrong thing. So I just want to create a table called scheduled jobs and define what this means. So first we might have a repeating job, right? And that might be a crontab of, I don't know, the design of the scheduled jobs. And I didn't define this in the first part, but I think we could nail this down now. So let's say we have one type of job, a repeating job, and another type of job called just a one off future job. So there's two kinds, right? If that makes sense. So far, I don't think there's any more types of jobs. And a repeating job might just be once every minute, five minutes, month, whatever. It's cyclical. And one off future job is just once in the future. Does that sound good to you? I feel like this would reasonably serve our customer base.
Deliberate Alligator: Yeah. No, I agree. And I don't even know if I would call it like, one off future job. I was picturing it as like, do this thing now. It's immediate. I guess we could make it as like a future thing, too, but either way, maybe to simplify, we just think of that as like, hey, do this thing right now. We think of it as like, execute job.
Hipster Samurai: Okay, execute job.
Deliberate Alligator: Because if it's in the future, shouldn't they just put a schedule?
Hipster Samurai: Well, I assumed repeating job is only one type of thing. It's every five minutes forever until you cancel the job. But the one off future job might just be instead of executing it now, you execute it five minutes from now.
Deliberate Alligator: Okay, yeah, I like that idea. Let's focus, I guess, on the repeating job and then the kind of immediate execution. I think if we just like, the one off future job is kind of like an enhancement of the repeating job. It's like just a one off version of that, but yeah.
Hipster Samurai: Cool. Okay. So knowing that, we've got a list of jobs that are repeating forever until canceled. And so once a job is in the scheduled jobs table, it's going to just repeat forever. And then I want to clarify that, too, because the thing is, if I were to go and check every single repeating job, like, say, 10 million of them, it's going to take a long time. So I want to potentially partition this by kinds of repeating jobs. So based on our scale and cost, to reduce cost, I want to provide customers with a couple of types of repeating jobs. So let's say 1 minute, five minutes, and daily. I know I could create all kinds of weird configurations of repeating jobs, but what do you think about this? To reduce cost? I think our customers for many jobs would be able to do this just fine. 1 minute, five minute, and daily.
Deliberate Alligator: Okay. Yeah. I think going by the minute to start is okay. I would ideally, like, at the second level, actually, let's not worry about cost for now. Let's just worry about how to kind of scale the system. So I think I want us to get back ultimately to our scheduler. Right. How does the scheduler kind of work? If we have a 1 minute frequency and all these schedules, what will the system do?
Hipster Samurai: Yeah, so if we have a whole bunch of jobs that must execute every single minute, then we must go through every single scheduled job every minute and just take them all and put them into our queue to send to the worker farm. And that's what we were doing. And if we have 10 million of them, we're going to have to just create a sort of farm or split that database up and then just grab all of them en masse with several machines instead of just trying to do them all at once. I could do some calculations to see how many machines, but I just want to leave that number open and just say, we need to scale the number of workers that are getting scheduled jobs.
Deliberate Alligator: Let's probe on that thought. I love that thought that you had. Let's just jump to, like, we have to partition the number of schedules somehow to some number of schedulers. Right. Because if I change the question to be at the second level, I know we'll end up in that position. Right. And that's ultimately what I want us to support, because to evaluate 100 million schedules in less than a second, we're going to have a problem. Right?
Hipster Samurai: Right.
Deliberate Alligator: Pretty much no matter how much we optimize our database or how much hardware we throw at the problem yeah. If we were to kind of jump to that world where we know we need to partition things to some degree, I guess, what would that look like, how would it get partitioned? What would it look like? What would it do?
Hipster Samurai: Yeah, I'm going to design a partition of scheduled jobs around, say, it's a good question because in this case I'm going to take a stab in the dark and partition around, say, user ID, because user ID could be hashed. And then let's say many users will have jobs that all need to be ran at the same time. And so there'll be some sort of continuity between where the jobs are located. And I want to partition these jobs in a database such that every shard contains a subset of jobs per user. And then I want to yeah, but then if we have 100 million jobs and it's not 100 million new jobs every day, it's just 100 million jobs, is that correct? It's not an influx of 100 million new jobs, right.
Deliberate Alligator: That's just like the average running of the system. Yeah.
Hipster Samurai: Awesome. Okay, we could partition that into, say, about a million rows per database, say, have 100 shards. And then I think a SQL query could reasonably query a million rows in less than a minute and send those to the message queue to a worker farm. Well, maybe. Thing is, to stream, let's say a job is about a kilobyte. Streaming a kilobyte is what, over a network? Maybe, I don't know, couple milliseconds or so. So if I have a million jobs times a couple of milliseconds, it might be about, it could be like a couple of hundred seconds to do all these little... To stream that much data could be about a minute, just to stream a million rows, at least. Or two. I'm just doing some really sloppy math in my head, but I'm thinking streaming a million rows from SQL might be an issue, but maybe not. The thing is, it's got to run every minute so that those million rows have to come by fast. So how could we do this? Well, you know what, maybe one of the bottlenecks is just the amount of data that we can... A million rows in a database is fine. I think 100 shards to hold 100 million jobs is fine. That's fine. That's like 100 little databases. That's cool. But the bottleneck could become how quickly can I get 1 million rows of data from each one of those shards into the worker queue? And so that's like an input output issue. And I know each database might have multiple processors, let's say even eight cores, divide a million rows by eight, that's about maybe 100 to 200,000 rows. I could totally stream 100,000 rows pretty quickly, I know that. So what I want to do is for every single worker that's pulling jobs from this database, it spawns a couple of threads that each pull data from the database or a couple of processes or whatever. And then each process is then sending data into the message queue. So it's like a combination of sharding and threading. Am I on a tangent or no?
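The mental arithmetic in this turn can be written down explicitly. The sketch below uses the figures mentioned aloud (a million rows per shard, roughly 1 KB per job, "a couple of milliseconds" per transfer, eight readers); the 1 Gbit/s link speed is an added assumption. Under these numbers, fetching rows one at a time comes out nearer 2,000 seconds than a couple of hundred, while a bulk stream of the same data is bounded by bandwidth at a few seconds, which supports the batching and parallel-reader idea.

```python
ROWS_PER_SHARD = 1_000_000        # 100 million jobs over 100 shards
PER_ROW_LATENCY_MS = 2            # "a couple of milliseconds" per row
ROW_SIZE_BYTES = 1_000            # roughly 1 KB per job
LINK_BYTES_PER_S = 125_000_000    # assumption: 1 Gbit/s network link
READERS = 8                       # one reader thread per core

# Worst case: one network round trip per row, strictly sequential.
one_row_at_a_time_s = ROWS_PER_SHARD * PER_ROW_LATENCY_MS / 1000

# Bulk streaming: limited by bandwidth rather than per-row latency.
bulk_stream_s = ROWS_PER_SHARD * ROW_SIZE_BYTES / LINK_BYTES_PER_S

# Splitting the shard across parallel readers.
rows_per_reader = ROWS_PER_SHARD // READERS

print(one_row_at_a_time_s)  # 2000.0 seconds
print(bulk_stream_s)        # 8.0 seconds
print(rows_per_reader)      # 125000 rows
```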
Deliberate Alligator: Yeah, I think this is the gist of one really complicated problem that I'm presenting. I like those ideas. I think we can totally get away with sharding. Yeah, I think your ideas are good there. Maybe we can move past this one and we'll circle back to this at the end, I think, of the interview and I'll give you some more kind of different ideas on how you can think about that problem in the future as well. Like some kind of out of the box crazy ideas. But I think what you presented makes total sense and will totally work as one solution here.
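Partitioning schedules by a hash of the user ID, as proposed above, might be sketched like this. The function name `shard_for_user` and the choice of SHA-256 are assumptions for illustration; the only requirements are that the hash is stable across processes and spreads users evenly over the 100 shards mentioned in the discussion.

```python
import hashlib

NUM_SHARDS = 100  # from the discussion: about 1 million rows per shard


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used to keep the mapping stable.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for_user("customer-42")
```

All schedules for one user land on the same shard, so each scheduler instance can own a fixed set of shards and only evaluate its own slice.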
Hipster Samurai: Okay.
Deliberate Alligator: One thing that I think will be fun to talk about next, actually to kind of switch gears a bit is to talk about this worker farm. And I wanted to talk about the directions of arrows actually here. One approach that in theory you could have is the scheduler kind of immediately pushes work to the worker farm. I kind of interpret this arrow as kind of like a pull model. I'm not sure if you pictured the queue kind of pushing into the worker farm or not, but I wanted to talk about the directions of these arrows, basically and the pros and cons. If we were to talk about these three different options here, what are the pros and cons of the three different designs here, perhaps?
Hipster Samurai: Yeah, so there's a couple of ways of communicating with the worker farm. There's a message queue. Pub sub might be a little weird because only one worker needs to know about a job, not all of them. And then there's a polling interface too. So I want to talk about message queue and polling. So if a worker farm repeatedly polls, let's say, the database going from here to here, that can cause a thundering herd issue if all the jobs are completed and suddenly overwhelm the database. But at the same time, polling is very easy to implement. And it's also useful if we're creating some kind of scheduler that tracks what jobs need to be run. Worker farms can just pull the top of whatever's in a scheduler and just execute that repeatedly. But then there's an issue because two workers might be polling for the same data and then getting the same jobs to be run. And it's hard to coordinate 1000 workers, for example, all polling the same database. So I would in this case prefer a message queue just because it physically makes more sense because you're just putting jobs into a queue to be run. And the workers don't care how they're being run, they just need to execute jobs as fast as possible. So I'd rather make this comms interface a message queue. And all the worker farm does is just pick up jobs from this queue. And so the workers are directly connected to the queue via socket. There'd be no more number three. Well, I guess if you think of a socket, this is a listener socket and this is a passive socket and this is an active listening to this one. It's just going this way. I guess if I'm describing that right, the queue just sends to each worker what needs to be done and then I can just eliminate a lot of complexity. But yeah, if I implement polling, suddenly I'm intermingling the logic of how things are scheduled with polling, and polling can be really easy to implement. But then there's a lot of caveats, like how do I make sure that two jobs aren't executed at the same time? Or how do I lock rows to make sure they don't update the same job? So I think a message queue is the best. RabbitMQ would work.
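The property being argued for here, that a queue hands each job to exactly one worker without any row locking, can be demonstrated with an in-process stand-in. This sketch uses Python's thread-safe `queue.Queue` rather than a real broker such as RabbitMQ, and `run_workers` is a hypothetical helper; it only illustrates the competing-consumers pattern.

```python
import queue
import threading


def run_workers(job_ids, num_workers=4):
    """Drain a shared queue with several workers; return (worker, job) pairs."""
    jobs = queue.Queue()
    for job_id in job_ids:
        jobs.put(job_id)

    executed = []
    executed_lock = threading.Lock()  # protects the shared result list

    def worker(name):
        while True:
            try:
                job_id = jobs.get_nowait()  # each item is handed out once
            except queue.Empty:
                return
            with executed_lock:
                executed.append((name, job_id))

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed


result = run_workers(range(1000))
```

Every job ID appears exactly once in `result`, regardless of which worker picked it up, which is the guarantee a polling design would have to rebuild with locks.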
Deliberate Alligator: Yeah. Maybe we could say this is like RabbitMQ to start. For this third option I want to talk about in a little bit more detail. So let's say that maybe the queue has 1000 records in it or something like that, right? Just kind of inserted pretty quickly. And the worker farm, how does it know which worker to assign things to? For example, what does that interaction kind of look like? It kind of sends something to this worker farm, but presumably there's a ton of different boxes kind of in here. What does that look like? What does it do?
Hipster Samurai: Yeah, so I'm imagining all these workers connected to RabbitMQ at the same time. And I know this is sort of a bottleneck, but I know that RabbitMQ can store small amounts of stuff. I'm thinking that.
Deliberate Alligator: But so we're going to have maybe, like, how many sockets are we talking about? We're going to have maybe 10 million concurrent jobs or something. Right. So 10 million sockets concurrently subscribed to the queue, to RabbitMQ.
Hipster Samurai: Queue? Yeah. So Unix can have millions of WebSocket connections on a single server. It stores all the connections in a single table and you can see it under /proc/<pid>/net. I think there's a separate file for it and it's global to all processes, but then I think that's fine. But the issue is what happens when RabbitMQ goes down and all those millions of connections have to come back up? That becomes an issue.
Deliberate Alligator: Yeah.
Hipster Samurai: So knowing that, I'm thinking that maybe RabbitMQ might not be the best solution, because RabbitMQ, it can be distributed, but that's more for high availability. And maybe a better solution would be something like Kafka. Even though Kafka provides features we don't necessarily need, like, for example, publish/subscribe and message ordering. We don't care what order the messages are in, we just want them done as fast as possible. But Kafka does provide one thing, which is horizontal scalability and partitioning. So we can put these messages into separate partitions and put these worker farms into consumer groups, and then they can sort of just kind of offload. RabbitMQ, while it theoretically works and has the right feature set, doesn't scale as well, because literally just a linked list is not a partitionable thing. So yeah, I want to do that and just have these in worker groups, and then each of these little workers is in a group and they're each subscribed to a partition in the job topic. We're going to create a job topic in Kafka and they're all pulling from this topic. And that way, let's say some of these go down, that's fine if Kafka can spin up more server instances. I believe that's how that works. Like, there's producers and consumers. So on this side we have producers. And if we have that, you know, even though we could have millions of connections in one box, let's just not do that, so we don't have the thundering herd problem if these all crash.
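The partition and consumer-group idea described above can be illustrated with a small, self-contained model. This is a simplified sketch, not Kafka's actual partitioner or assignment strategy: the function names `partition_for` and `assign_partitions`, the CRC32 hash, and the round-robin assignment are all assumptions chosen for clarity.

```python
import zlib
from typing import Dict, List


def partition_for(job_id: str, num_partitions: int) -> int:
    """Map a job key to a partition with a stable hash (CRC32 here)."""
    return zlib.crc32(job_id.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, workers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over the workers of one consumer group, round-robin.

    Each partition is owned by exactly one worker in the group, so a job
    published to a partition is picked up by a single worker.
    """
    if not workers:
        raise ValueError("a consumer group needs at least one worker")
    assignment: Dict[str, List[int]] = {w: [] for w in workers}
    for partition in range(num_partitions):
        owner = workers[partition % len(workers)]
        assignment[owner].append(partition)
    return assignment


# Six partitions shared by three workers.
before = assign_partitions(6, ["w1", "w2", "w3"])
# If w3 goes down, rerunning the assignment hands its partitions to the survivors.
after = assign_partitions(6, ["w1", "w2"])
```

In this model, losing a worker only triggers a reassignment of its partitions rather than a mass reconnect to one broker, which is the property the discussion is after.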
Deliberate Alligator: I want to say what you're describing is closer to number two, by the way. And let me tell you why I think it's closer to number two. It's because ultimately, say the specific box is trying to get work, right, this box needs to know that it can take the work on. So you're kind of saying that this box knows that it's ready to take work on. So that when I partition it based off of topic or whatever in Kafka, then it will know to kind of say, hey, give me more work, or like, I'm ready to take more work. Right? But in number three, I think what's not clear to me is, like, say that something comes into a topic, right, maybe there's 1000 messages on a topic. If we just blindly push it to one of these virtualization instances, like this EC2 instance or whatever it is, it may not even be able to execute the job, right? So then what happens? Does it reenter the queue, or does it kind of fizzle out, or does it get queued in the EC2 instance waiting for the rest of the jobs to be presented? There's a bunch of issues, essentially, if you push directly onto the worker. But what I'm interpreting you as saying is closer to two, I think, if I hear you right. Does that make sense? Those are the differences, I think, in my head between two and three.
Hipster Samurai: Yeah. So two is Kafka, three is polling more or less. Let me try Kafka.
Deliberate Alligator: Well, I think two is kind of polling like the workers are polling a topic. Right? They're polling Kafka's topic. And the third one would be like if you had eventing from the topic pushing somehow the work onto a worker.
Hipster Samurai: Yeah. Eventing from the topic to the worker farm. Yeah, there's issues with both of those. Yeah. What happens if a job gets stuck, so the consumer gets stuck, and then the consumer... I'm not sure, because I think we would need something to monitor the worker farm too. We would need something to track this database repeatedly and just query for the latest jobs that are stale and then try them again. We probably need a health monitor for Kafka too.
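The "query for stale jobs and try them again" monitor mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: the session never defined how staleness is detected, so the `last_heartbeat` timestamp, the `attempts` counter, the retry limit, and the names `RunningJob`, `find_stale_jobs`, and `requeue_stale` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningJob:
    job_id: int
    last_heartbeat: float  # assumed: seconds timestamp of the worker's last progress report
    attempts: int = 1


def find_stale_jobs(jobs: List[RunningJob], now: float, timeout_s: float) -> List[RunningJob]:
    """Return jobs whose worker has not reported within the timeout."""
    return [job for job in jobs if now - job.last_heartbeat > timeout_s]


def requeue_stale(
    jobs: List[RunningJob],
    queue: List[int],
    now: float,
    timeout_s: float,
    max_attempts: int,
) -> List[int]:
    """Put stale jobs back on the queue; return ids that exhausted their retries."""
    failed: List[int] = []
    for job in find_stale_jobs(jobs, now, timeout_s):
        if job.attempts >= max_attempts:
            failed.append(job.job_id)  # give up and surface it instead of retrying forever
        else:
            job.attempts += 1
            job.last_heartbeat = now
            queue.append(job.job_id)
    return failed
```

A real monitor would run this periodically against the job database rather than an in-memory list.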
Deliberate Alligator: Yeah, that makes sense. So we've got about ten minutes left. I want to spend maybe about two or three minutes here just talking about the worker farm as well. And then we can maybe open it up to kind of a more free-form conversation. So for the worker farm, how are you thinking an individual box kind of looks? There's a bunch of different problems. We didn't really, I guess, quite define how a job gets uploaded and run in the system. Right? Yeah, like Deliberate Alligator uploads a job and he's like, print hello world in Python. And then the worker farm somehow picks this thing up and then it executes it. How does that work? I guess, what do I upload? What does the worker farm execute? How does it execute in any language, and so on and so forth?
Hipster Samurai: I thought that it was going to be a little bit simpler, but there's some API layer here, so I'm just going to define that from the API layer. So first I want customers to create a job, and then I want them to say run a job if they want to, or schedule a job. And that will have some other parameters, like whether it's repeating, for example. I hope I'm just answering the question the best way possible. But for create job, you would upload your code, and that code would be saved somewhere in this database, and it would create a job definition. And then for running a job, you get like a job ID and then you run the job. So that takes a job ID. Then you have schedule job. You take a job ID and give it some parameters, like repeating. And then you can say get all jobs. This is what's between the load balancer and the API gateway. And then knowing that, if you create a job... I hope I'm answering this the way you are looking for. But create job goes to the API gateway and just saves into the database. And I can define that table, but that's just a table of, you know, the code and the job definition, I guess.
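The four operations listed above (create job, run job, schedule job, get all jobs) could be sketched as an in-memory service like the one below. This is an illustration only: the class and field names (`JobApi`, `JobDefinition`, `run_at`, `run_queue`) are hypothetical, the dictionary stands in for the database, and the list stands in for the queue to the worker farm.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class JobDefinition:
    job_id: int
    code: str                       # whatever the customer uploaded
    run_at: Optional[float] = None  # assumed: scheduled start as a timestamp
    repeating: bool = False


class JobApi:
    """In-memory stand-in for the API layer behind the load balancer."""

    def __init__(self) -> None:
        self._jobs: Dict[int, JobDefinition] = {}
        self._next_id = itertools.count(1)
        self.run_queue: List[int] = []  # stand-in for the queue to the worker farm

    def create_job(self, code: str) -> int:
        """Save the upload as a job definition and return its job ID."""
        job_id = next(self._next_id)
        self._jobs[job_id] = JobDefinition(job_id, code)
        return job_id

    def run_job(self, job_id: int) -> None:
        """Request an immediate run of an existing job."""
        self._require(job_id)
        self.run_queue.append(job_id)

    def schedule_job(self, job_id: int, run_at: float, repeating: bool = False) -> None:
        """Attach schedule parameters to an existing job."""
        job = self._require(job_id)
        job.run_at = run_at
        job.repeating = repeating

    def get_all_jobs(self) -> List[JobDefinition]:
        return list(self._jobs.values())

    def _require(self, job_id: int) -> JobDefinition:
        if job_id not in self._jobs:
            raise KeyError(f"unknown job {job_id}")
        return self._jobs[job_id]
```

Keeping create separate from run and schedule means a job definition is stored once and referenced by ID afterwards, which matches the job ID flow described in the answer.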
Deliberate Alligator: What are they really ultimately uploading? Right? Like is it a binary? Is it like a list of files that represent the program? Are we doing the compilation? I think there's kind of some fundamental questions there.
Hipster Samurai: Yes. So I really want people to upload. I think the complexity of having people upload code directly could get really weird because then we have to provide code environments for all different things people want to run.
Deliberate Alligator: Right. We have to compile everything for every language, like ever with a bunch of flags and everything. Right. That seems crazy.
Hipster Samurai: Yeah. Then we'd have to create a compilation design and I don't want to do that. I'd rather craft a Docker Hub of some kind, or even just tell people, to run jobs, just upload a Docker image. And then we could say, based on the 1GB memory limit, the image must be, say, less than a gig. So you give us a Docker image, and then the worker farm would pull that Docker image and run it. But then there's an issue of how long it takes to pull a Docker image in the worker farm. So if a job has to run every minute and you've got a ton of jobs to run and it takes 10 seconds to pull a Docker image, that's an issue too.
Deliberate Alligator: Well, is it an issue? Our requirement, I think, is just the start of execution of the job, right? It may not be the best customer experience to do a Docker pull. Yeah. I mean, tell me about that process and the ramifications of that choice.
Hipster Samurai: Yeah. So first we've got to have some kind of hub to store all these Docker images, or we have to connect to Docker Hub. And if we have to connect to Docker Hub, we have to pay egress costs to AWS to pull a ton of images, and that could expand costs immensely. So I would prefer to create a Docker Hub ourselves, even though that might be complicated. So we might be storing Docker images in, say, S3, but at least, assuming this is all hosted in AWS, we're keeping everything within AWS. And creating a Docker Hub is also itself a thing too, but ultimately we're going to have these binary images. Maybe we can leverage some of the services in AWS, such as, what is that thing called, Elastic Container Service. Elastic Container Registry. Sorry. ECR. That would help a lot, right?
Deliberate Alligator: Yeah. So when they create a job, they upload something to yeah.
Hipster Samurai: Yeah. So you upload this to this. This whole service could then maybe even be sold as a managed service for customers in AWS and then expand later to other services like Google Cloud, but just make it a managed AWS service. So users are posting their jobs to ECR, and then our worker farm pulls the image and runs it and then deletes it after. There's a lot of pulling images. If we have 100 billion jobs a day and we're pulling 100 billion images, potentially even more often, I would love to have some kind of caching or something in the worker farm so they don't have to keep pulling jobs as much. Maybe some jobs could be tied directly to some worker farms, some workers, so that they can just do them repeatedly. Like, maybe some workers could be doing just scheduled jobs, and some workers could be doing just one-off jobs, to reduce the amount of time we're pulling stuff from S3.
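The worker-side caching idea above can be sketched as a small least-recently-used cache in front of the image pull. This is a hedged illustration, not the design from the session: the `ImageCache` class, the capacity limit, and the injected `pull` callable (a stand-in for a real registry pull) are assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

Image = TypeVar("Image")


class ImageCache(Generic[Image]):
    """Keep the most recently used images on the worker, up to `capacity`."""

    def __init__(self, pull: Callable[[str], Image], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._pull = pull          # hypothetical stand-in for a registry pull
        self._capacity = capacity
        self._images: "OrderedDict[str, Image]" = OrderedDict()

    def get(self, image_ref: str) -> Image:
        if image_ref in self._images:
            # Cache hit: mark as most recently used and skip the pull.
            self._images.move_to_end(image_ref)
            return self._images[image_ref]
        image = self._pull(image_ref)  # cache miss: pay for the pull once
        self._images[image_ref] = image
        if len(self._images) > self._capacity:
            self._images.popitem(last=False)  # evict the least recently used image
        return image
```

A cache like this pays off most when repeating jobs are routed to the same workers, which is the affinity idea suggested in the answer.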
Deliberate Alligator: Yeah. Hey, so we've just got about five minutes or so left. I want to make sure that I spill over a little bit because I was a little bit late and I apologize for that. No, thank you. Yeah, let's end here. I think you've done a wonderful job on this interview. I wanted to ask you how you thought it went.
Hipster Samurai: I hope it went well. The first time I did this a year ago, I totally bombed it. But I've been really studying and I've been doing a lot of work. So I just hope that I know I did better.
Deliberate Alligator: I've been asking this question, I think, for close to three years now. I think you're probably a top three person for me interviewing on this question. Not to toot your own horn too much, but I thought you did very well. So yeah, I think you're in a very good spot for your system design in general. I do have some things that you can improve on, but I think really what we would be discussing is the difference between a senior engineer and a staff engineer. And the biggest gaps that I notice in you are kind of the next level architectural stuff is how I would describe that. Can you still hear me? The audio kind of went weird.
Hipster Samurai: Oh, no. Yeah.
Deliberate Alligator: So I think the thing that I love about this problem is you can spend an enormous amount of time literally just talking about any box. Right. And I think we can kind of dissect each box a little bit. But the main thing about the scheduler, for example, that I kind of pulled us away from is the fastest way to do pretty much any computation is to put everything in memory. Right. And I kind of felt that you haven't really ever implemented a true distributed system sharding thing on your own that's not using some other kind of technology. That was kind of the impression that I got out of this interview. I think one of the big differences between that kind of senior and staff level systems design will be, when you're talking about a scheduler kind of component like this, you're going to be able to express, like, oh, hey, maybe there's going to be like a leader election algorithm where we split the partitioning based off of this or something like that. I felt like our conversation was pretty high level. Right. It was like, shard based off of customer ID. But who manages the shards? How does the sharding happen? What algorithm does it use? What happens when a shard fails? All these different kind of availability concerns. And I think that's where just more depth will take your interview to the next level. Right. Like, bring up like a ZooKeeper or something like that in a situation like that. And then you can kind of rattle off about, hey, there's a master-slave relationship between these things and you can have partitions for your partitions, and so.
Hipster Samurai: On and so forth.
Deliberate Alligator: Right?
Hipster Samurai: Yeah.
Deliberate Alligator: The only kind of real fundamental thing that I think you missed in your requirements gathering is probably the size of things as well. So the real challenge in this system is just the scale. And because we have so many different objects, one aspect that we never really got into is how do you make the database scale? Like, if you have 100 billion jobs a day, eventually you're just going to run out of memory. If you were to do a back-of-the-envelope napkin math thing of like, I don't know what a job is, maybe one or two KB. Right. The amount of memory you're going to use is crazy. And what does it look like to kind of roll over the partitions, and is it even realistic to fit it in memory? Right. So the blog post that I linked at the start of the interview has a section dedicated to kind of doing that basic math. I think that's another piece that you could kind of add to your repertoire. But yeah, you called out a ton of pieces that a lot of people just don't even think about, which I really love. It shows that you're really customer obsessed in thinking about things from your own perspective. Right. And that gives me incredibly high confidence that you'll be successful on pretty much any team that you would like. A lot of people don't even know the history of Docker Hub and how Docker works or pulls, or they'll even struggle to come up with their own virtualization concepts. You have a lot of breadth, which is fantastic. The only caveat I'd tell you is, in interviews, maybe you noticed me doing this, pull yourself back, right? So don't feel like you have to design every bell and whistle, right? Bring them up and say, hey, should I focus on this thing? It's really easy to lose all your time without focusing on the core components. So for you, I think I always had to pull you back a little bit, right? And like, hey, I want you to focus on this thing rather than that really cool idea you had. 
So try to work with your interviewer essentially to focus on the meat of the problem so you make sure you get it answered, because I'm very confident you can get it answered. But you may distract yourself basically with all the great ideas you have on the side. And that's just a time management thing. But anyways, yeah, I thought you did a fantastic job. Yeah. Thanks for spending time with me. Did you have any questions or does that all kind of make sense?
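The napkin math mentioned above can be worked through in a few lines. The inputs come from the conversation (100 billion jobs a day, roughly one to two KB per job); treating 1 KB as 1,000 bytes and 1 TB as 10^12 bytes is an assumption made here to keep the numbers round, and the function names are invented for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def jobs_per_second(jobs_per_day: int) -> float:
    """Average arrival rate implied by a daily job count."""
    return jobs_per_day / SECONDS_PER_DAY


def daily_bytes(jobs_per_day: int, bytes_per_job: int) -> int:
    """Raw storage written per day, before replication or indexes."""
    return jobs_per_day * bytes_per_job


def to_terabytes(num_bytes: int) -> float:
    """Convert bytes to decimal terabytes (assumed 1 TB = 10**12 bytes)."""
    return num_bytes / 1e12


JOBS_PER_DAY = 100_000_000_000  # 100 billion, from the discussion

rate = jobs_per_second(JOBS_PER_DAY)                      # about 1.16 million jobs per second
low_tb = to_terabytes(daily_bytes(JOBS_PER_DAY, 1_000))   # 100 TB per day at 1 KB per job
high_tb = to_terabytes(daily_bytes(JOBS_PER_DAY, 2_000))  # 200 TB per day at 2 KB per job
```

At 100 to 200 TB of new job records per day, holding everything in the memory of a single machine is clearly out, which is why the feedback points toward partitioning and rolling data over.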
Hipster Samurai: That does make sense. I feel like I just really need to go into depth on distributed systems management, because I understand sharding and I understand NoSQL and how to query them, but I really didn't go into how to manage it. And I wish I had thought of the scheduler memory, because you're totally right. You just create something in memory in a scheduler that's really fast to schedule all these jobs in memory, and then there's ways of making that durable, too. But I think what I could do is just today create an outline of all the concepts I need to manage, like, a NoSQL database and a Postgres database. And then ZooKeeper and sharding and partitioning and what happens when shards go down, and leader election. I've learned all these concepts before, but I just have not put those particular concepts together yet, I think.
Deliberate Alligator: Yes, last thought I'll leave you with, and I apologize, I really got to go, I'm back to back here, is we didn't talk about scaling the worker farm and keeping it up to date. So because you know all the schedules and things in advance, you know exactly how many workers to provision at a point in time. It's not immediate to spin up, like, a virtualization instance or hardware. So there's other pieces of this design that you can kind of extend. I have no doubt that you would think of it, but the main point, I think, is ask your interviewer, like, hey, what should I spend my time on? Should I work on this thing or that thing? Right. It's a very bidirectional conversation. Yeah. So don't feel bad about any of those things. Like I mentioned, it's just things to kind of elevate your responses. Right. I think you're in a fantastic spot. Yeah. So thanks for spending your time today. It was a pleasure.
Hipster Samurai: Thank you. Would it be, as a senior engineer, hire or no hire? Just curious.
Deliberate Alligator: Oh, absolute hire. Yeah, absolute hire for me. Without a question. Yeah. I think you're borderlining the staff range, so don't undersell yourself. I would love to hire you for my own team.
Hipster Samurai: Yeah.
Deliberate Alligator: I'll give you more of a write up.
Hipster Samurai: Wow. Cool. Okay. Yes.
Deliberate Alligator: Cool. I really have to run. I apologize. But yeah, I'll send you some more notes and things that I hope are helpful. Okay. And have a great week. Take care.
Hipster Samurai: Thank you. Bye.