Interview Transcript
Indelible Hawk: Cool. Just before we begin, are you familiar with the platform? Have you had any mocks in the past?
Secret Zebra: Yes. So I've done a few mock interviews. One of them was a systems design interview, and my background is mostly in tiny startups and founding. So these are some of my first interviews ever of this type. I've given interviews, but they were generally more targeted at whatever specific product we were making.
Indelible Hawk: And do you know what level you're targeting?
Secret Zebra: I'm targeting L5, based on what I've asked my friends and colleagues.
Indelible Hawk: Excellent. Let's get started. So you said you've done one system design mock, so I won't talk too much about the platform, just that you have a blue button on the bottom left to enable the whiteboard if you need to draw anything. Otherwise we can discuss in the chat later, and I'll leave time at the end for feedback.
Secret Zebra: Awesome.
Indelible Hawk: For the interview problem, do you know what Coderpad is?
Secret Zebra: I believe so. It's kind of like an online scratch pad where you can type code and then it runs it in browser.
Indelible Hawk: Precisely. Well, it doesn't exactly run in the browser, but yes, you can execute it. It's an online collaborative code editor. By collaborative we mean that text changes should be synced, and by code editor we mean that code can be executed. The tool we are using right now is basically Coderpad. This company has contracts with, for example, Meta and with interviewing.io; they provide this shared editor. Now we want to be able to use it for coding interviews, so this is the goal we want to have in mind here. You can assume in a typical session two to three participants, very rarely more. Any requirements like audio, video, or whiteboard notes are out of scope. The only things we're focusing on are the collaborative aspect and the execution aspect, with a final requirement that the final content should be retained for ten years.
Secret Zebra: Awesome. And you said with the code it's just the collaboration plus execution, right.
Indelible Hawk: Plus retention, yes. Those are the only three requirements. Yes.
Secret Zebra: Awesome. And with the execution, I assume, like this interview software that we're using, all of this code is simple enough to run on a single small machine, let's say. But I guess that's a question for me to figure out. We're writing less-than-1000-line programs here; I'll get to that later. All right, non-functional... okay, actually, let's do functional first here. You already wrote out a lot of them for me, so I'm just going to copy and paste some of these. All right, so we want text changes synced, code can be executed, content retained, and that's it. Cool. We're going to have somebody sending their typing to our system, and they're going to be receiving some up-to-date version of the document. Is it in scope that we may want them to be able to see the cursors and the names of other people?
Indelible Hawk: No.
Secret Zebra: Just the editor. Awesome. All right: they type, they receive an up-to-date version. I think those are the big functional requirements. Let's talk non-functional. So obviously we need to be able to execute the code; let's just say a container with enough resources to execute quickly. And then we're going to want the receiving of the up-to-date document to happen in near real time, because that will really affect the user experience. Being near real time may be more important than being strongly consistent, and what we definitely don't want is for it to slow down too much, because there's going to be a trade-off there. But as long as it's eventually consistent within a reasonable amount of time, I think that's what's going to be important.
Indelible Hawk: That's fine.
Secret Zebra: Okay. And then let's see, anything else here. Is there a specified way to handle conflicts? Like if we end up typing in the same space, or is that...
Indelible Hawk: Is no specified way to handle them? I would love to explore a few options.
Secret Zebra: Okay, awesome. Let's just say handle conflicts gracefully. So before I design the cloud architecture here, I do want to think about these conflicts, because that could end up being an important point. So let's see: if we have two cursor positions, one thing we could just say is cursors can never overlap, but then if somebody's backspacing through text, that's not going to be great. I think if people are typing and their cursors overlap, and this eventually-consistent thing ends up missing a few characters at the most, I don't think that's going to be a huge user experience issue, not a huge UI issue if minor conflicts show. So I think what we can say is, honestly, we can just have something like last message in wins. I don't think that's going to change the UI that much, and it's definitely worth it for going fast. All right, I think this handles largely what would make this fun. I think I'm ready to dive into diagramming this up. What do you think?
Indelible Hawk: I'm happy with that.
Secret Zebra: Awesome. All right, heading to the whiteboard now. I'd probably use the desktop app for this, but there's no desktop icon. All right, so to make this go super fast, first of all, let's keep track of everything here. So are we going to be sending keystrokes? That's a good question, actually. Do we want to send keystrokes or do we want to send a diff? Because we're going to have selections as well. Yeah.
Indelible Hawk: Sorry, what, what are the trade offs between keystrokes and diffs?
Secret Zebra: Yeah, so keystrokes themselves could, depending on how we implement it, be faster. But what we're talking about is, say I delete a large section: I would need to keep track of what I have highlighted, so it wouldn't really be just keystrokes. A diff would be faster, because essentially the only way to make keystrokes work very well is to turn them into a diff. So a streaming diff would prevent us from having to keep track of mouse selection and keystrokes all at once. Honestly, I think they would both work fast enough, like on the same order, but I think diffs will end up being way easier for us to actually implement, debug, and maintain.
Indelible Hawk: Happy with that. At some point, I'll also want to look at the format of these diffs.
Secret Zebra: Yeah. And then what we're getting back is the latest doc. This could also be a diff; what I worry about there, though, is we might have to have keyframes. So there's going to be a trade-off here between sending the whole doc and something like diffs plus keyframes, because if we drop some frame of data here, we're not going to want that diff stream to run off the rails. So we'd have to periodically send keyframes. All right, so when we first connect, a load balancer is going to connect us to a session of some sort, and that can be a microservice. But if we wanted to run really fast, we could have these be sticky instances. Also, if we're looking at speed, let's run this all over websockets. And if we're looking at multiple instances here, we could do some kind of sticky session instance, almost like a game matchmaking service, where, let's say this is user one and this is user two, they are both connecting to the same instance here.
Indelible Hawk: When you say sticky, do you mean like a websocket, or what do you mean?
Secret Zebra: Oh, by sticky, I mean that we're not going to be switching between multiple instances; multiple instances won't be handling the workload in parallel. If our instance went down, for example, we would be losing some data, in exchange for really fast speed and responsiveness. Right.
Indelible Hawk: This connection between the server and the client, how are the clients and the server connected? Is it a websocket, or is it more like HTTP polling?
Secret Zebra: Oh, it's websockets to the load balancer, and then the load balancer routes that to an individual instance. Okay. Yeah, there's one way to design this where those instances just pipe those websockets into a message queue and we have microservices processing that queue and sending results back, but that will add some latency. Whereas I think with instances here, let's just say with a Redis database on each one, it could run lightning fast. I'm going to draw this multi-instance here; this is one of n instances, and we are connecting both of these users to it. Everyone on the same Coderpad would connect to the same instance, and their session would be sticky on that instance. So that's what I mean: they're not switching between these multiple instances.
Indelible Hawk: Right. What I'm asking is, how do we guarantee that everyone on the same Coderpad connects to the same instance?
Secret Zebra: Yeah, great question. So I think what we would do is keep a registry of instances, which does not need to be lightning fast. So when somebody is loading the pad, let's just say the actual UI for browsing all the pads is out of scope and we just have a URL reference. Would it be okay to start there, with a URL reference to a particular pad?
Indelible Hawk: That's fine, that's fine. You can imagine, let's take the example that we have right now, just as a demonstration for a second.
Secret Zebra: Exactly. So then in this RDB here, we would have some schema that is keyed by the session id, or let's just call it the pad id, actually. And that is going to point to our session instance, which is one of these. And so when I connect, what I will first go through is actually a matchmaker. The whiteboard glitches sometimes. Okay, a matchmaker, which is just going to query this DB. These matchmakers could scale horizontally as well; I'm just going to use one of them. So it'll query that and then direct the websocket traffic to and from the correct multi-instance server.
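As a rough illustration of the matchmaker flow just described, here is a minimal sketch. All names and the round-robin assignment are assumptions for illustration; the registry could equally be the relational DB mentioned above.

```python
# Minimal sketch of the matchmaker registry: a pad_id -> instance mapping,
# where the first participant to connect pins the pad to an instance and
# every later participant is routed to that same instance (the "sticky"
# behaviour discussed above).
import itertools

class Matchmaker:
    def __init__(self, instances):
        self._instances = itertools.cycle(instances)  # naive round-robin assignment
        self._registry = {}  # pad_id -> instance address (the RDB in the diagram)

    def route(self, pad_id):
        # First participant creates the entry; everyone else reuses it.
        if pad_id not in self._registry:
            self._registry[pad_id] = next(self._instances)
        return self._registry[pad_id]

mm = Matchmaker(["inst-a", "inst-b"])
first = mm.route("pad-123")
second = mm.route("pad-123")   # same instance as the first participant
other = mm.route("pad-456")    # a different pad may land on another instance
```

The matchmakers themselves stay stateless, so they scale horizontally exactly as described; only the registry is shared.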
Indelible Hawk: There we go.
Secret Zebra: Yes. And so then, let's see, I'm just running through some of our requirements again; don't want to miss anything. All right, so we're going to sync the text changes, so we're going to have diffs that go in. I deleted these... still there. Stuff popped back up. Okay. The code can be executed; all right, let's get into that. Sound good?
Indelible Hawk: Sounds good. But I also want to discuss the diff format a bit.
Secret Zebra: Okay, yeah, sure. Let's look at the diff format first. So I'll be honest, I've never designed a diff system before, but I think what I would first look at is what some open source tools that already exist use. So I assume that something like git diff does something really...
Indelible Hawk: Let me reiterate the question. I'm not looking for that kind of diff format. Just fundamentally, if something changes between two files, how can we encode that change? Oh, I see, a Unix diff. Yeah.
Secret Zebra: Let me just blow this up. So when someone is typing... where do we actually do it, client side? Yeah, I think it's fine to trust the client in this case, because they're the ones typing. When someone is typing, the client runs a diff on the document, in this case the code that they're writing, and then sends a stream of diffs, each one a diff from their current version, over websockets. And this does not need keyframes, the reason being that it is going to be updated from the server-side authority. So these diffs will always be against recently, if not 100%, up-to-date code.
Indelible Hawk: Understood. And how often will we send these diffs?
Secret Zebra: That's a good question. We're talking pretty small data here, less than, let's just say to be really safe, less than like 10 KB. Trying to think if there is anything else. Oh, by the way, you said voice is out of scope; we're just doing the pad. I think we're sending less than 10 KB of data even if they copy and paste the whole thing. So honestly, I think we could try to send a diff every keystroke, because the fastest we would be sending is as fast as someone can type. If we assume like 200 words per minute, way overestimating it, that's at most 100 characters per second, so ten milliseconds per character. And those diffs would be tiny, probably less than ten bytes each. So I think we can do that; while someone's typing, that's only like a kilobyte a second. And what's great about doing that is right when I hit a character, someone else is going to see it, and again, I think that snappy feedback is what's really going to have a wow factor here for end users. We could certainly do it slower, though. The trade-off there would be that we'd use a little less bandwidth, which would be nice for our costs and potentially the end user's Internet costs, but at the cost of that responsiveness. 100 milliseconds would probably be roughly... well, let's even consider the refresh rate of the monitor, right? So we could cap this at something like 30 milliseconds, because nobody's going to be able to see anything faster than that anyway. So yeah, send right when a character is typed, but cap it at 30 milliseconds. Someone typing may produce multiple characters within that 30 milliseconds; we'll queue those and then send them all out as one batch. Does that sound good?
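The send-on-keystroke-but-cap-at-30-milliseconds idea above can be sketched roughly like this. It's a toy model with injected timestamps so the logic is testable; real code would hang off websocket and timer events.

```python
# Sketch of client-side diff batching: queue diffs as keys are pressed,
# and flush one combined batch at most every 30 ms.
FLUSH_INTERVAL_MS = 30

class DiffBatcher:
    def __init__(self):
        self._pending = []
        # Start "owing" a flush so the very first keystroke goes out at once.
        self._last_flush_ms = -FLUSH_INTERVAL_MS

    def on_keystroke(self, diff, now_ms):
        """Queue a diff; return a batch to send if the 30 ms window elapsed."""
        self._pending.append(diff)
        if now_ms - self._last_flush_ms >= FLUSH_INTERVAL_MS:
            batch, self._pending = self._pending, []
            self._last_flush_ms = now_ms
            return batch   # send this over the websocket
        return None        # still inside the window; keep queueing

b = DiffBatcher()
assert b.on_keystroke("+w", 0) == ["+w"]               # first key flushes immediately
assert b.on_keystroke("+o", 10) is None                 # within 30 ms: queued
assert b.on_keystroke("+r", 20) is None
assert b.on_keystroke("+d", 35) == ["+o", "+r", "+d"]   # window elapsed: batch out
```

This gives the snappy single-keystroke feedback in the common case while bounding the update rate under fast typing or paste.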
Indelible Hawk: Yeah, I'm happy with that. How does the multi-instance, or a given instance, process these updates? So let's say they get maybe ten updates per second or something, right, around 100 bytes per second. How do we deal with that server side? What do we do?
Secret Zebra: Yeah, so what we're going to do here... these Redis DBs, by the way, are actually on these little session instances, so they're happening super fast. And honestly, the Redis may actually be the archive that we're keeping of this. Is it an archive of the history? Does it include the whole interview?
Indelible Hawk: It's like a snapshot of the final content.
Secret Zebra: The final, okay, cool. Then actually, I don't even think we really need Redis. If this is just going to be an in-memory database, and we really have one document, we can just store it in memory. So each of these instances will have diffs coming at them, and the stream is going to be a mix: diff from user one, diff from user two, and so on. And the whiteboard is freaking out. There we go. On the inside of this server we'll have a queue of these diffs, where we take diffs one by one, in order, from the queue and merge them into the latest document, which will be in memory. And then, let's say every 30 milliseconds, we already established that as a pretty good interval, we send, or broadcast, the latest diff from the last 30 milliseconds to all clients. Does that make sense?
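The per-instance loop just described, drain the queue in arrival order, merge into the in-memory doc, broadcast each tick, might look something like this sketch. The (position, delete_length, insert_text) diff format is an assumption for illustration only.

```python
# Sketch of a session instance: diffs arrive off the websockets into a
# queue; each ~30 ms tick drains the queue in arrival order, applies each
# diff to the single in-memory document, and returns what to broadcast.
from collections import deque

class SessionInstance:
    def __init__(self):
        self.doc = ""
        self.queue = deque()   # diffs in the order they arrived

    def apply(self, diff):
        pos, del_len, text = diff
        self.doc = self.doc[:pos] + text + self.doc[pos + del_len:]

    def tick(self):
        """One 30 ms tick: merge everything queued, return diffs to broadcast."""
        applied = []
        while self.queue:
            diff = self.queue.popleft()
            self.apply(diff)
            applied.append(diff)
        return applied   # broadcast these (or the full doc) to all clients

s = SessionInstance()
s.queue.extend([(0, 0, "wo"), (2, 0, "rd")])   # two users' diffs interleaved
broadcast = s.tick()                            # doc is now "word"
```

Because one instance owns the document, the merge needs no cross-machine coordination, which is the speed argument made above.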
Indelible Hawk: Partially. You are taking these in order from the queue and merging them. I assume by "in order" you mean in the order they arrived.
Secret Zebra: In the order they arrived. But it won't be a dumb merge; we'll need to use some kind of smart merge. Diff and merge algorithms aren't something I'm super familiar with, but again, I'd probably look at something like how git does merging.
Indelible Hawk: Is not my main question here. My question is, if you are taking them in the order they arrive, and you are going for such a low latency, like ten to 100 milliseconds, then it's very possible, it's very likely some of them will arrive out of order. So times 22 will arrive before times 21. How do we deal with that?
Secret Zebra: So if they were all arriving in order, that would make our life a lot easier, because the merge could not be very smart. But as long as we use a merge that is comfortable getting diffs out of order again, I'm pretty sure with git merge one branch can get ahead of another branch, and then you can use some smart merging algorithms there. I think that would handle it. If I had to write my own of what that might look like, I would probably m the diffs will arrive out of order. I mean, you could timestamp the diffs diffs, and then you could ignore diffs that were sent earlier than latest. And so then, you know, we won't be going backwards in time ever with those diffs. The trade off there though is we do run some risk of losing a little bit if two people are typing at once, but as long as they're typing in different areas. That's a good point, actually. We could timestamp the diffs, and then we could ignore diffs that happened earlier than Latif if they conflict.
Indelible Hawk: Normally I would agree with that. But if you're using a diff format, then say I want to type the word, I don't know, "word".
Secret Zebra: Right.
Indelible Hawk: And let's say we update every keystroke. Then the updates will say something like: insert letter w, insert letter o, insert letter r, insert letter d, right? If one of those arrives out of order and gets ignored, then we'll end up with a missing letter if we just...
Secret Zebra: Yeah, but I think it would only get ignored if you were editing the same cursor positions as the other user because that's where there would be a conflict. Right. If you were typing in different areas.
Indelible Hawk: Let me clarify what I mean here.
Secret Zebra: Okay.
Indelible Hawk: There's no other user typing. It's just me typing. Right.
Secret Zebra: Okay.
Indelible Hawk: And we have four changes. Number one... I don't know why I can't write it down, it's glitching. Never mind, I'll just talk it out. So the first update will be something like insert letter w. The second update will be insert letter o. Now, if insert letter w arrives after insert letter o, then that update will get ignored because it has an older timestamp.
Secret Zebra: Right.
Indelible Hawk: Now we'll have insert o, insert r, insert d. So we'll end up with Ord. But what happens with the letter w? I just typed.
Secret Zebra: Right, I see what you mean now. One idea we could use for this: we could reconsider the format here and move away from diffs to something else. But I think one way we might be able to manage it, if we are using diffs, is to create a change list here that contains some cache of the latest changes. So let's say this is "ord", right; at any given time, this is our latest state. The o came in with timestamp two, the r with three, the d with four, but we missed the w. Now we receive the w, and it comes in with a timestamp of one on it. What we would do then is insert it into its place in the change list and rerun the diff chain. Does that make sense? That's one way we could do it. I don't know, that recalculation may add some lag, which is what I worry about. Yeah, cool.
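The insert-into-the-change-list-and-rerun idea above can be sketched as follows. This is a toy model that assumes simple timestamped single-character inserts; a real diff chain would carry richer operations.

```python
# Sketch of out-of-order recovery: keep recent changes keyed by timestamp;
# when a change arrives late, slot it in and replay the whole chain from a
# base snapshot instead of dropping it.

def replay(base, changes):
    """Apply (timestamp, position, character) inserts to `base` in timestamp order."""
    doc = base
    for _, pos, ch in sorted(changes):
        doc = doc[:pos] + ch + doc[pos:]
    return doc

changes = [(2, 1, "o"), (3, 2, "r"), (4, 3, "d")]   # "w" (timestamp 1) arrived late
assert replay("", changes) == "ord"                  # naive last-write-wins loses it

changes.append((1, 0, "w"))                          # the late change finally arrives
assert replay("", changes) == "word"                 # re-running the chain restores it
```

The cost is exactly the recalculation lag worried about above: every late arrival forces a replay from the last snapshot, so the change list has to stay short.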
Indelible Hawk: I don't want to talk too much with this, so let's move on to code execution.
Secret Zebra: Awesome. Okay, so for the execution, what I'm imagining is we could have just a microservice; let's just say an exec microservice. And honestly, this could have its own little separate load balancer as well, because on any given machine we might be running multiple pods of code execution. This multi-instance here will have the latest doc. And I think for speed, we already have websockets open, so let's say there's a special signal here, like exec. So diffs and exec are getting sent here. And then when the instance with the doc in memory receives that exec call, it is going to... no, I don't want a dotted line... it's going to send a request to this load balancer with the latest code doc. And then that load balancer is going to funnel that to one of these microservices. Actually, to keep these from building up, instead of this being some kind of direct request, I think it would make sense for this to be a message queue instead, where we're just sending these jobs, right, a latest-code-doc job, that these microservices are all pulling from. And then similarly, on the way back, we're going to have the executed code's return, or console printout, so the multi-instance would be waiting for a message that it receives after the job is complete. On these machines, what they would be doing is: they receive a document; they would start... let's say I worry about security running it, so I'm going to say we start a container. That job will include the language, so we start up a container that is configured to execute code in the desired language, with the doc as input or parameter, and then the container's console output would be streamed... well, the final console output would be put in an outgoing message queue.
Let's see. Actually, one question: will the console need interactivity? Like, will I be able to do something like console input and have it wait for me to type?
Indelible Hawk: Basically just typing code in a given language and then pressing run and getting the output. Awesome.
Secret Zebra: Okay, that's great. Thank you. All right. Yeah, and so these could spin up, or we balance between any number of these, so this scales horizontally really well. I worry a little bit about security with containers as opposed to, like, a VM, so I'd just want to look into that. I think as long as it's contained to an individual machine, though, the worst thing that's going to happen is someone can break out and see some random other pad's code.
Indelible Hawk: So what will happen if, for example, someone has a suboptimal solution and it goes into a while loop, an infinite while loop?
Secret Zebra: Oh yeah, we should have a timeout. There are actually a few parameters on this container. The container will need to have a timeout, in which case we kill it and return the console text up to a certain length, let's say up to like 5 KB of console text, just in case they're printing an essay to the console.
Indelible Hawk: Sorry, what will happen if they try to use an unreasonable amount of memory, let's say in C++?
Secret Zebra: Oh, that's exactly what I was going to type. We would specify container memory... I forget what you call it... limits. What's going to happen if it hits the memory limit and can't go any further is that it's going to crash, in which case it's kind of the same as if it errors out. So we'll crash if it hits the limit, and then we could do two things. We could just return the error console log, but we may want to be even nicer and, if it hit a resource limit, return more detail, a more user-friendly error like "ran out of RAM" or "ran out of disk", something like that.
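A minimal sketch of the execution wrapper discussed here. A real deployment would run the job inside a container with CPU and memory limits (for example Docker's `--memory` flag); this sketch uses a plain subprocess just to show the timeout-and-truncate handling, and the limit values are illustrative, not from the transcript.

```python
# Sketch of the exec worker's job handler: run the submitted code with a
# hard timeout (kills infinite loops) and cap how much console text we
# return (guards against essay-length output).
import subprocess
import sys

TIMEOUT_S = 2          # kill runaway (e.g. infinite-loop) submissions
MAX_OUTPUT = 5_000     # cap on returned console text, in characters

def execute(code):
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=TIMEOUT_S,
        )
        out = proc.stdout + proc.stderr
        return out[-MAX_OUTPUT:]               # keep only the tail of the output
    except subprocess.TimeoutExpired:
        return "error: execution timed out"    # user-friendly message back to the pad

print_result = execute("print('hello from the pad')")
loop_result = execute("while True: pass")      # gets killed at the timeout
```

In the message-queue design above, `execute` is what each worker would call per job before putting the result on the outgoing queue; a memory limit would be added the same way via the container runtime rather than in this wrapper.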
Indelible Hawk: Happy with that?
Secret Zebra: Cool.
Indelible Hawk: So just one last question. In the overall system, where do you think the bottleneck could be?
Secret Zebra: I think right now, let's see, we only have one to three users on each pad. The message queues could be horizontally scaled, because we're just pulling out of them from the microservices. So I suppose the bottleneck would be this relational DB here. And I'm trying to think if we actually need immediate consistency there; we may not. If this is the bottleneck, I believe we may be able to move to something that's eventually consistent. And then I haven't shown the archiving yet, but I don't think that will be a bottleneck either, because I think that'll be a NoSQL document store that doesn't need to be ACID; it can just be eventually consistent. It should be durable, though.
Indelible Hawk: So what will happen if, let's say, we are at peak time, right? Obviously, as you pointed out, both systems, the execution and the editing, are horizontally scalable; we can add more machines, right? But what will happen if we have a limited budget? We don't want to add more machines and we're at peak load. And for some reason these users are very, kind of, testy, and they want to execute lots of code. Maybe it's peak hiring season and everyone executes code, right? And we don't want to scale the execution side. What implication does that have for editing?
Secret Zebra: I'm sorry, I think I missed some of it because it cut out. We don't have enough money, so we don't want to scale which part of this.
Indelible Hawk: So let's say we don't particularly care about executing code anymore, right? We have a limited budget there; there's only so much. Got it?
Secret Zebra: Okay, so we're not executing code.
Indelible Hawk: What implications would that have for editing if we are already at peak capacity for execution?
Secret Zebra: Okay, so if we're not executing code, what effect might that have on editing?
Indelible Hawk: Is that the, precisely.
Secret Zebra: We could probably do some peer to peer connections.
Indelible Hawk: Let me reiterate the question. In this system... oh, sorry, the current design. We are slowly increasing the load; we have more and more users, right? At some point users start compiling a lot more, right? And we can't handle all that compilation need.
Secret Zebra: Yeah, users start compiling. Oh, we could compile locally, potentially. Sorry, that doesn't sound like.
Indelible Hawk: What I'm getting to is given that we can't compile anymore because we don't want to allocate any more resources server side and we are already at peak.
Secret Zebra: Oh, like should we send them an error message or something? Is that kind of what?
Indelible Hawk: That's the first question. Let's assume we send them an error message. Or let's assume we have a workaround, like a local compilation. We'll go over that in a minute. But if we don't have any more machines, how can we ensure that the editing service is still scalable? Because I assume everything is going through a single load balancer, from what I see here.
Secret Zebra: Well, everything's going through a single load balancer, but let me just draw a line. If we're running out of machines to execute code, this queue will fill up. If there are no execution resources, we could be really not nice and just drop messages, and then the instance that interfaces with the queue just wouldn't have a nice error message to show. But this queue is what would build up if we weren't able to execute those jobs quickly enough, so we could have a timeout. Sorry, to your original question: I don't think the editing functionality itself would actually suffer at all in this design. If the execution microservices got overloaded, someone would click run and just never get back an answer to their execution, but the editing should still keep working fine. Cool.
Indelible Hawk: Happy with that. Let's go over some feedback here. I will just paste it in the shared editor. But before that, how did you find the problem? You find this, you find it difficult?
Secret Zebra: I loved it, actually, because of what it covered. I'll be honest, one of my faults is that I've only used message queues in systems like once; it's not intuitive to me to think, hey, we should use a message queue here. But this problem really got me to think that way because of these code jobs, so I liked that. I love microservices, and they're definitely my go-to. But I also love places where services that aren't fully, I forget what you call it, fully partitionable are still really useful, and this is one of those, right? Having one instance manage two streaming users' input could make this really responsive. So yeah, it's an exciting question.
Indelible Hawk: Awesome. Yeah, it's a pretty difficult question as well, right? Because it has several aspects to it. It has a collaborative aspect; from what I hear, questions like "design Google Docs" are particularly popular. And it has the code execution aspect. Both of them together make it quite a tricky problem. I will just paste the feedback in the shared editor and we'll go over it line by line. Can you see it? Yeah.
Secret Zebra: Wait 1 second. I always forget where the toggle whiteboard button is. There we go. Okay.
Indelible Hawk: Right, so it's at line 32. Plus basically means that went well, right? Minus means something was missing, there was a gap, and if there is an equals sign, which there isn't in this case, it's just a piece of advice or a thought. Cool. You started out really well: you listed functional requirements, you anticipated things, and you also discussed how we would handle conflict resolution from the very beginning. You selected last write wins as a reasonable strategy. I was sort of looking for a bit more of a trade-off discussion between different options. I think the summary there was, hey, we'll pick this because we don't think it will be a big deal if there's a small conflict, and we just pick last write wins. Which is reasonable, but I was sort of looking for other options as well, right? For example, I don't know, locking the document. Something to read up on is operational transforms. Last write wins applies fine here, because there are two people in the session and one of them is the interviewer, who will likely not write at the same time as the interviewee, which is fine. But if you get a problem like "design Google Docs", you can expect something like a book that's 1000 pages, collaborated on by maybe 100 people simultaneously. In that case you need some stronger way to solve these conflicts.
Secret Zebra: That makes sense.
Indelible Hawk: Yes. Basically, at L5 and above, one of the key traits in these interviews is that you want to discuss lots of trade-offs, right? And you did some of that. You discussed trade-offs, for example at line 38, on whether you want to update on keystrokes or send a whole diff. And you discussed some trade-offs on the storage as well. We discussed more trade-offs later, but even more would be better.
Secret Zebra: Right.
Indelible Hawk: Whenever you can think of different ways to do things, point them out.
Secret Zebra: Yeah.
Indelible Hawk: What I think was a very big miss in the initial part of the interview are these lines, 35 to 37, particularly the scale. We don't know what scale we are operating at, right? So if I say, hey, why did we design this complex system? If it's just like ten users in a coding competition or something, right, why do we do this? Why can't we use just a simple shared doc?
Secret Zebra: Yeah.
Indelible Hawk: On the other hand, maybe we have a system with like a billion users, and then I would ask you, why don't you account for data centers on multiple continents? So it's super important: if you only have like ten seconds for requirement exploration, figure out the scale. That's the most important part. The initial part, the way you went over functional and non-functional requirements, was very methodical, right? That will be appreciated in every interview. But it comes at a cost: you are spending maybe five minutes on that and you are not really getting information. I don't think you asked me any questions for the functional and non-functional requirements, maybe just about the conflicts, right? Everything else was a monologue, meaning you could have just said them upfront quickly and been done. There are some candidates who do this kind of five-minute discussion on availability versus consistency, and it always ends up "we want availability over consistency", right? There are exceptions; there are some problems where you are fine with some loss of availability for strong consistency, but those are exceptions.
Secret Zebra: That would be like a bank transaction or something, right?
Indelible Hawk: Precisely. Or if you know you want to end up with that, just say it upfront, right? 10 seconds. You could have just said, for example, hey, we have the functional requirements above, let's move on. Less methodical, but it saves you time for more discussion later. That's just advice, though. If you want to make the most out of the time for requirement exploration, figure out the scale. That's the most important thing to figure out.
Secret Zebra: That makes sense, right?
Indelible Hawk: Because you might not even need the load balancer for that, right? Or on the contrary, you might have so many connections that you need some servers just to handle connections.
Secret Zebra: Yeah. You'd need like regional replication, that kind of thing.
Indelible Hawk: And the same with latency, right? You were focusing a lot, in the middle part of the interview, on having a lightning-fast service.
Secret Zebra: Right.
Indelible Hawk: But what exactly does low latency mean? Sure, we want low latency, that's kind of obvious. But what's the number on that? Is it like three to four milliseconds? Probably not, right? And it feels like that's what you were designing for, a very low latency, almost video-editing sort of low latency. But in this case, if I'm typing stuff on line 36, you probably can't tell a latency of 200 milliseconds from ten milliseconds, and even 300 is very easy to accommodate. You can even go as high as 1 second, right? As long as the whole word arrives within 1 second, it's not that big of a deal. It's not that bad of an experience. So with 1 second, let's say 500 milliseconds of latency in mind, you can be a lot more relaxed here.
Secret Zebra: That makes sense.
Indelible Hawk: So essentially, the things you want to figure out at the beginning, some important questions to ask, are things like scale, latency, and potentially user distribution, right? Whether it's global or local. In this case, I would have said, if you had asked, that it's mostly US-based users, so your design basically worked anyway. But if it's a global system, then you'd need to account for what you mentioned, regional replication.
Secret Zebra: Yep.
Indelible Hawk: Right. And there are other things there. Security, for example: hey, can you assume everyone is well mannered, or should we have an adversarial model in mind where we might have some hackers? So security concerns, and any kind of budget constraints, like a small university versus a large company designing this. These are usually important questions to ask, and we want to clarify them upfront. I would just put budget down as potentially important, with a question mark.
Secret Zebra: Can't hurt. Sorry, can't hurt to ask. They might as well just say it doesn't matter.
Indelible Hawk: Yeah, exactly. What else? Some other things that are relevant are potentially, not the user distribution necessarily, but more like the query distribution or the activity distribution. Do we expect it around the clock, or will all these requests come within a very narrow time span? That kind of usage distribution. It could be something that needs to happen 24/7. Cool, moving on with the feedback, I'm on line 40. So typically what we want to see in these interviews are some back-of-the-envelope estimates. Are you familiar with the concept?
Secret Zebra: Sorry, can you say that again?
Indelible Hawk: Did I cut off?
Secret Zebra: Sure.
Indelible Hawk: What we want to see in these interviews are some back of the envelope estimates. Are you familiar?
Secret Zebra: Yes, I'm familiar with that.
Indelible Hawk: Cool. Because you did some of that later on, when you were trying to figure out, okay, how much bandwidth are we using? Okay, we have about, I don't know, 1 KB in the document, we update every ten milliseconds, so we'd get like 1 KB/second or 10 KB/second or whatever. That's the kind of estimate we want to see here. Now, some people have done some training, some mock interviews, they know they have to talk about back-of-the-envelope estimates at the beginning, and they just do some pointless estimates. I would say that's counterproductive, because you're wasting time and not using that number for anything useful, right? What would be a good example... say they estimate how much total bandwidth hits our servers.
Secret Zebra: Sure.
Indelible Hawk: You come up with a number, say it's 1 GB/second. What do you do with that, right?
Secret Zebra: Yeah.
Indelible Hawk: But there are some things you could have computed here that would have been useful to inform your high-level design, if you had asked about scale. One would be the number of active connections you need to maintain, because servers have a limited number of ports, so you might need a lot more servers just for the ports. Another could be the timeout: how soon do you have to time out? Because if you allow each execution to run for maybe 10 seconds, you might have a lot of concurrent processes going on that each run for 10 seconds. Another could be the total QPS, because if you update on every keystroke and there are maybe 1,000 keystrokes per session, and the given number was 1 million interviews, you're talking about 1 billion queries per day, which is non-negligible. The final one is storage. We discussed retention for ten years, right? We never covered that in the design. It would have been a relatively straightforward one, I'm not going to dwell on it too much, it's just having a process that makes sure it purges the table after ten years. But the question is, without doing any estimates, just guess: how much storage do we think we need for all the content for ten years?
Secret Zebra: Right. Yes. Sorry, what do you mean?
Indelible Hawk: So without trying to estimate it, like, intuitively, how much?
Secret Zebra: You're asking me? Sorry.
Indelible Hawk: Let's see.
Secret Zebra: It drastically depends on how many people are using it, but let's just say.
Indelible Hawk: 1 million interviews per day. But don't try to compute it, just give me an intuitive number without, like, running the numbers.
Secret Zebra: Yeah, let's say... maximum, I think like a terabyte even. Yeah. Or we could be safe and do 100 terabytes or something.
Indelible Hawk: Yeah. If you do the estimate, it's probably around 36 terabytes. So you're quite close, within an order of magnitude, and you'd be safe with 100 terabytes. But generally, right, you want to run an estimate here to be safe. Some people estimate the number and realize it's trivial; others don't, and they overdesign the system. They're like, okay, we need cold storage for this, and we need this kind of fancy tiered storage. But you don't need that. You can use a typical DB, and it can store 36 terabytes just fine. Even a laptop can store that.
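For reference, the two estimates from the discussion can be run as straight arithmetic. The 1 million interviews per day and 1,000 keystrokes per session are the numbers given above; the ~10 KB final document size is an assumption:

```python
# Back-of-the-envelope estimates for the collaborative editor.
interviews_per_day = 1_000_000        # given in the discussion
keystrokes_per_session = 1_000        # given in the discussion
bytes_per_document = 10_000           # ~10 KB of final content (assumed)
retention_days = 365 * 10             # ten-year retention requirement

# Total query volume if every keystroke becomes an update.
queries_per_day = interviews_per_day * keystrokes_per_session
print(f"{queries_per_day:,} queries/day")             # 1,000,000,000 queries/day

# Total storage for the retained final documents.
total_bytes = interviews_per_day * bytes_per_document * retention_days
print(f"{total_bytes / 1e12:.1f} TB over ten years")  # 36.5 TB over ten years
```

Both numbers land where the discussion did: about a billion queries per day, and roughly 36 TB of retained content, small enough for a single ordinary database.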
Secret Zebra: That makes sense.
Indelible Hawk: Just moving on. You discussed trade-offs around routing keystrokes versus diffs. You elected to use websockets to connect to instances routed by a load balancer, which was nice. I was looking for a bit more discussion here, though, a trade-off discussion against other approaches such as polling. If you think about it, and if you had asked about the scale, as I said, 1 million, it might not be a fancy engineering solution, but it's perfectly reasonable to just poll every single second for the whole content of the document. When the document is like 10 KB, that's not a big deal. It's relatively reasonable. You'd maybe do 1 GB/second total in some extreme instances, but 10 KB per poll is nothing.
Secret Zebra: Yeah, that makes sense. And a lot of websocket implementations may fall back to polling anyway.
Indelible Hawk: Exactly. So you simplify your system a lot, right? You're just polling the server for the most recent version they have and telling them your latest change.
Secret Zebra: Yeah.
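The poll-the-whole-document approach can be sketched with an in-memory stand-in for the server; the class and method names here are made up for illustration:

```python
class PadServer:
    """In-memory stand-in for the shared-editor backend."""

    def __init__(self):
        self.content = ""
        self.version = 0

    def sync(self, client_content, client_version):
        """One poll round-trip: the client offers its copy and version,
        the server replies with the authoritative document and version.
        The whole document travels each time, which is fine at ~10 KB."""
        if client_version < self.version:
            # Client is stale: it must adopt the server's copy first.
            return self.content, self.version
        # Client is current: accept its edit as the new document state.
        self.content = client_content
        self.version += 1
        return self.content, self.version
```

Each participant polls once a second; with the relaxed latency budget discussed earlier, a second of staleness is acceptable, and there are no long-lived connections to keep alive.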
Indelible Hawk: So it would simplify the system a lot. And that's maybe a piece of advice I have for you here: don't always assume that we want the most complex or challenging solution. Sometimes the simple one works best.
Secret Zebra: That makes sense.
Indelible Hawk: There is this operational transform, a point on line 37. It's an algorithm for solving these kinds of collaborative problems. It's relatively simple to explain but harder to implement. What it does is, if it gets a series of changes, say change one, change two, change three, applied on a document D: after it applies change one, it recomputes change two using the new image of D. So for example, you edited line 40 to do some stuff, and I inserted a new line above it. Then we know your line 40 becomes line 41, so it will just apply your changes on line 41. It's reasonably intuitive to explain, but it's hard to implement, and I don't expect people to be able to implement it. But a lot of people assume we want operational transforms because they know about them. They try to explain how it works, they try to implement it, and I'm like, fine, I'm giving you enough rope to basically try it. But why? We can just lock the whole document, or we can pick an approach like last write wins, and we're just as fine.
Secret Zebra: Yep, that makes sense.
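The line-renumbering step described above can be sketched in a few lines. This handles only the whole-line case, with a toy change format, nowhere near a full operational-transform implementation:

```python
def transform(pending, applied):
    """Rebase a pending line-based change against an already-applied one.
    Changes are toy dicts like {"op": "insert_line", "line": 40}."""
    if applied["op"] == "insert_line" and applied["line"] <= pending["line"]:
        # A line was inserted at or above the target: shift down by one.
        return {**pending, "line": pending["line"] + 1}
    if applied["op"] == "delete_line" and applied["line"] < pending["line"]:
        # A line above the target was removed: shift up by one.
        return {**pending, "line": pending["line"] - 1}
    return pending
```

So a change aimed at line 40 lands on line 41 after someone inserts a new line above it, exactly as in the example. Real OT must also transform within-line offsets and resolve concurrent inserts at the same position, which is where the implementation difficulty lives.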
Indelible Hawk: Cool. No API discussion. Typically I want to see some API discussion here as well. We discussed it maybe informally, as in, hey, we send this data, we get back this data; we send a request to execute the code and we get back the results of the compilation. I would go for more of a formal API. So what do we send? Do we send the user id? Do we send some kind of authentication token? Do we send an API version? And what format do we get back, is it JSON, is it some sort of binary data, does the client decode it? Not a big deal for this problem, but other problems might be more API-heavy.
Secret Zebra: Yeah, yeah. But even with this one, right? I mean, even if it's all over websockets, you'd still at least want to be sending something like your credential with it.
Indelible Hawk: Not just for authentication, but also discuss how the data is structured, right? Because then, when we were discussing the diff format, you'd have it written down there, and later, when we figured out, okay, maybe you want a timestamp as well, it's very easy to add that change.
Secret Zebra: Yeah, cool.
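A formal request along those lines might be written down as a schema first; every field name below is an assumption for illustration, not an agreed protocol:

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class EditRequest:
    """One edit message from client to server (illustrative fields)."""
    api_version: int
    session_id: str
    user_id: str
    auth_token: str
    timestamp_ms: int   # the ordering field that came up later
    start_line: int
    start_col: int
    end_line: int
    end_col: int
    content: str

    def to_json(self) -> str:
        # JSON on the wire; a binary encoding is the obvious trade-off to discuss.
        return json.dumps(asdict(self))
```

With the schema written down, adding the timestamp later is a one-field change that both the client and server teams can see.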
Indelible Hawk: You did back-of-the-envelope estimates for the bandwidth, and you discussed trade-offs against slower update rates. I had to point out that we can't guarantee that updates arrive in order.
Secret Zebra: Right?
Indelible Hawk: I was expecting you to identify that; especially over UDP, they won't arrive in order. You proposed to merge diffs in the order they arrive, but again, they don't necessarily arrive in order, and eventually you proposed a timestamp. What I was trying to do there is steer you away from assuming this needs a fancy diff-merge algorithm. You were kind of imagining this is something like git, or like version control. Those algorithms would typically work here, but they would be too slow.
Secret Zebra: Got it.
Indelible Hawk: Because if you think about it, say, when you run a git commit, or a git amend or a git add, it's not instant, right? You can even see it humanly: sometimes when you have like ten files, it takes maybe a second. Or when you do a git merge, it may take much more, after pulling the changes locally over the network, to just merge them in. I was expecting something really simple here. It doesn't need to be any kind of complex algorithm, something as simple as, hey, we take the start position and the start line, we have the end position and the end line, and then we have the content. And that's it. That contains all the information we need. We could have some other information, like annotations, credentials and so on, but it contains all the information we need for any change: for inserts, for deletions, for edits, for selecting something, for pasting something.
Secret Zebra: Yeah, great.
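That format (start line and position, end line and position, content) really does cover every edit. A minimal server-side apply over a list of lines, as a sketch rather than anything from the interview:

```python
def apply_diff(lines, start_line, start_col, end_line, end_col, content):
    """Replace the span [start_line:start_col .. end_line:end_col] with content.
    `lines` is the document as a list of strings; positions are 0-based.
    An insert is a zero-width span; a delete carries empty content."""
    prefix = lines[start_line][:start_col]
    suffix = lines[end_line][end_col:]
    replacement = (prefix + content + suffix).split("\n")
    return lines[:start_line] + replacement + lines[end_line + 1:]
```

Inserting text, deleting a selection, and pasting a multi-line block all reduce to this one operation.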
Indelible Hawk: And then what the server does is literally go to the text file at those positions and insert or update the content. Cool. Yeah, at some point you proposed using a timestamp to avoid out-of-order diffs, but I had to point out that if you just ignore the older diffs, it will result in something the users don't expect, like missing a letter, for example. If you keep a buffer of changes, that would work, but as you correctly pointed out, it would result in some potential lag there. What you could do is apply the updates anyway and show the user, but still keep the buffer, and if you get an old update, you can apply it and update again.
Secret Zebra: Yeah, definitely. It's like you backtrack. Not in the programming sense, but...
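The apply-now-but-keep-a-buffer idea can be sketched as a toy model where each diff is a function on the content and carries a timestamp; the names and structure here are illustrative:

```python
class BufferedDocument:
    """Apply updates optimistically, replaying the window on late arrivals."""

    def __init__(self, base=""):
        self.base = base     # content as of the start of the buffered window
        self.window = []     # [(timestamp, diff_fn), ...]

    def apply(self, timestamp, diff_fn):
        # Accept the update immediately, even if it arrived out of order...
        self.window.append((timestamp, diff_fn))
        # ...then re-derive the content by replaying in timestamp order,
        # which silently corrects any earlier optimistic mis-ordering.
        self.window.sort(key=lambda entry: entry[0])
        content = self.base
        for _, fn in self.window:
            content = fn(content)
        return content
```

A diff that shows up late with an earlier timestamp gets slotted into its proper place instead of being dropped, so the user sees a brief correction rather than a lost keystroke.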
Indelible Hawk: Initially, for the execution part, you never considered timeouts and memory limits. I pointed it out a bit and then you went a lot further there: you discussed timeouts, discussed memory limits, pointed out security concerns, discussed sandboxing and VMs. For the question I asked you, hey, how does this affect the editing, you had a very reasonable answer there: we'll just use a queue, and when it fills up, we stop the updates. In some designs, what happens is that candidates go to the execution service straight from the main messaging queue or from the main load balancer. And when that happens, the implication is that when the execution service fills up, we stop allowing collaborative edits. And that's bad.
Secret Zebra: I see.
Indelible Hawk: But in your design, you go to a separate message queue, then you can just have a limit on that queue.
Secret Zebra: Yeah.
Indelible Hawk: It's not ideal though, because let's say you put a limit on that queue, like 100, right? Sometimes 100 will work just fine, but sometimes people will have incredibly inefficient code, or a mistake in the code, and all the machines are busy for the full timeout interval. Other times they have very efficient code, it runs in like ten milliseconds, and then the machines stay idle because we limited the queue.
Secret Zebra: So you could space them out.
Indelible Hawk: Limiting the queue is fine, you probably want to do that, but with some way higher limit, and then have another load balancer.
Secret Zebra: Yeah.
Indelible Hawk: Cool. I noticed you led the discussion well, and that's what I was looking for at L5 and above. With more junior candidates, what you'll see them struggle with is finding the next topic of discussion. They talk about something and then they're like, should I talk about anything else? Are you happy with that?
Secret Zebra: Yeah.
Indelible Hawk: You identified the topics yourself. Also, good written communication, clear drawing, easy to follow along; I was always up to speed with where you were, so I liked that. If this were a real interview, it would probably be on the fence between hire and no-hire. I'd probably go with a no-hire, mainly because of the scale reasons. If you're designing a system at scale, as a senior engineer, I would expect you to ask: okay, what scale are we designing for? That alone wouldn't be a failure in itself, but there were other things, for example the API, and some trade-offs that were never discussed; I had to point out a few small things. In principle, though, I would say that aside from the scale it was very strong. So if you had asked about scale and latency, it would probably be a hire.
Secret Zebra: That makes sense. Yeah. Clarifying and designing for the correct scale and latency. If I had to choose three things, it'd probably be that, and then trade-offs.
Indelible Hawk: Yeah, I think it would be more trade-off discussion, especially on the important decisions, like for example websockets versus polling, and API discussion, having a formal API in place. Because we are mostly focusing on the server side here; in the real world you'd probably have a client-side team that would be doing this polling, doing maybe, I don't know, the UI, the syntax highlighting. So we need to build an API first, before even designing the system.
Secret Zebra: Yep, that makes sense.
Indelible Hawk: Yeah.
Secret Zebra: And the API, I guess, would also help inform some things that I might miss on the system side.
Indelible Hawk: Maybe when you design the API, you figure out, hey, you know what, we actually don't need the websockets, we're fine with polling. Or, you know what, we want a timestamp; where do we store the timestamp?
Secret Zebra: That makes sense. That makes sense. With the API discussion, would that be a good first thing to do when I move to the whiteboard, to be like, here's the client, here's the API?
Indelible Hawk: It depends. I recommend doing it first, even before going to the whiteboard, and discussing it right here in the shared editor, in this coderpad. But some people approach it in different ways: some do a high-level design first and then propose a contract based on that. I recommend doing it first because that's how you would do it in the real world. Your interviewer might be a front-end engineer, after all.
Secret Zebra: Right?
Indelible Hawk: So you want to have an API that you both agree on, you and the interviewer, before you do the high-level design. Otherwise maybe you miss something, right? Maybe we don't want to get just the compilation result; we want to get something like, I don't know, standard error as well.
Secret Zebra: Yes.
Indelible Hawk: Having that kind of discussion adds to the requirements as well.
Secret Zebra: Yeah. Cool. Awesome. All right, well, I'll definitely kind of review all of the feedback and then those three things I'll focus on.
Indelible Hawk: Excellent. Awesome. So good luck with your prep and good luck with your real interview then.
Secret Zebra: Awesome. Thank you so much.
Indelible Hawk: Pleasure. Take care. Bye.