
An Interview with a FAANG engineer

Watch someone solve the Design Robinhood problem in an interview with a FAANG engineer and see the feedback their interviewer left them. Explore this problem and others in our library of interview replays.

Interview Summary

Problem type

Design Robinhood

Interview question

For financial trading, design a platform where a client gives you trades to execute and your platform executes trades based on available buy/sell prices from the market.

Interview Feedback

Feedback about Nefarious Gargoyle (the interviewee)

Advance this person to the next round?
Yes
How were their technical skills?
3/4
How was their problem solving ability?
4/4
What about their communication ability?
3/4
Thank you for the interview today; it was great to speak with you. Today we focused on a system design round. For E5 I'd "lean toward yes"/"yes", since most signals were positive, although there was some opportunity to dive deeper on topics like SQL vs NoSQL and fault tolerance. Since I was probing mostly for breadth (not necessarily depth), I am voting "yes" here on the IO site. However, I've gotten some feedback that I can be too lenient sometimes :-)

Strengths:
- Seems great to work with
-- Receptive to hints and open to dialogue while remaining proactive and persistent in offering solutions
- Framing the problem
-- Excellent points on your intuition/experience with the setup, e.g. Robinhood vs IB
- Good back-of-envelope calculations in the beginning, e.g. on high-frequency users
- Breadth
-- Covered many crucial topics, ranging from a mostly complete API spec to a diagram of Server, Database, and Message queue
- Tradeoff illustrations
-- Notification service vs WebSockets

Potential areas of improvement:
- Time management
-- API spec(s): recommend making simplifying assumptions where necessary (with interviewer buy-in) to develop the main API
-- Once the API is settled, you'll have more time for the building blocks of the diagram and for deep dives
-- nit: sometimes jumped back and forth between topics
- Load balancer
-- Add discussion on percentiles like P90, P95 to inform how your load balancer may interact with a fixed or variable number of nodes in Server
-- Today I mostly probed breadth, but it may be good to keep in mind where you can further "show off" on specific topics when the interviewer asks
- NoSQL emphasis (e.g. Cassandra/MongoDB)
-- nit: round out your db knowledge by asking if SQL might be sufficient; illustrate what the primary keys might be (goes back to API spec and data contracts)
- nit: latency vs response time can be differentiated

Further suggestions:
- IO "Showcase" interviews, e.g. the "food delivery app" questions
- DDIA: Chapters 1 + 2, e.g. fault tolerance

Feedback about Stochastic Panda (the interviewer)

Would you want to work with this person?
Yes
How excited would you be to work with them?
4/4
How good were the questions?
4/4
How helpful was your interviewer in guiding you to the solution(s)?
4/4

Interview Transcript

Stochastic Panda: All right, so what I will then lay out is brief intro. So that'll be total of 5 minutes. Then we will just do any kind of like goals, level setting very quickly, maybe 2 minutes. And then we'll do the main system design round which will be basically 35 to 45 minutes. And then feedback, initial feedback, I should say. And basic recap. How does that sound?
Nefarious Gargoyle: That sounds great.
Stochastic Panda: Fantastic. Fantastic. Okay, so for intros, let's do one or 2 minutes each. Of course this is an anonymous platform, so feel free to disclose as much as you're comfortable with. Would you like to get started or shall I get us started?
Nefarious Gargoyle: You can go ahead.
Stochastic Panda: Okay, great. Yeah. So my name is Ivan and I've got about nine years of experience, mostly as a machine learning engineer in big tech. I worked at Amazon for a few years and I've been at Microsoft now for a few years. So a bit of experience in large tech companies, and what else? Yeah, I've been on this site now for some time, both as interviewer and interviewee, mostly in the last months as a dedicated coach. So that's me. Yeah. How about you?
Nefarious Gargoyle: Yeah, I have been a senior software engineer, mostly on the full-stack side, for a couple of years now. I think it's been like five to six years. Before that I was doing more mobile stuff. But then when I came to work in the US, I started working for Amazon first, and then after that I went to Meta. And then I found out that being a generalist is usually the way to go in a big tech company rather than being siloed into the mobile space. And right now I'm actually at a startup. But seeing that the job market seems to be picking up for big tech again, I'm planning to try to go back to a big tech company. And yeah, I think system design is probably one of the trickiest parts of the interview compared to data structures and algorithms. So here I am on interviewing.io.
Stochastic Panda: Perfect. All right, and thank you for that nice intro there. Sounds good. And I am watching the time. We're doing quite well. And in terms of goals and level setting, you mentioned some hints in your intro. Would it be fair to say that we're targeting maybe an L6 or L7, let's say relative to Amazon?
Nefarious Gargoyle: Yeah, I think at Meta it's like E5. So I'm not sure if it's like L6 maybe yet, but I think it's around there.
Stochastic Panda: Okay, so let's say that. I think Amazon and Meta levels are slightly different, but I'll say L6, or E5 at Meta. Okay, great. So that is helpful. Fantastic. So I do have a system design question ready and.
Nefarious Gargoyle: Nice.
Stochastic Panda: One thing that I'll actually note before we get started is two things. So one is, if you've done a system design interview before on interviewing.io, just so that we're on the same page, in the bottom left there is a toggle whiteboard option. Have you had some experience with it?
Nefarious Gargoyle: Yeah, I've been using that one.
Stochastic Panda: Okay, fantastic. So that's good. And then these are some general things. Of course, we tend to look at the system design interview generally across levels E5 or E6. But we're going to pay special attention to some minor deep dives that we'll do in the interest of time. Just making sure that layout kind of makes sense. Any questions on that overall layout before?
Nefarious Gargoyle: Yeah, this sounds great.
Stochastic Panda: Fantastic. All right, so I'm watching the time and let us go ahead and get started. So here's our system design main question. Here we go. So take some time to read it and we can get started.
Nefarious Gargoyle: Okay. Would you be okay if I actually copy this and bring it to the whiteboard?
Stochastic Panda: That sounds good. Sure.
Nefarious Gargoyle: Yeah. Because usually I think it's easier to just put it all on one page. Okay, so for financial trading, design a platform where a client gives you trades to execute and your platform executes trades based on available buy or sell prices from the market. Gotcha. So this platform sounds kind of like a Robinhood kind of platform, where we're creating a financial exchange trading app. So let me just put it there. I'll write the functional requirements first. Okay. Do we need to implement authentication, at least maybe just basic authentication for the user, whether they can place a trade or not? I'll just write an assumption, and if I need any clarification, I'll just ask you. Will that work for you?
Stochastic Panda: Yeah. And what I will say, and that's a really good question, is that for our purposes today, we won't focus on authentication and security. We'll actually assume that when we receive what the client gives us, in other words, trade orders, we'll assume that all security and authentication have been complete ahead of time.
Nefarious Gargoyle: Okay, so authentication is not in scope. Assume it's already secure. And then we will need to be able to place orders. In terms of the order, I do a bit of stock investment myself, so do we need to care about the different kinds of order types? Because I know there are market orders, limit orders, stop loss orders. Do we need to put that into the requirements?
Stochastic Panda: Excellent question and thank you for sharing that. You have some background here that will definitely help. So actually, yes. For each buy sell, we will assume for a little bit of simplicity, there will only be two types of orders. One is a market order that we'll try and fill, of course, as soon as possible. And as a complement to that, there will also be limit orders. And we will structure the limit orders in the best interest of the customer. In the sense, right. That if they say, hey, I want to buy 100 shares of Apple with a limit of 179, then we only execute that trade if we can find that volume at a price less than or equal to 179.
Nefarious Gargoyle: Got you. Basically, if you're putting in a buy, then you need to have someone who is willing to sell. Right? Okay. Yeah, cool. It's pretty good. Do we need to be able to show the positions of each user, or are we strictly doing just the trading part?
Stochastic Panda: So it's a good question. And the way I'll try and answer that is in two parts. One is we'll always assume that buy and sell orders are valid in the sense that if a customer says 100 buy of Apple, we'll assume they have those funds always available. And conversely, if they are selling, let's say, 1000 shares of Apple, we'll always assume that at the time that we receive that order, they have at least that many shares to.
Nefarious Gargoyle: Yeah, sounds good. Okay. Do we need to care at all about the UI side?
Stochastic Panda: A little bit, yeah.
Nefarious Gargoyle: If the user comes to Robinhood or any trading platform, usually. Okay, I want to buy Tesla stocks and probably usually they only in the platform they just show the price, the current price of the stocks or they just maybe just display the bid and ask of the stock price.
Stochastic Panda: Oh yeah. In that sense we will not worry about that. We're going to assume that when we receive a potential trade to execute, that the client has already done their due diligence. They already have all the information they need.
Nefarious Gargoyle: To display that basically to the user.
Stochastic Panda: Okay, that's correct.
Nefarious Gargoyle: Yeah.
Stochastic Panda: But one thing that I'll just briefly mention is that we will want to display to the user whether the trade we received from them is pending, successful, or perhaps canceled.
Nefarious Gargoyle: Okay.
Stochastic Panda: Mm.
Nefarious Gargoyle: So either pending, confirmed, or canceled.
Stochastic Panda: Great.
Nefarious Gargoyle: Yeah, I think I have enough of the functional requirements. Pretty much the interesting part will be the market order and the limit order, I assume, and we also assume that user does not need to search from what you're saying is that user already knows what they're going to buy and what they're going to sell. There is no such thing as like, okay, I want to search for Tesla stocks and then be able to find the Tesla stocks and then buy it because we're not displaying even the bid and ask to the user.
Stochastic Panda: That's correct.
Nefarious Gargoyle: Yeah. And now on to the non-functional requirements. So I think it will definitely need to be highly available, because for any stock trader, and I know this myself because I like to invest, it really sucks when you cannot buy or sell stocks at the point when you want to, because timing in the market is really important. That reminds me of GameStop, where Robinhood was just like, okay, you cannot buy anymore. That really sucks. And that was their choice; it's not the system's fault. But that being said, I think high availability is definitely needed. I think it definitely needs to be highly scalable, just because I'm assuming that the system we're designing is going to be something at Robinhood scale or Interactive Brokers scale, where there are a lot of users. Also, I think it needs to be consistent, or transactional is probably the right thing to say, because these are financial transactions. So you want to make sure that you get it right. Let's say you buy 100 Tesla stocks, but then when everything is finished, you actually only got like 95 Tesla stocks. So the order that was actually executed is not correct. We do not want that. I think we also want good latency, but maybe not extreme. Some high-frequency traders really require low latency; I don't think we have to go that far. For something like Robinhood, I'm pretty sure the latency is not great when they're placing a market order, as long as they hit something. Because usually a high-frequency trader can see one user putting in an order and then front-run it just because they have lower latency. But I think in this case the latency should be low, but not crazy low.
So I think that would be probably a good non functional requirement that we can start off. And I think, I just want to do a bit of calculation maybe in terms of how many daily active users that you're thinking of. Is it maybe, I don't know, maybe like 10 million daily active users. Would that be a reasonable estimate?
Stochastic Panda: Yeah, 10 million, let's say, max, daily active users is a good start.
Nefarious Gargoyle: I see. Okay, max. So that would be the worst-case scenario in terms of, I guess, storage. With 10 million daily active users, I think retail traders usually don't go crazy putting in hundreds of trades per day, unless you're, again, an algo trader. I would say maybe on average we could say five trades per day.
Stochastic Panda: That's a good question. And I liked your mention of interactive brokers. Let's make this problem a little bit more geared toward interactive brokers where there's flexibility. And let's assume that every day there's going to be some traders that are pretty low frequency. Yeah, maybe five trades per day. But let's also assume that there's going to be some daily traders that are going to make up to 100 requests per second, max.
Nefarious Gargoyle: Okay, so how about we say that this high algo, high frequency trader will probably be like 5% of the DAU? Will that be actually a good estimate?
Stochastic Panda: Yeah, I think that's a good start.
Nefarious Gargoyle: Okay, so let me just bring up my calculator. Five percent of 10 million, that's around 500k high-frequency users. And I think these high-frequency users will probably do, we'll just call it maybe 100 trades per day.
Stochastic Panda: No, rather, let's assume that these high frequency users will at most do 100. They'll submit at most 100 trades or request per second, right?
Nefarious Gargoyle: Oh, trade requests per second. Okay. Yeah, if that is the case, since it's 100 QPS compared to the five trades per day per user, we'll just calculate the one with the high frequency, because those are the ones who are very quick. And in terms of storage, obviously, we need to store the transaction data of each of their orders and what's executed and whatnot. So we'll just calculate based on that. So that's going to be 500k times 100. And this is per day, so times 86,400, since that rate is per second. In terms of the order record, I don't know, maybe I'll just put it at around 50 as an estimate. And let me just use Google calculator to get a quick estimate of what this equals. That's a really high number. Maybe I will just, if I can do 500. Okay, so it's around 2.166. And there's another, like, K, K, B, that's basically megabytes. And I think if you divide it by three and three, basically it's like giga, tera, peta. So I think it's around 216 petabytes per day. Yeah, I don't think I have to calculate for the retail users, because I think I could, but I think the estimation of this would probably be, I would just say, like 250 petabytes. Maybe 250 petabytes per day. Yeah, so that's more on the transaction data, which is a lot. So we have to make sure that we account for that later on. I mean, as long as we can scale horizontally, in terms of storage it's always easy to scale. I think the more interesting part is the financial transaction part. Okay, so with that, I'm going to go to the whiteboard and start drawing the diagram. Sounds good. All right, so I'll start with the user. The user calls the server. I think I'll just write the API spec real quick. Yeah, so I think it will be like, create order.
And the order, I think, will be in terms of the metadata. First we have the order type. I'll make it a super simple order type. And maybe the ticker code. This one will be like TSLA for Tesla, or BA, which is Boeing. And I think there will also be the number of shares. And for a market order, basically, you don't have to put a number for the price, but for a limit order it's going to need a limit price.
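The storage estimate from earlier in this turn can be checked with a minimal Python sketch. The 5% share and the 100 requests per second come from the conversation; the record size is an assumption, since the unit of the spoken "50" is unclear. With 50 bytes per order the total is about 216 terabytes per day; the 216-petabyte figure corresponds to roughly 50 kilobytes per order.

```python
# Back-of-envelope storage estimate for the high-frequency users only.
DAU = 10_000_000          # max daily active users (from the interview)
HF_PERCENT = 5            # share of high-frequency traders (from the interview)
REQS_PER_SEC = 100        # peak requests per second per high-frequency user
SECONDS_PER_DAY = 86_400

TB = 10**12
PB = 10**15


def daily_bytes(bytes_per_order: int) -> int:
    """Bytes written per day if every high-frequency user sustains the peak rate."""
    hf_users = DAU * HF_PERCENT // 100            # 500,000 users
    orders_per_day = hf_users * REQS_PER_SEC * SECONDS_PER_DAY
    return orders_per_day * bytes_per_order


# Assumption: 50 bytes vs 50 kilobytes per stored order record.
print(daily_bytes(50) / TB, "TB/day")
print(daily_bytes(50_000) / PB, "PB/day")
```

This is a worst-case bound: it assumes every high-frequency user sends the maximum rate for all 24 hours.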
Stochastic Panda: Right.
Nefarious Gargoyle: This will be optional because we could obviously make it look a bit nicer. But I think, basically, if the order type is a market order type, then you buy whatever number of shares. Or like, if you sell, then you sell whatever number of shares. If it's a limit order, then you need to include the limit price. Either what is the price that you want to buy? Or what is the price that you want to sell?
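The create-order payload described here could be sketched roughly as follows. This is only an illustration: the class name, field names, and enum values are assumptions, and `Decimal` is chosen here for prices rather than anything stated in the interview.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum
from typing import Optional


class OrderType(Enum):
    # Hypothetical enum values; side and kind are folded into one field.
    MARKET_BUY = "market_buy"
    MARKET_SELL = "market_sell"
    LIMIT_BUY = "limit_buy"
    LIMIT_SELL = "limit_sell"


@dataclass(frozen=True)
class CreateOrderRequest:
    order_type: OrderType
    ticker: str                            # e.g. "TSLA" or "BA"
    num_shares: int
    limit_price: Optional[Decimal] = None  # required only for limit orders

    def __post_init__(self):
        is_limit = self.order_type in (OrderType.LIMIT_BUY, OrderType.LIMIT_SELL)
        if self.num_shares <= 0:
            raise ValueError("num_shares must be positive")
        if is_limit and self.limit_price is None:
            raise ValueError("limit orders need a limit_price")
        if not is_limit and self.limit_price is not None:
            raise ValueError("market orders must not carry a limit_price")


market = CreateOrderRequest(OrderType.MARKET_BUY, "TSLA", 100)
limit = CreateOrderRequest(OrderType.LIMIT_BUY, "TSLA", 100, Decimal("200"))
```

Validating the optional price at construction time keeps the "limit price only for limit orders" rule in one place.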
Stochastic Panda: Okay, got it.
Nefarious Gargoyle: Yeah.
Stochastic Panda: And just so I'm following order type, will that be either buy or sell?
Nefarious Gargoyle: Yeah, good question. My enum is going to be, I just want to make it super simple, it's going to be like market buy, market sell. Yeah, I know I can put another argument there, but like I said, I prefer to keep it simple. And this is more of an implementation detail, like how we want to do it in the back-end layer. It's all the same. All right. So now I will just put it here. It goes to the server, and the server will obviously have to route the logic. I think the interesting part here is how we make sure that the limit buy and limit sell are actually done correctly. The first thing I had in mind is that we use some kind of Redis. I know that it's actually pretty good with this kind of financial transaction. Let me just put this a bit below so we have a bit more space. Yeah, I think Redis would be a good fit. I read a little bit about it, and it's actually really good with high write throughput as well, especially since we have such a high throughput. So one Redis instance will definitely not be enough. What are the ways for me to scale that? We probably want to make sure that Redis is sharded across the tickers. That's one way of doing things. Let me just write here, like, shard Redis across tickers. And some tickers might be very hot, and some ticker codes may not be very popular. So one shard of Redis could have multiple tickers, but if a certain ticker is super popular, then maybe we'll just reserve one Redis instance for that specific ticker so that we can address the hotspot for that ticker.
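A minimal sketch of the ticker-based sharding idea, assuming a fixed pool of shared shards plus dedicated shards for hot tickers. The class name and shard names are hypothetical, and the router only picks a shard name; it does not talk to Redis.

```python
import hashlib


class TickerShardRouter:
    """Maps a ticker to a shard name; hot tickers get a dedicated shard."""

    def __init__(self, shared_shards, dedicated=None):
        self.shared_shards = list(shared_shards)
        self.dedicated = dict(dedicated or {})   # ticker -> dedicated shard name

    def shard_for(self, ticker: str) -> str:
        if ticker in self.dedicated:
            return self.dedicated[ticker]
        # Stable hash so every server routes the same ticker to the same shard.
        digest = hashlib.sha256(ticker.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shared_shards)
        return self.shared_shards[index]


router = TickerShardRouter(
    shared_shards=["redis-0", "redis-1", "redis-2"],
    dedicated={"TSLA": "redis-hot-tsla"},
)
print(router.shard_for("TSLA"))   # the dedicated shard
print(router.shard_for("BA"))     # one of the shared shards
```

Because all orders for a ticker land on one shard, buyers and sellers of that ticker always see the same set of resting orders.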
Stochastic Panda: Got it. Let me ask one brief question here. If we can spend maybe one or 2 minutes on the pros and cons you see of sharding across tickers versus, let's say, sharding across users.
Nefarious Gargoyle: I think the main problem with that is if you shard across users and you want to do transactions. Right? If user A goes to shard number one and user B goes to shard number two, then the Tesla stock in instance number one and the Tesla stock in instance number two might have different data, might have different limit buy and sell orders available. That would be kind of a bad experience and technically not right, because you want to make the market fair. It's bad for a trader to see, hey, that trader actually came in later, but he got a better deal, when my order came in at the same time. Although it's not exactly visible to the users, I think it's a bad user experience in a sense.
Stochastic Panda: Got it. Yeah. And that kind of makes sense with respect to your good bullet point on consistency. Okay, cool. Yeah, let's continue.
Nefarious Gargoyle: Yeah, let's continue. So the other thing that I want to touch on is that we obviously need some persistence. For this persistence there are multiple ways. I know Redis has an integrated CDC, so we can leverage that, or we can do something maybe a bit more old school: we also send these logs to a message queue over here. So we have a message queue to send all those transactions over here. And this one will be a lot slower, so obviously we cannot use it for updating the order status, or whether the order is filled, or anything like that. This one is mostly for the database that we're going to use for persistence, and also maybe for data processing or something like that that we want to do later on. Maybe, like Robinhood, they want to aggregate all the data later on for their own purposes. So for this one I think we'll use something like MongoDB, or if not MongoDB, then I'll probably use Cassandra, because I heard Cassandra is, it's still NoSQL, but I think it's better in terms of high write throughput compared to reads. MongoDB, I think if you scale it enough, can do this as well, but from what I heard it's better as a high-read, high-write store. So something like Cassandra will be good for us to store the transaction logs for later processing. And the thing I want to touch on over here in Redis, since Redis is in memory, is more the limit order and limit sell. Oh, actually this might be a tangent to the functional requirements for limit orders. Are we just setting the limit order expiration to one day?
Stochastic Panda: Good question. And yeah, for the sake of this scope, yes. Let's just assume it's for the day.
Nefarious Gargoyle: Sounds good. Okay. So if that is the case, and we are using something like Redis to act as the limit order liquidity. What I mean by that is, from my understanding, Redis has some kind of sorted set, or I think like a ZRANGE. It's not really off the top of my head, but basically with that sorted set we are able to sort both the limit buys and the limit sells and order them by price ranges. Let me just write this down. Let's say, for example, the user sends a limit order: buy Tesla, 100 shares, at a price of, I don't know, $200. So we will have a sorted set here, I think the way they do it for game scoring, like a leaderboard. We can use this data structure here. So it's going to be Tesla, and then 100 shares, buy, at the price 200. The key will maybe be Tesla buy, and then like 100 shares at $200. I think we're going to sort it based on the price, and then the volume of shares is going to be sorted from low to high. Sorry, it's going to be sorted based on the price first. So when you put this order into the Redis sorted set, it will try to find the matching one for Tesla as well, with the key maybe Tesla sell. And we'll see that at the price of, I don't know, $199, maybe they only have 50 shares. But since it's sorted, there's going to be another entry in the sorted set over here, where maybe it's going to be 201, or maybe just 200, and that's going to be, I don't know, another 50 shares. So when the order goes to Redis and looks into this, it'll be able to buy these two blocks of limit sells, and then we'll just return back to the user.
Usually, I think, Interactive Brokers says, okay, your order has been filled at an average price of such and such, which maybe I should describe a little in the API spec. So it will return the order with, maybe, the status, average price, filled, and so on. Obviously, let's say in this instance the buyer decided, okay, I just want to buy at 198. In that case the limit sell is not going to be hit, because when you look at Redis, there are only sellers who are willing to sell at 199. So what is going to happen is that Redis is just going to keep it as a buy in the buy limit order set: we put it under Tesla, buy 100 shares, with the price as 198, and then the status that's sent back to the user is going to be, I guess, pending. So I'll just put the statuses here as pending, filled, and I think canceled, if you want to try to cancel. And if that is the case, sorry for going back and forth, but you might need to be able to cancel an order as well, which we didn't really touch on in the functional requirements, but I think it makes sense. I think we should have the order ID. There you go. Cancel.
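The matching walk-through above (buy 100 Tesla at 200 against sells of 50 at 199 and 50 at 200) can be sketched in plain Python. This is an in-memory stand-in for the price-sorted set, not Redis itself, and the class and method names are hypothetical.

```python
import bisect
from itertools import count


class SellBook:
    """Resting limit-sell orders for one ticker, cheapest price first."""

    def __init__(self):
        self._asks = []        # sorted list of [price, arrival_seq, shares]
        self._seq = count()    # tie-breaker: earlier orders at a price go first

    def add(self, price, shares):
        bisect.insort(self._asks, [price, next(self._seq), shares])

    def match_limit_buy(self, limit_price, shares):
        """Fill as much as possible at prices <= limit_price.

        Returns (filled_shares, average_price); average_price is None if nothing filled.
        """
        filled, cost = 0, 0
        while self._asks and filled < shares and self._asks[0][0] <= limit_price:
            ask = self._asks[0]
            take = min(ask[2], shares - filled)
            filled += take
            cost += take * ask[0]
            ask[2] -= take
            if ask[2] == 0:
                self._asks.pop(0)
        return filled, (cost / filled if filled else None)


book = SellBook()
book.add(199, 50)
book.add(200, 50)
filled, avg = book.match_limit_buy(limit_price=200, shares=100)
status = "filled" if filled == 100 else "pending"
print(status, filled, avg)
```

With a limit of 198 instead, no resting sell qualifies, nothing fills, and the order would stay pending on the buy side.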
Stochastic Panda: Got it.
Nefarious Gargoyle: Yeah. And then we'll just return back the same thing with the same, that is going to be canceled.
Stochastic Panda: All right. And these are good points. And maybe just one note, we're not designing the exchange, so to speak, just in case you were going on that path in the sense that we'll interact, of course, with we can assume one existing exchange where we can always say, hey, I've got a customer that wants to buy 100 at 198, can you fill it or not? And then we'll get a response from the market that says yes or no over time. And also, yeah, we do want to make this somehow asynchronous as well with respect to users.
Nefarious Gargoyle: Yeah, good callout. If that is the case, then okay. My thinking was that if we use Redis, that's pretty fast, and we should be able to fill the order immediately. If we cannot find a match, or if it's not fast enough, then we could have another message queue to do this asynchronously. We could return the order as pending immediately from the server. And then instead of calling Redis here directly, my bad, it's kind of small, we could have something a bit isolated. We'll just call it the transaction service. So we could have a message queue where one consumer goes to Cassandra and another one goes to the transaction service. This way I think it's a little bit slower on latency. But again, we have just said that we do not need super high frequency, and this is for a retail platform anyway. This transaction service is going to be the one to go here and try to fill the order, and if we are able to fill it, then we'll tell the server that the user's order has been filled. I guess it could be that we just go to a message queue, and usually, if you're Robinhood or Interactive Brokers, I think Interactive Brokers doesn't really have it, but if you're Robinhood, you'll be able to go to a notification service later on. This could be something that connects to the Apple Push Notification service, and that will go to the user. That being said, I know there are people who stare at their app all the time to see whether their order is filled or not.
So we could use the notification service as a way for them to know whether the order is filled or not. Or we could also have a WebSocket connecting them to the server, so they immediately get that feedback of like, oh, your order has been filled, or your order is still pending. So that will be a lot faster for them. I can add that over here if that's something that you are interested in as well.
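The asynchronous flow described in this turn (return "pending" right away, let a transaction service fill the order later, then notify the user) could look roughly like this. It is a single-process sketch: `queue.Queue` stands in for the message queue, and `try_fill` and `notify` are hypothetical callbacks for the exchange call and the notification service.

```python
import queue
from itertools import count


class OrderGateway:
    def __init__(self):
        self.status = {}            # order_id -> "pending" | "filled"
        self.mq = queue.Queue()     # stand-in for the message queue
        self._ids = count(1)

    def submit(self, ticker, shares):
        """Server path: record the order, enqueue it, and return immediately."""
        order_id = next(self._ids)
        self.status[order_id] = "pending"
        self.mq.put((order_id, ticker, shares))
        return order_id

    def run_transaction_service(self, try_fill, notify):
        """Consumer path: drain the queue and try to fill each order."""
        while not self.mq.empty():
            order_id, ticker, shares = self.mq.get()
            if try_fill(ticker, shares):
                self.status[order_id] = "filled"
                notify(order_id, "filled")


gateway = OrderGateway()
oid = gateway.submit("TSLA", 100)
print(gateway.status[oid])          # still pending until the consumer runs

sent = []
gateway.run_transaction_service(
    try_fill=lambda ticker, shares: True,                      # pretend the exchange filled it
    notify=lambda order_id, state: sent.append((order_id, state)),
)
print(gateway.status[oid], sent)
```

Whether `notify` goes through a push-notification service or a WebSocket is the latency trade-off discussed next.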
Stochastic Panda: Yeah, and you anticipated one of my questions here, which is the trade-offs that you see between drawing this arrow from basically transaction service to message queue to notification versus having an arrow that goes from transaction service to server to user.
Nefarious Gargoyle: Yeah. So if we use WebSocket, we can definitely do something like, okay, you immediately go back to the user over the WebSocket here as well. As for the pros and cons, one pro of using WebSocket is definitely lower latency and, I guess, a better user experience in general. Going through the message queue, it's still functional and available as well, and you will still eventually get the update. But the downside is, let's say you buy 200 shares of Tesla at a price of 200, and it went through, but the order that you see is still pending. As a user, you might get impatient and say, you know what, I'm going to cancel this and change the limit to 201 or 202 or something so I can get it filled. And what can happen is that the earlier order was actually filled. This actually happened to me one time when I was investing. The earlier order was filled, you tried to cancel it, that didn't work, and you created a new order. That would be a problem, because now you've bought 400 shares, and you pay extra commission to the broker if you're going to sell them again. One thing you can do to minimize this, at least in terms of the modification, is use something like an idempotency key. So let's say when you put the order to the server, the server will hash it into the order ID here, and maybe we can use that as the idempotency key. And then when you buy the 200 shares of Tesla, and sometime later you want to change it, then using the idempotency key, that request will just fail. It's just like, okay, you already bought, it's already filled, we will not allow you to buy another one. That doesn't seem to be what you want to do. But it's a different story if you try to create a new order because you got impatient.
In that sense, using a WebSocket is better. At least we minimize that latency. When you buy and it gets filled, you immediately get the feedback and you're not going to accidentally create a new order. The con is that it's going to take more resources, especially because we have a lot of throughput here, a lot of it from the high-frequency traders, and that will require you to run multiple WebSocket servers and add more complexity in terms of managing which user connects to which WebSocket server, which is not impossible, but just adds more complexity. But yeah, if that is a hard requirement for the user experience, that can definitely be done. I think with the current system design, at least we've got something functional, and maybe we can still minimize the errors from the user side.
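One common way to apply an idempotency key is sketched below: a repeated request with the same key returns the original order instead of creating a second one. This is an assumption about how the key would be used, with the client supplying the key; the class and field names are hypothetical. As noted in the conversation, it does not stop a user from deliberately placing a brand-new order.

```python
class OrderServer:
    def __init__(self):
        self._by_key = {}     # idempotency_key -> order dict
        self._next_id = 1

    def create_order(self, idempotency_key, ticker, shares, limit_price):
        # A retry or duplicate submit with the same key gets the existing order back.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        order = {
            "order_id": self._next_id,
            "ticker": ticker,
            "shares": shares,
            "limit_price": limit_price,
            "status": "pending",
        }
        self._next_id += 1
        self._by_key[idempotency_key] = order
        return order


server = OrderServer()
first = server.create_order("key-abc", "TSLA", 200, 200)
retry = server.create_order("key-abc", "TSLA", 200, 200)   # same key: no second order
other = server.create_order("key-xyz", "TSLA", 200, 201)   # new key: a genuinely new order
print(first["order_id"], retry["order_id"], other["order_id"])
```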
Stochastic Panda: Got it. Thank you. Yeah, I think those are good pros and cons on the utility of WebSockets, but also the added complexity that they bring. And basically in the remaining time, let me ask about two topics. The first one is a sort of follow-up on WebSockets and how users are interacting. And my question is how will you handle load, let's say at peak times? Maybe if we assume there's going to be a lot of trading in early morning, like 09:30 a.m. Eastern, and maybe a lot of trading peaking near market close, like 04:00 p.m. Eastern or so, but maybe not as much trading around lunchtime, how would you go about handling peaks in transactions per second?
Nefarious Gargoyle: Yeah, good question. I think I would probably add a load balancer somewhere between here, and I'd use a load balancer over here. And again, I think we previously touched on, like, hey, we use a shard for each of the tickers. I think that idea can also be used over here. Either that, or we just make it simple. We just scale up the server over here with just a round robin algorithm. Or we just use some kind of consistent hashing using the user. I think in this case it's okay, we use maybe the user id and the buy order, for example, and the ticker code as the consistent hashing key so that you can scale up over here and it can handle the throughput. And then for the message queue over here, I think this is the part that we want to make sure that we can scale up as well. Again, I think this one is going to be the big bottleneck over here if you want to just keep one message queue. We probably want to separate the message queue into multiple message queues. And again, the other way of doing this I think will be to make it align with the transaction service. I would probably just use something like a queue per ticker, or maybe a combination of tickers for the low-volume ones. But the ones that are high volume, probably something like Tesla, will probably have their own message queue, and we're probably going to want to make sure that we can have an auto scaling cluster as well. And I think with most of these, like AWS SQS, they have an auto scaling cluster that we can deploy, or we can just manually schedule that maybe half an hour before market open. We just calculate what is the maximum amount of QPS that's probably going to be going to each of the services over here and then we'll scale it up so that all of them can actually do the transactions.
Redis I'll probably be a bit less concerned about, because I know that, I think what I read is that Redis Enterprise can go up to 2 million queries per second, which is very high compared to our QPS, even with the highest-frequency traders. So that one will probably be something I'm less worried about. It's mostly the transaction service and the message queue and then the server over here. The Cassandra one I think should be fine as well to shard, maybe a couple of them. But because this one is more like data processing, like post-processing transactions, I'm not super concerned about it, apart from whatever machine learning we might have to do overnight, since market time is only like nine to five and then you're done. We do not need to create many consumers to consume the message queue fast enough, at least for the Cassandra part. And I think that should be doable.
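The consistent-hashing routing mentioned in this answer could look roughly like the sketch below. It is an illustration under assumptions: the node names, the number of virtual nodes, the use of MD5 as the ring hash, and the `routing_key` helper (user id plus ticker) are all hypothetical choices, not details given in the interview.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._hashes = []   # sorted positions on the ring
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # MD5 is used only to spread keys evenly, not for security.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self._vnodes):
            position = self._hash(f"{node}#{i}")
            self._owners[position] = node
            bisect.insort(self._hashes, position)

    def remove_node(self, node):
        kept = [h for h in self._hashes if self._owners[h] != node]
        for h in self._hashes:
            if self._owners[h] == node:
                del self._owners[h]
        self._hashes = kept

    def get_node(self, key):
        if not self._hashes:
            raise ValueError("ring has no nodes")
        # Walk clockwise to the first virtual node at or after the key's position.
        idx = bisect.bisect_left(self._hashes, self._hash(key)) % len(self._hashes)
        return self._owners[self._hashes[idx]]


def routing_key(user_id, ticker):
    """Hypothetical routing key combining user id and ticker."""
    return f"{user_id}:{ticker}"
```

The property that matters for scaling up before market open and back down afterwards is that adding or removing a node only remaps the keys that node owned, so most users keep talking to the same server or queue partition.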
Stochastic Panda: Okay, cool. Yeah, and thank you for that. I think load balancer makes a lot of sense and I like what you mentioned on auto scaling. So let me ask my basic last question regarding the topic on a database. So my question is that we do want to store some history of the orders that we received and whether they were filled. And yeah, in principle we could use something like Cassandra. And my question is, were you thinking of a NoSQL or a SQL setup to store the records of these transactions?
Nefarious Gargoyle: I think I was thinking of both SQL and NoSQL. To be honest, over here we can definitely use SQL as well with the transactions. Although, I think Cassandra also definitely supports transactions as well. Yeah, since both kind of do transactions, you can do either one right now, because of technology like, for MySQL, for example, I know PlanetScale uses Vitess to be able to scale to whatever amount of throughput. I think Cassandra is just kind of like the old-school NoSQL way of getting high throughput and still being able to have good transaction consistency. I think the only case where I'd really, really want to go for MySQL is maybe if we're doing something like a payment between users, like Venmo. Maybe if we have something like, okay, metadata that a user is friends with this guy, and then you need to do some kind of transaction and you might need to have a table join of transactions between users. That is maybe the good point for SQL. Maybe then we should use something like MySQL with a Vitess kind of technology on top of it to make it scalable. But I think in this scenario, since users and orders, you know, we have a very simple relationship between users and buy orders, and we don't have a graph kind of network like spanning social media, I feel like Cassandra is probably the better choice. And yeah, I think the only consideration then in terms of NoSQL is whether we choose Cassandra or something like MongoDB. From my understanding of Cassandra, the table formatting is column formatting instead, so it's faster in writing. So it's better for high write throughput compared to MongoDB, where the data is more like JSON objects and it's easier to read and scale, but not so good with just writing. Maybe MongoDB is easier for most engineers to develop on because it's more user friendly compared to Cassandra. That's why I think Cassandra will probably be the best fit for this one.
Stochastic Panda: Got it. Thank you. All right, so I think we're at a good overall stopping point. I'm just checking the time and yes, I think we did well on time.
Nefarious Gargoyle: Cool.
Stochastic Panda: So if you're open to it, let me take maybe three or 4 minutes for initial kind of high level feedback. I'll incorporate kind of like a recap there of what we covered. And then I do want to make time in case you have some comments or questions for the last few minutes of our time today. Does that sound good?
Nefarious Gargoyle: Sounds great, yeah.
Stochastic Panda: Excellent. Cool. So yeah, as a recap, I think we covered a lot of ground, right? The functional requirements, non-functional requirements and calculations, and especially the API spec. I think that was one of the highlights in terms of what the user should expect, in terms of what they need to give us and also in terms of what we need to provide to the user. And we also got through the main diagram. So on those points, covering a lot of ground in terms of breadth, that was really good. In terms of depth, I think you showed a lot of depth on distinguishing, let's say, between WebSockets and the message queue. For me, that was a highlight, especially given the complexity of the problem, right? That on the one hand we have users that are maybe just going to make or submit five requests in a given day versus other users that might submit 100 requests per second. And that can really add up. So those were some highlights there. Yeah, I took some notes. The one nitpick that I would make is that kind of halfway through the design here, we jumped a little bit back and forth between the API spec and some things that you were writing down. And to be honest, I think a reasonable interviewer will kind of assume that's natural sometimes, right? As we develop the system, we might say, hey, by the way, now that I've got this building block, let me come back to the API spec and add some things to it. So I think that's perfectly natural. A reasonable interviewer should not ding you for points there, in my opinion. But after, let's say, going through functional and non-functional requirements, if there's an opportunity for you to say, hey, let me spend some time on the data contract and API spec, and make whatever simplifying assumptions you need to get those APIs basically down, then it might be easier to design the overall flow afterwards so that your flow is kind of restricted in that sense. Right. So that's one suggestion I might make there.
And also, it might be a bit of a nitpick, and interviewers will differ on the following, but with respect to what we want to store or write in our database, that might be another thing that you want to include, perhaps in your functional requirements, right? We didn't touch on it directly, but implicitly in the UI, we said, okay, we are going to be a bit sparse or spartan with respect to what the user sees in their UI. But as we are storing their trades, we might raise the question like, okay, as a user, let's suppose that today is Monday, and then tomorrow, Tuesday, maybe before the market opens up, I want to get a record of everything that happened yesterday on Monday. Right? And how would I do that? Well, if I made a simplifying assumption on my API spec, then a SQL approach might be good to do, because then the user queries our system and says, okay, here's my user id, here's my time range, now give me all of the records that happened. Basically a columnar setup might be simpler than something more robust, like a topics handler, right? Sometimes Cassandra or other NoSQL options can shine when we're considering things like topics and we might, you know, have documents of varying length. But in this case we could make the assumption that we're going to say, hey, we're going to be very atomic about this. And our primary sort key, if you will, for lookups from the user perspective will be the user id followed by, let's say, a time range, and that'll be kind of sufficient. So that's kind of just another nitpick. You, I think, were preferring Cassandra, which is totally fine, but if there's time in the interview, say, like, hey, limited time, I went with Cassandra, here's why. But another way, if we were restricted to using SQL here, you know, would be one way to do it, and my primary key would be, let's say, order id or user id, something like that, to give a complementary sense. So that's my initial high-level feedback there.
And finally I would say, again, maybe a nitpick, maybe more important for others, is on fault tolerance and load, right? So we did touch on a load balancer, but sometimes it's good to consider, like, okay, I've got this load balancer, I know things are going asynchronously and I'm sending trades from millions of users, but what happens if a node in my server goes down? How will I handle it? And there you can get into things like, well, from the user perspective, right, I want to make sure I keep or store that order somewhere, but maybe I can rely on some redundancy or some multi-geography things such that some other node in my server tries to match the order with a market. So that's one thing to maybe keep in mind when we're talking about load and load balancers. It's great, I'm able to handle all this traffic. But what if something like a node goes down, how would I handle it?
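The SQL alternative raised in this feedback, with lookups by user id followed by a time range, can be sketched as below. This is an assumption-laden illustration: the `orders` table, its column names, and the sample rows are hypothetical, and SQLite is used only so the example is self-contained, not because the interview proposed it.

```python
import sqlite3


def create_schema(conn):
    # Composite primary key: user first, then time, so "all orders for this
    # user in this time range" is a contiguous range scan on the key.
    conn.execute(
        """
        CREATE TABLE orders (
            user_id           TEXT    NOT NULL,
            created_at        TEXT    NOT NULL,  -- ISO-8601 UTC timestamp
            order_id          TEXT    NOT NULL,
            ticker            TEXT    NOT NULL,
            side              TEXT    NOT NULL,  -- 'BUY' or 'SELL'
            quantity          INTEGER NOT NULL,
            limit_price_cents INTEGER NOT NULL,
            status            TEXT    NOT NULL,  -- e.g. 'FILLED', 'PENDING'
            PRIMARY KEY (user_id, created_at, order_id)
        )
        """
    )


def orders_for_user(conn, user_id, start, end):
    """Return a user's orders with start <= created_at < end, oldest first."""
    cursor = conn.execute(
        "SELECT order_id, ticker, side, quantity, status "
        "FROM orders "
        "WHERE user_id = ? AND created_at >= ? AND created_at < ? "
        "ORDER BY created_at",
        (user_id, start, end),
    )
    return cursor.fetchall()


# Hypothetical sample data: two Monday orders and one Tuesday order.
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("user-1", "2024-01-08T14:31:00Z", "o-1", "TSLA", "BUY", 200, 20000, "FILLED"),
        ("user-1", "2024-01-08T20:55:00Z", "o-2", "TSLA", "SELL", 100, 20150, "FILLED"),
        ("user-1", "2024-01-09T14:30:05Z", "o-3", "AAPL", "BUY", 10, 18500, "PENDING"),
        ("user-2", "2024-01-08T15:00:00Z", "o-4", "TSLA", "BUY", 5, 20010, "FILLED"),
    ],
)
monday = orders_for_user(conn, "user-1", "2024-01-08T00:00:00Z", "2024-01-09T00:00:00Z")
```

The same access pattern maps onto Cassandra with `user_id` as the partition key and the timestamp as a clustering column, which is one way to present the two options side by side when an interviewer asks about SQL versus NoSQL.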
Nefarious Gargoyle: Like a ZooKeeper, basically, to keep track of the health checks of each of the instances. And we'd be able to replace the instance with another instance that has a copy of the data.
Stochastic Panda: Yeah, exactly, things like that. And that can also kind of tie back to the database, right? Like if we can store things in a kind of good fault tolerant way in our database and then have our nodes interact with it to go to the market, that could be another discussion point. But anyways, pretty big problem that I provided and that's my initial feedback there. Any questions or comments from your side?
Nefarious Gargoyle: Yeah, I think, I mean, just a question. Do you think this will be a pass as an E5? And the other things that you touched on just now in the feedback, is that something crucial for me to pass the interview?
Stochastic Panda: Yeah, for E5, I think I would lean towards yes. And I think to get to, let's say, an even clearer yes, it's some of the generic things that they sometimes look for in E5 and system design, which you touched on in the non-functional requirements, like highly available, highly scalable, consistent. What I would recommend to cover those bases is basically once you've got a simple API spec and maybe some of the building blocks, not necessarily all of them in the design flow, then you can say, hey interviewer, I've got these four things in my non-functional requirements. Which of them do we want to double down on? Can I double down on, let's say, high scalability? And a good interviewer would say, like, yeah, definitely, let's do that. Or they might pick something else so that you can then really demonstrate and, as some people call it, show off, maybe on your knowledge of load balancers or things like that. That's one thing I might suggest there.
Nefarious Gargoyle: Yeah. So basically I need to just ask a little bit on the interviewer side which of these to focus on. Maybe, like you said, fault tolerance is probably another one that I should have touched on as well for the non-functional requirements. And then maybe just ask which one of these is definitely the highest priority in terms of non-functional requirements.
Stochastic Panda: Exactly. Yes. And to do that, right, we have limited time. So to volunteer both the kind of entire building blocks like you did, and to try and go through all non functional requirements, that would be pretty tough. So what I would recommend is when you have for example a few of the building blocks like user transaction service server, and maybe one of the databases, like three or four of those main components, that's a good stopping point where you can make sure that you're covering some of your non functional requirements.
Nefarious Gargoyle: Got you. So maybe after I draw these diagrams, then I'll just go through each of the functional requirements and see which one I should probably work on more so that it, I guess, fulfills the requirement.
Stochastic Panda: Yeah, I think that's a good strategy. And one reason though, in my case, right, interviewers will differ. I didn't really interrupt you because I liked the flow and I wanted to see more of a test on breadth for this interview. So that's why I didn't stop you and say, hey, tell me more about this. But it was kind of toward the end that I started asking about load and load balancers. In that sense, I think you did well. And that reminds me, actually, when talking about load and load balancers, one thing that I think would get a lot of bonus points for an E5 is if you talk about not only the max, let's say max queries in a day or max storage, but, let's focus on queries briefly, if you could focus on the percentiles to say, okay, in a given hour of, let's say, 09:30 a.m. to 10:30 a.m., I have an idea what my max queries in that hour will be, but let's think about the P99, P95, or P90. Those types of percentile discussions, when talking about load, I think will get you a lot of even more positive signals.
Nefarious Gargoyle: Got you. Just basically, we talk about that maybe in terms of the calculation part.
Stochastic Panda: Exactly. And the way you started, perfectly fine in my view for an E5, is, okay, let's talk about max and, for example, storage. Yeah, allocating some max storage is perfectly fine, but the percentiles become a little bit more critical when we're talking about throughput and how much compute or how many nodes we might need in a given peak hour. Yeah. So that's when percentiles can be good to talk about.
Nefarious Gargoyle: Okay, so basically then from there we can calculate how many instances we might want based on the P99 and then the P90. So we can just talk about scheduling between the P90 and the P99 based on those times.
Stochastic Panda: Exactly. And in my view, I was kind of satisfied with that. I didn't press you on the server. Like, how many nodes do we need in the server again? Because for today I kind of wanted to focus on breadth. But once an interviewer starts saying, hey, tell me more about load or load balancer, then that's a good opportunity to maybe go into those percentiles.
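The percentile-based sizing discussed here can be illustrated with a short calculation. This is only a sketch: the nearest-rank percentile definition, the 30% headroom, the per-node capacity, and the synthetic traffic samples are all assumptions chosen for the example, not numbers from the interview.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample covering pct% of the data."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def nodes_needed(target_qps, per_node_qps, headroom=0.3):
    """Nodes required to serve target_qps with a safety margin (assumed 30%)."""
    return math.ceil(target_qps * (1 + headroom) / per_node_qps)


# Hypothetical per-second request counts observed during the opening hour:
# mostly steady traffic with a handful of large spikes.
opening_hour_qps = [1000] * 90 + [4000] * 9 + [9000]

p90 = percentile(opening_hour_qps, 90)
p99 = percentile(opening_hour_qps, 99)
peak = max(opening_hour_qps)

# Assumed capacity of 2000 requests per second per node.
baseline_nodes = nodes_needed(p90, per_node_qps=2000)
burst_nodes = nodes_needed(p99, per_node_qps=2000)
```

Sizing the always-on fleet near the P90 and letting scheduled or automatic scaling cover the range up to the P99 is one way to frame the trade-off between cost and tail behavior, rather than provisioning everything for the single worst second.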
Nefarious Gargoyle: Sounds good. Sounds good. Yeah. Thank you. Thank you for the feedback. Yeah, I think that's like really good tidbits there. And I had a really good time with these interview questions.
Stochastic Panda: Yeah, definitely. I hope you enjoyed it. And I like this question a lot. It's not too common, I think. And last thing I'll say is, just for general practice, if you look at some of the so called showcase interviews on this website, you might have seen them, but there's a few pretty good ones. There's one guy that really likes the food delivery app question, and that's another good one to practice with, but I think he's asked that question so many times, I wanted to make sure to ask something else. Anyway, if you want to look at a complementary problem, you can look at those as well. Otherwise, it was a real privilege to speak with you today. And thank you.
Nefarious Gargoyle: Yeah, thank you. Have a good day, too. Thank you very much for the interview.
Stochastic Panda: Of course. All right. Thank you. Bye.
