A System Design interview with an Amazon engineer

Watch someone solve the amazon ranking system problem in an interview with an Amazon engineer and see the feedback their interviewer left them. Explore this problem and others in our library of interview replays.

Interview Summary

Problem type

Amazon ranking system

Interview question

1) Design a system to rank products within categories by a rolling average of their purchase count.

Interview Feedback

Feedback about Dialectic Singularity (the interviewee)

Advance this person to the next round?
Thumbs down
How were their technical skills?
How was their problem solving ability?
What about their communication ability?
Thanks for taking the time to interview with me today, We spend the last 10 minutes of the interview talking about feedback and ways to improve, but I want to sum up a bit of what I said to you here 1) make sure to ask more clarifying questions, if you aren't sure on something please ask, even if its not time to ask yet, its very easy to just skip something and then not ask all the way through the interview and the answer they give you could change the design, or worse because you didn't ask they could tell you something specifically that would break the design that you already have going (intentionally) 2) when you draw out your high level design make sure to start with a specific place that the request starts/ends, you started out with a load balancer and didnt specify anything about the origin of the requests. this is in general bad form and you should always start off with the requests origin, This can be risky because the interviewer could always segway that into questions about DNS, but I think it looks good and shows a clearer understanding of the full lifecycle of the flow of information you are modeling 3) Calculate usage!! you totally skipped this part, I know it can feel tedious but its worth it to just spend 2-4 minutes calculating how much usage is going to be used by the service, I feel if you had done this beforehand you would have realized your scaling into a single redis wouldn't have been sufficient for the needs of the reads for 100billion requests per month, (aprox 40k requests/sec of uneven traffic) 4) state assumptions that you are making, I felt that some of the things you thought about during the interview you took for granite and you didnt verbally talk about them (such as how the database was going to store the archived cach in a separate table with your design, I understood and it was implied, but its good to state the things you are assuming). Also its good to state assumptions about traffic patterns and load, (for example a service like the one we talked about would NOT have even traffic throughout the day, it would have serious peaks and vallys Overall you were really close, you did a good job at talking and saying what you are thinking, that is a really great trait! keep it up! Best of luck with your interview tomorrow!!

Feedback about Fabled Goblin (the interviewer)

Would you want to work with this person?
Thumbs up
How excited would you be to work with them?
How good were the questions?
How helpful was your interviewer in guiding you to the solution(s)?

Interview Transcript

Fabled Goblin: Hello.
Dialectic Singularity: Hey, how's it going?
Fabled Goblin: I'm doing well, yourself?
Dialectic Singularity: Pretty good!
Fabled Goblin: Awesome. Well, welcome. Welcome to the interview. So before we get started, when I went to go check on information about you and what this interview was for, it just says, you want to practice high level design, and it says that it has no other information. So before we start, could you tell me a little bit about yourself maybe what experience you have, what you're studying for, so I can better plan this accordingly to help you meet your goals.
Dialectic Singularity: Yeah. Yeah, of course. Thanks for I guess, like doing that. So I've been a developer in the Bay Area for about four years. And I guess I'll just kind of tell you the companies. So for the first year and a half, I was a big data engineer at Salesforce. And there, I did, you know, kinda like Hadoop, Spark stuff, and some database tunings. And for the past two and a half years, I've worked for this startup that worked in real estate. And that was a lot more of vertically, my work is a lot more like vertically integrated. For the entirety the time there, I worked on the pricing team. And the majority of the time there, I worked on the resale pricing team. And there, a couple of projects where, I have some more like technical and projects that are more product base. So a technical project I did was redesigning our, like unit economics pipeline to be more accurate for the homes that we have in our inventory. And a more current product base work I did was we making, basically making the entire process by which we price homes for resale? So originally, it was very, I guess, I would say organically grown and and so it got inefficient. And then we decided to read through it with like, you know, models. Yeah, so that's some like, history. And I guess in terms of like, you know, what I'm, like looking for... So I have an interview tomorrow with Stripe, and I'm interviewing for a back end, I guess, like financial data API role. So yeah, I guess a something with back end. And you know, maybe APIs would be good for this.
Fabled Goblin: Okay. And, um, you I saw you switch to Python right off the bat. Generally, with system design, it's a little bit less writing code so much as more high level design, do you want to be more specific into API writing and like the nitty gritty of it, or do you want something that's more high level system design? Like when I would give someone when they're interviewing with me?
Dialectic Singularity: Let's do a high level system design. Yeah, I think I just switched to Python just by habit. But text is probably better for this.
Fabled Goblin: But I just want to make sure, because if you wanted like a, like an in depth, like design a specific type of thing that would be coded out, I can do that too, but I just wanted to make sure that we're on the same page.
Dialectic Singularity: So yes, you just high level.
Fabled Goblin: Okay. Can I give you a question that I like to ask people at Amazon? I know it's not Stripe, but I'm sure that there will be some similarities. And, yeah, I have a couple things. First of all, would you like to design something that's very, that's going to be very specific to design a thing? Or would you want something that just covers system design in general, that's more of a generic, generic thing to design?
Dialectic Singularity: Um, I think generic, I guess. Yeah. Like, I don't know what you know, the Stripe system designs going to be. I couldn't find any like, well, Mmm hmm.
Fabled Goblin: Okay well I'll give you system design that I would normally give at Amazon.
Dialectic Singularity: Yeah, that's fine. That's very fine.
Fabled Goblin: Okay. I'm sorry. I just don't know what I don't know what they're gonna ask you at Stripe.
Dialectic Singularity: But yeah, well, well, I just like was looking and like to see if I could find like, example problems. And I actually could see a couple, but I mean, those I can look at myself. Yeah, just give me a just, you know, give me something new.
Fabled Goblin: Okay, and if you've heard this before, tell me and I'll give you something else. What I don't want to have happen is you just regurgitate something you already know. Because that's not very helpful.
Dialectic Singularity: Yeah, yeah, for sure.
Fabled Goblin: Okay, so on Amazon, when you're browsing the website amazon.com. And you look at products, you see the products are ranked within categories. So if you're looking at like phones, you can type in like iPhone six and iPhone six has some sort of rank within cell phones and then some other rank within electronics. And it might be in a few categories. It has different ranks within each category. I would like you to design the sales rank by category feature.
Dialectic Singularity: Find the sales rank by category feature.
Fabled Goblin: Yeah.
Dialectic Singularity: Okay. So I guess my first question is like, what's the scope of this? By which I mean, you know, there's obviously a frontend component where you have to, like, I guess, like, look at it and also a backend components, where like, you know, ranking mechanisms.
Fabled Goblin: So let's assume that you don't care about the frontend, some other frontend UI person is going to deal with all of that, your service just keeps track of where products are ranked within a specific category. So let's say that you are given, let's say, Amazon, hypothetically has maybe like 10 million products, and 1000 categories, but they can be a many to many relationships. So each product can have multiple categories, and each category can have multiple products.
Dialectic Singularity: Right. Got it. Amazon has 10 million products and 1000 categories roughly. Okay, cool. Um, let's see how...
Fabled Goblin: Since you're only dealing with the back end, I don't want you to think about any front end stuff, just assume that when the page loads, it sends a request to your service to get the rank of products. So if someone's looking at a specific category, they should be able to get the product from the category, if someone's looking at a product, they should be able to get the category from the product, they should be able to find the ranks of that in any combination. If someone buys something, you can assume that it recalculates its rank, and rank is a rolling average, it's not going to be something that is set in stone. So for example, let's say that the reason the rank of the rolling average would be because the iPhone seven was really, really popular a few years ago, but now no one's buying it. So, so yeah, so the thing now that sold really, really well, one time, they're not going to be stuck at the top forever.
Dialectic Singularity: Right. That makes sense. Um, let's see. So you say, when somebody like buys a new thing, we want to update this. So I guess I in terms of latency, we want this to be like, as accurate as possible?
Fabled Goblin: Um, well, I think that having everything instantly updated would not be feasible, just because of the volume that Amazon does. Let's say a billion items are sold every month. Trying to make the numbers easy, like to work with, um, I don't want to give you any weird numbers like, like, see, like 10 million items, 1000 categories, 1 billion transactions, like per month. But results are not to be updated instantly, but they should, everything should be updated a minimum of hourly and popular products, or things that have a lot of volume more frequently than hourly, if you want to do that.
Dialectic Singularity: Okay, everything should be updated hourly, at least. Yeah, since it's hourly, then batch processing is possible. Cool, okay. So I think I understand kind of categories. Or I kind of have everything I need. So you know, very roughly Amazon like 10 million products and like 1000 categories, many to many relationship, we want two endpoints. One you get product, I'd say product name or product ID and return like a JSON for example, of like the top categories. The other ways we send a category and it gets, you know, returns like top hits. And we want this to be updated in roughly hourly. Cool, and just for, just make sure, so for these products, like on our back end, I'm sure like can I just like assumed there's some like API that I can get to like get new products, and like their information and price. I have two questions, one, how do we find out about like new products? And two, like, how do I get the I guess, like sales information?
Fabled Goblin: Okay, so for the first one, let's assume that there is some other API whenever a new product is added, its categories are selected, and it will just be sent to you, you have some sort of API that just calls you and says, hey, this product is being added with these categories. So you can either add a product or update a product with that API. And sales are generated whenever there's a transaction that happens. It's sent to you through some sort of sales API, and it says there is a transaction with x product ID and x x number of units sold.
Dialectic Singularity: So under this transaction that happens, it send me the information.
Fabled Goblin: Yeah, it's something we do. And you can take exactly what's going to be in that transaction data that gets sent to you what you need, but it should cover everything you need. So if you want maybe you can make a list of the things that you would need within that transaction and to make sure.
Dialectic Singularity: Yeah got it, that makes sense. Let's see. Okay, cool.
Fabled Goblin: The very final thing is, you should assume that people purchase only about 1% of things that they view.
Dialectic Singularity: Yeah. Okay.
Fabled Goblin: Once again, trying to keep the numbers nice.
Dialectic Singularity: Like, and I guess the final thing is, what was I going to ask... Um, so popularity is literally how many they buy, right?
Fabled Goblin: Yes.
Dialectic Singularity: They bought or they bought within the past x days. Cool. Okay. Yeah, that makes sense. Um, so let's start, you know, thinking of it from the, like, final end user perspective, I guess. So, you know, ideally, they want these two API's. Let me just draw some things. So here, I guess it's like a fine, oh, well, I guess we can start with a load balancer. And say, this is like the load balancing that like, hits from the front end. And then behind this, we can put a server that will return the response. And then from the response itself, since we have, let's say, 10 million products, and 1000 categories, that's not too many. So we can probably, you know, put a cache that, like returns our API requests, and does make things a lot faster. So we put cache here, and we can probably use a, you know, key value store like redis, since it's not too complicated. Redis cache. So, you know, the, I guess the question is, like, how does this like Redis cache, get, like populated. And so in our very, very backend, there's this database, right? And this database stores all our, but you know, stores like, information about the products, like, you know, which categories they are, there's also things like transaction history, and product info. And then from this, we can actually find out, you know, what's the most popular things are? So of course, given that, uh, you know, like, actually querying this database would kind of take a while. So we need, you know, some sort of like, batch job.
Fabled Goblin: Okay, um, what kind of database would you choose?
Dialectic Singularity: Yeah, um, so, I guess, let's see, um, so for something like the product information, I think you should definitely have a relational database. And, you know, the product information isn't very, there's not like a ton of rows there. So like, scaling won't be an issue. For the actual transaction history... I, let's see, like, that's a lot of rows. And and so I'm not sure a relational database makes sense. That... so for that, something like a row based database might make sense. So all depends on the... it all depends on the access pattern, right? So in this instance, if we're using this database exclusively for this batch job, basically, we only need to get the rows within the past, like, whatever, I guess, like x days, and given that, that is fully ordered, or pretty close to fully ordered by the, I guess, like order we received them, like a kind of row base time series database, or a column based like time series database, like Cassandra or Redshift might make sense. So I guess, there, it might make sense to actually have two databases, one, like Redshift, kind of time series database for the transactions, and then a second, I guess, like a relational database for the records. The kind of issue with that is then like, you can't really join them. And then so you, you might have, like, need to do two things. Like one is a group by of your transaction history gets some numbers, and then like, secondly, you then like query your relational database to actually like create your ranking, but I think that's fine. So yeah, so let's say here, this one can be like we call it like relational DB. And something I think Postgres is fine for this product info. And then this is I can't, I don't know how it's gonna be on this. So we'll call this the time series db. And this has a transaction history. Right.
Fabled Goblin: I can't see what you're typing until you already until you finalize the text.
Dialectic Singularity: Okay, yeah. So let's see like it, there might be a way to like, only have one DB, because having two is kind of iffy. But you know, like, I think, I think this is like, a nice way to think about it is that you have like, these two types of data, you know, ones like time series base and one more relational. And then it might make sense to have two databases.
Fabled Goblin: Alright, so you have two databases, that you've just described. Let's say that your one service that is being called by... I don't really see where your server is, you have a load balancer response, kind of a Redis Cache? Where does your server itself sit in here?
Dialectic Singularity: Oh. Yeah, that's response. Okay, this is their response, it's your server?
Fabled Goblin: Yes.
Dialectic Singularity: I just want to make sure, some people might say it something else after that. So then the response is going to go? How is your response connected to the Redis Cache versus the batch jobs of the batch? The response always query to the Redis Cache? And then the batch job just updates the cache periodically?
Fabled Goblin: Yeah.
Dialectic Singularity: Okay, um.
Fabled Goblin: Actually I guess, so you know, caches can go down. So it might, it definitely makes sense to also, we can put this in a relational db. The batch job should probably update the Redis Cache, but then also, let's store like, the information to, I guess, like rankings here. We should store ranking somewhere in some db, just so that if the cache goes down, we don't have to rerun the batch job, we can just get it from the database. Yeah.
Dialectic Singularity: Okay, so do you have a separate service, then that's calculating the rankings? Or is your server response also calculating the rankings from the batch job?
Fabled Goblin: No, the batch job runs every, I guess, like 30 minutes or hourly to calculate the rankings. And that updates the Redis Cache.
Dialectic Singularity: Okay, so the batch updates the Redis Cache and gets them. Okay, that makes sense.
Fabled Goblin: Yeah. Cool. So we... let's talk about I guess, like, how we actually, like, update the transaction history. So yeah, so to your point, I think you said, you know, there's like, there's like, you know, as it happens, it sends something to me. Um, so I think updating the database every time that this is like kind of dicey. Just because the database might be down, you might not need it all the time. So one thing you can do is have a queue. And also because you don't need like new transactions immediately, because your batch job only runs over, like, whatever, like 30 minutes. So you can have a queue. I think Kinesis or Kafka is a popular one. But the basic premise is that every time there's a sale, this, you know, you write to this, and then yeah, so let's see. Yeah, that's, that's also just how it goes. Um, I'm just thinking about, like, whether or not we want to have a, I guess, like, not just the queue, but basically, something that says, like, Hey, your message has been received by the queue. I don't know if that like comes out of the box with Kafka or Kinesis. In the sense that, you know, like, that's the thing that's nice about queue is you can like, send it and like, forget it. But it would be nice if, like, if you like send it and the queues down, the sender gets a response that's like, Hey, this is down. Um, let's, let's come back to that later. Because it's not like it might not be necessarily. Yeah. But you know, that's cool. So we have a queue. And then separately, we have like a, we'll call this like ETL batch. Or just a yeah, ETL batch job. And what this ETL batch job does is it'll take information from this queue. And let's say update the transaction history. Yeah. And then there's this other part, that's when finding a product there's some API, that product hit again, just hit for that one.
Dialectic Singularity: So going back for a second, the ETL batch job is... So you're queuing up all these transactions, whenever a transaction happens, it gets thrown into the queue, is ETL, like it basically drained the whole queue and then update the database as a batch job. Is that what you're saying?
Fabled Goblin: Yeah, so it just, uh, yeah, I guess exactly that, it does, like a bulk, write to the transaction history.
Dialectic Singularity: Okay, so that way you don't have to do individual writes, you're doing... occasional writes and have a lot more data. Okay, that makes sense that way we have less writes. Gotcha.
Fabled Goblin: Yeah, cool. So the other part of this is like, how do we find out new products? You know, this is some new API that we can hit. Um, let's see, given that we also might need like another, I guess, like time job to hit this API for new products so that we can update our relational DB, I'm sure. There's like, those are new products. Scheduled job. And this doesn't exist, right, it hits API gets new products. I mean, depending on exactly how the API works, you might have to, like, send it this stuff already have. But this job, you know, it's job, I guess, is to make sure that the relational DB have the info so that it stays up to date too. And I think this is most of it. There is some, like I said, I guess like, specifics you can make about you know, for example, this queue, like we want to like return an error if it's down. Do you want like a wrapper around a, like a, like a server in front of it? Yeah, that kind of stuff.
Dialectic Singularity: What will happen, then if your if your cache disappeared? Let's say that someone didn't know what they're doing. And they broke your cache.
Fabled Goblin: The Redis cache?
Dialectic Singularity: Yeah. They deleted it. Yeah. Yeah. What would happen?
Fabled Goblin: So in in this instance of, so that's why the batch job also wrote the rankings to the relational database. Because if your cache goes down, then you know, one, we don't want to have to run the batch job again. So we can just like repopulate the cache using the rankings database. And two, then obviously, we need like, some way to like, return the information. So it would hit this database.
Dialectic Singularity: Okay, that makes sense. Yeah, then this is pretty good for the high level design. You might, as we were starting that you really want to spend some time talking about the back end stuff and API's and things like, would you like to go from here to going back to the writing part and actually write out your API calls, make a class called, like SalesRanker or something and define your API method calls, and lets you see what you write for your actual method calls?
Fabled Goblin: All right. Cool. So I guess we should switch to Python. Let's see. So just to be clear, we're discussing like, what the API would be for, like from the thing that's hitting the server or like my batch jobs, or like, it's all of it.
Dialectic Singularity: Um, so make a couple classes. Make like a sales ranking, a sales ranker class that would return like, all the calls that you would need to get like a product rank, and the categories and things like that. And then maybe make another class called, like a static class called maybe like, update sales rank or something, and then talk about what the API calls would be to actually update the rankings.
Fabled Goblin: Cool. Okay. So the sales ranker, it needs two methods, right? The first one is a, I guess, get ranking by product. And we can probably use some productId for this. And I would have this return a dictionary. And the other ones get ranking by... Get Products by a category. It ranks by category. This is a category ID and we'll have this return a list i think is fine. Right? Yeah. And then update sales ranker. So the thing about like sales ranker is that in my implementation, it just tries to reach my cache. And like, if that fails, it reaches with db. So update sales ranker with, uh, I guess I update the cache, I guess. So we can call this like, batch job. Much more accurate. Yeah.
Dialectic Singularity: No, that's a good point, it is a batch job.
Fabled Goblin: Yeah. Um, I would likely just have, like one method just to do both. It's like an update. I guess I can run? And what would this take? So, let's see. So probably what I would have. So I guess like, you know, like, I wouldn't... Like I obviously, I probably wouldn't pass like all this information, like through the class, I guess I probably have something like, like the redshift uri? And then we'll say like, I was like transaction uri. And then like, relational dB, or product info db uri. Probably some optional parameters, like, time to look back. That's default is like, 180. I guess days. Yeah. Yeah. And this should be like... And then the other thing I would have is, what was the other thing I guess, like number of items to rank? Default of 100 or whatever? Um, yeah, I think that's good. I guess another thing to consider is, like, when you init this, you probably have to put in some sort of like resource partitioning. It's like num workers. The other thing is, I guess, like, cadence. Start Time. Right. Yeah. The time. And then time between runs. This is a delta. Yeah. Yeah. So this is roughly the, I guess, honestly, like, the uri and stuff should probably go in here to, honestly, probably all of this just go into the init. But yeah, you basically need to, like send your config for this job. And I guess the other thing you need is a like a Redis uri.
Dialectic Singularity: That makes sense. So when you are calling, let's say that I'm looking at my sales ranker, and I want to find the top 10 items that are in the category of cell phone. How would I do that? Category is probably gonna be huge right. If it's got, let's say it could have over a million items in it. Do I really want to get the whole list?
Fabled Goblin: Yeah, yeah. Um, I guess like in this like, assumption I, like I like assumed that in this like number of items to rank, like I put 100 and inside just store 100. But you probably are right that it... Okay, so there's like a couple of ways of doing this, I guess. So one is to just store like, everything in like, just have a giant list in your cache. And then you know, that way, you can, you know, when you call get product by category, you can like, you know, query the cache and then like, just said, I get the first like, 100 or so. the weakness of that obviously, is like if you're only going to get 100 and you have like a million items, then that's like, no good. The other way to do it, I guess is like in your update, sales rank or batch job itself, determine just like how many of these you want to rank. And I guess in this instance, there's no law that says, you know, like, your ranking that you do has to be the same results that you put in your cache. Like, you know, for example, maybe for like historical reasons, you want to, like know the full rank. But to your point, maybe in your get product by category, when you return it, right, you don't like want to, like, hit everything. So I think the way I would probably do this, is have this thing called, like, you know, numbers of items to I guess, like, rank in cash. And then I have the store number be stored 100. And I think it makes sense to have this, be the, like, I guess the, the way you limit the number of results is just like what's in the cache, because, you know, like, the other way where you like, store everything, and then just like, return the hundred first foremost thing, it's like, return everything from the cache. I don't know, if there's a way that you can just like, you know, only query the first, like, 100 results in the list or whatever in the key value store. I guess one thing we could do, if we think like, this number of like, return to want to get this kind of dynamic, is to store it in chunks, by which I mean, you know, we can have a separate key, so let's do this, like, you know, Redis key. It's like a Redis class. So we could have, like, you know, a key equal to like, I guess, like, product ID. And then I don't know, like, num start. And then like, rank and right. So rank, right? Because, you know, for something like this. So for example, if you want, like, you know, and then we we store this my chunks of like 100. And so the basic idea behind this is this, obviously, gives you a lot more granularity. I don't know exactly how the Amazon kind of get products by category works. But to your point, if there's like a million, maybe there's just someone who's like, really gung ho about like seeing all of them, maybe if they, like, just keep pressing next, then like, they can get them. So in this instance, if we do that, then yeah, and by the category Id, you can also get like, you know, rank start. And here we go rank start. Do we need that for product id? I guess, you don't need to be like, rank end, I guess, would be this, this is kind of roughly like, how popular it should be before you stop getting it. So in this case, you know, if one category ID, we can do pagination, where you like do in chunks of 100. And like, if someone really wants to get like, a lot of them, you would, like just keep paginating. In this instance, right, you wouldn't be able to get like, you know, like a million at once. But you only return like, you know, the kind of key rank, but perhaps getting a million at once is a not a necessary use case to support?
Dialectic Singularity: Yeah, cool. I think a million at once might be kind of a kind of an edge case, especially when you're looking at items, you generally don't see more than one or maybe a couple hundred items on a page. So needing a million would be expensive and probably only used for data analytics.
Fabled Goblin: Yeah. So yeah, I think that does get flagged though, that on the product category, we should like, you know, have some sort of pagination. And the second thing is get ranking by product. So I think the equivalent for that is like if you get a ranking by product, like how popular it is, I guess until you just like, I guess don't show it. I don't know exactly how Amazon does this. And you know, I don't know if maybe like more popular products, or more popular categories you want like a higher I guess like absolute number of like ranks, like using 1000th phone is a lot more impressive than being the 1000th sprinkler? I don't know. Um, but yeah, uh, yeah. So but for this one, this is, uh, you know, slightly harder to paginate I guess. Because you're in the like, you know, this would be like category key, because you'd need like product key to go to something similar, but you have like, you need like basic repeat information because you only hit it once. And until you like, want to rank into like 10,000. You were like everything before. So I don't think... I mean, I don't think there's much to be gained by having like a dynamic, like rank in for products within categories. What do you think?
Dialectic Singularity: I think that you're right, there's not a ton to begin with that. The biggest issue that I had originally was the block pulling a huge amount of data one, so I'm glad that you addressed that. Yeah. Um, but yeah, in terms of the other one, I don't think it does. What would happen if someone wanted to add an item that already had some sort of ranking information? Do you think that would be something that you'd want to support? Let's say that you were selling an item through some sort of third party store, and you wanted to also sell it on Amazon? And I, as a vendor said, Well, look, I have all this transaction data from when I sold this item for the past year. Can you use that to preset my rank?
Fabled Goblin: Right. Um, so the way that would be supported now is, you know, you would put your information in your queue, and then your ETL batch job with them, like write it to your transaction database, like... So I'm assuming like, you know, your product info is the same. And we just have this like, new data that we have to incorporate into the transaction history. So in that instance, you're, like, ETL job from your queue would write to transaction history. And it is a time series database. So when you like, write that in bulk, you wouldn't have to resort it. But you know, since we write it in bulk, that should be okay.
Dialectic Singularity: All right. See, so then you, you would just rewrite it, and then you recalculate it from the data that got brought in, and then it would just have a rank the next time you updated the cache?
Fabled Goblin: Yeah, I'm definitely more for item potency than updating.
Dialectic Singularity: Yeah, I think that is a good level of important point, whatever it is, you have to do. So I think that's the better one. Okay, so we're starting to look pretty good. Let's see. So I'm trying to think of like a... we have about 15 more minutes, generally, of the of the interview, so I'm trying to think of... Now let's say you've got this service, it's working pretty well. But all of a sudden, Amazon decides to start selling... Amazon buys Alibaba. And now there's, like 30 times the number of categories. And now things are selling 10 million products, or 100 million products. And now it's a hugely huger. How would you get this to be bigger? So now you no longer, let's just say it's big enough now that you can't fit everything inside of a Redis Cache.
Fabled Goblin: Right.
Dialectic Singularity: How would you go about re going through design to remove the Redis Cache, but still have some sort of caching for fast utilization, but now knowing it's too big to store everything in a single cache?
Fabled Goblin: Yeah, um, so. So it's too big for one Redis Cache, right? Can be can be like distribute our Redis. Can we get like more caches? And then I guess, like, partitioned them based on like, product ID.
Dialectic Singularity: I'm sure, I suppose you could get multiple Redis caches and partition them based off of product ID so like, what would you do? How would you do that? How would you partition them based on the product ID. How do you calculate the sharing?
Fabled Goblin: Yeah, yeah. So right. So this has two keys as category key and product key. And then so the, I guess, like, most obvious way is just like a hash of like category key. So I guess. So this is, let me let me like not make suggestions until I can understand the situation better. So I guess like with sharding strategies, I guess like the thing you should always keep in mind is, uh, I guess like, one, you know, like, does it, is it like effective in the sense that, you know, do you ever,like... because sharding doesn't do anything if you have to, like get everything. So you know is your like, do you have some sort of like natural partition key to shard it by, let me write this down. Like for sharding natural partition key. And the other thing to consider for sharding is, it was the other thing to consider is, whether or not your load is, I guess, evenly distributed. You know, always I mean, you don't want one to be like, completely dead, and then one to be hit all the time. So, I think for the first condition, we're pretty good. You know, given that we, each time we hit it, we just need either one category ID, you know, and I get to rank start, or one product ID it's like very natural to shard it, you know, using, let's say, the hash of like, the product ID and the category ID. And you know, even if you go round robin from that, you probably like, it's unlikely you like, ever, like, your function code never need to hit more than one of these at once. So that's fine. The second part of sharding, though, is that, you know, whether or not you'd load evenly distributed, and this condition gets a lot dicier here. Because if you write just once again, you know, take hash, modulo 10, then there's absolutely no guarantee that, that that will have equal load. Um, so I guess one thing, one way around that, I guess so I haven't worked too much in sharding. That is I can just come to indecision. Let me say one way around, it is like, you could try being a bit smarter about your partitioning by so either like.... Yeah, so you have to do just like I think up front? Well, I usually don't have to do it upfront, you might have to do it dynamically. So I guess like, the first time you do it, you know, you were like he had to be like kind of dumb. But I guess like the second time you do it, you know. So what would be nice would be an ideal solution?
Dialectic Singularity: An ideal solution would be something that would, they would break it up with individual products, individual products IDs, and then dynamically placed the product ID into the individual instances that were less full. Yeah, so.
Fabled Goblin: So just to be clear, like, this is a question I was asking myself.
Dialectic Singularity: I was like, Oh, I guess?
Fabled Goblin: Yeah, no, yeah, no worries. But yeah, you're right. So what would be ideal is if there was some, right dynamic allocation where, you know, like, the code just did it. So um, so what we're actually looking for is the number of like, like queries, this gets hit, right? Like the actual number of products is relevant. And even the number of actual sold is mostly relevant. So that might be a good proxy. What we actually care about is how often this API gets hit with this category ID and product ID. So I think the ideal solution would be if we could somehow, you know, have a... I guess like updating numbers for how often this API gets hit with a various product ID, category ID. And we could use that... then we could use that count to I guess, like, I don't know, dynamics to work, but yeah, I guess like, you know, I say snake order. I put into Redis by a product and you know, category. And that obviously, wouldn't be like, you know, 100% good, but it's probably pretty good. Um, yeah, so I guess the question is, right, how do we get this API with various like product category IDs? Um, so I guess like, you know, one way is, you know, given we have this, like, transaction history of like, we could use a method similar to our transaction history. The only thing about this like that to your point, you like call, like, people look at stuff a lot more than they buy it. And then and then so like, that seems like a lot of work just to like dynamically update this Redis Cache. Yeah, so I don't know, I'd be kind of using like a, like a hammer to kill a fly kind of thing. Um, is there isn't another way we could like, roughly know, how to, like, allocate this without, without having to count. Um, is this like, a? Yeah, it's just like, yeah, it just like number of like views of this system that like, we could like, know, without having to like, add this other thing? Like, basically, like, we already have, like, Splunk logs, I can like, we can get this from.
Dialectic Singularity: Sure. Yeah, you could, if you wanted to have Splunk logs, you can have Splunk logs of every single time that it was that it was viewed. But that would be creating a huge amount of additional storage and logs that I don't think is necessarily required. And also Splunk it's very expensive when you store lots of things.
Fabled Goblin: Yeah, um, let's see. Yes, that doesn't that doesn't sound great. Um, is there like a rough relationship, I guess, between like, How often? Is there a rough relationship between how often stuff is bought? And how often? Yeah, versus how often it's looked at?
Dialectic Singularity: Right, yeah, we talked about that earlier, like,half an hour ago, and I said, it's about a one to 100 relationship.
Fabled Goblin: And I guess, like, my main question, though, is this like, a, like it's a one to 100 for every product? Or is it like, are some products like a million to one and some, like one to one?
Dialectic Singularity: So it's not evenly distributed, but overall, is that ratio. Yeah, cuz if there is a million dollar yacht on Amazon clearly is going to be looked at many, many times. And it's only going to be bought once. But if someone is selling on a printer paper is probably going to be bought more often than more often than one to one hundred because no one browses printer related, people browse, you know, million dollar electronics?
Fabled Goblin: Yeah. Yeah, that makes sense. Um, so I guess we already have a table of how often stuff is bought. And, you know, let's say like, like looks to, like buy ratio is relatively static, or product. You know, if that's true, then given we already know, how often stuff is bought from our batch job. And we can take that, and, you know, multiply by our looks to buy ratio, to get a rough estimate. Of looks right?
Dialectic Singularity: Yeah, I think that's a thing you can do.
Fabled Goblin: Yeah. So, you know, this will be much less computationally intensive. And then using disk, we could use this as like, I guess, a rough heuristic for how to divvy up our data among the Redis caches. And then, you know, like, obviously, from that we can still we can, we can like, look at like load as time goes on. And if we see there's nothing wrong with this. But as like a like v zero way to like dynamically allocated, I think this seems okay.
Dialectic Singularity: I think that that seems not too bad. Also, right now, it looks like we're about to approach the 50 minute mark. I like to save the last 10 minutes of these interviews or, I guess, five to 10 minutes that we have left for feedback both for you and me. So if we jumped into the kind of the feedback stage of this, where we talk about how you did and things you could improve on things you did well, that kind of stuff.
Fabled Goblin: Yeah, let's let's do that.
Dialectic Singularity: Okay, so the very first thing that I look for when I'm doing a system design and I asked to make sure you understand the question is clarifying questions, you asked a few clarifying questions. But there were some items that you didn't quite ask. And since you're a little bit stumped I told you about, like the ratio of reads to writes, and things like that. And the size of the data.
Fabled Goblin: Got it, yeah.
Dialectic Singularity: And it's really, really important when you're doing one of these interviews, to just ask a lot of questions in the beginning, because in system design, you've mentioned that you've done some design in the professional workplace, well, if you're designing anything, you're going to have like a five hour conversation with product managers about every single aspect of the design. So it's really important to ask lots of questions and make sure that there's no miscommunications or missed assumptions.
Fabled Goblin: Right.
Dialectic Singularity: The next thing is use cases, we talked about use cases pretty much as a kind of how the service calculates it, the user views, should be able to view categories or products. So that was pretty good. We also didn't quite talk about scope that much, I told you that I don't want to talk about the ecommerce or the frontend side. But what I was trying to hint at but it probably got missed, and this is my bad, is I wanted to start from a specific location when you're doing your high level design. So I like to scope to like start and end with a concrete location, when you were starting your high level design, you actually started off like as your kind of as your starting point with a load balancer. And I... it kind of caught me off guard, I haven't seen anyone do that. Generally, what I recommend for starting point, because you don't know where that came from with a load balancer. And the load balancers are generally something that is just kind of thrown in as like a middle step to prevent overloading. So I think it's, this is kind of a little thing, but it's something that I notice every single time, is you have to start at a concrete spot. So you could start as like the client making a call to you, or start with the client or start with the frontend or start with start with something that's like a specific concrete location, because a load bouncer is inherently a middleman.
Fabled Goblin: Okay.
Dialectic Singularity: It doesn't generate a new request. A load balancer is, you create a request and you send it to the load balancer to go to where it needs to go. So let's start with where the request is actually being generated. I think that's really important.
Fabled Goblin: Yeah, that's a good idea. So in this instance, specifically, I guess, I could just draw like now at the very start with a client, you know, I guess it's like the frontend and just like makes a request. And then the next step is the request goes to a load balancer and goes to a server, etc. Cool.
Dialectic Singularity: Um, the other thing is draw arrows or lines connecting things. When you were talking about it, I was able to follow it reasonably well. But without that, I mean, you just have nine boxes, or eight boxes?
Fabled Goblin: Yeah.
Dialectic Singularity: I can count. Without arrows connecting them and talking about their relationship or how they interact, looking at this without the direct contacts of talking to you makes it really difficult to follow your high level design. I understand not having labels and stuff. But it just, I think it's important to have direction.
Fabled Goblin: Yeah, yeah, for sure.
Dialectic Singularity: The next thing that I would say that you didn't really do is calculate usage. We just kind of looked at how much the requests are, how many categories and items and all that will just fit in Redis. But we didn't actually calculate, like, the usage of like how big of the transaction was being sent to your database, like, how much data is being sent to your servers every month? For example, let's say, transaction is 40 bytes, and you have 40 billion bytes. And that's 40 gigabytes transferred to your service every single month, but like how much you're going to store, that leads to sharding, and lots of other things. And a lot of the design comes into how much information is actually coming through. I think it's really important to take that extra minute or two to talk about what the usage is going to be. Yeah. And then that way, you can also figure out bandwidth. A single instance of Redis may be really, really fast. But if you are having a whole ton of concurrent connections trying to read from it, it's not going to work. Right?
Fabled Goblin: Yeah, that's a super big flag. Thanks.
Dialectic Singularity: The other thing is, I thought it was interesting how you were storing all of your pre processed rankings within your database. Since they're already preprocessed, it seemed weird to store them within that database and make lots of rows for them in like separate tables, wouldn't it have been easier just to store them maybe in an object store? That way you can archive when you put it in s3 or something?
Fabled Goblin: Oh, yeah, I guess, uh, sorry, by object store, you mean just like not a database like s3?
Dialectic Singularity: Yeah, since you've got an already pre processed and everything, you can just export it somewhere that way you have archives of this stuff, in case something got lost, you know, it was more secure than keeping it inside of a NoSQL database where things get changed up.
Fabled Goblin: Yeah.
Dialectic Singularity: It's typically seen in the archives of it stored within the actual database with the raw data. It seemed a little bit off.
Fabled Goblin: Yeah, I think that's good for like, older data. I guess the reason I put it in a database is because, you know, if the cache goes down, if you want, like, you know, fairly quick retrieval, then it's helpful to like, have, I guess, like a backup database. But yeah, that's something true that for like, well, I guess it's also just like, Yeah, but like, for stuff way in the past, you don't need it in this database. You know, even if you like, want to, like, look at later, you can, like do something else with it. But yeah, it's not totally necessary to store like your, like history in a database, especially for like prod queries.
Dialectic Singularity: Okay, cool. So I guess that's basically the the major points of feedback that I wanted to give you. When it came to writing side where you're writing your API calls, I mean, it all looked fine. It looks like you have some experience bringing backend API's from your actual work. I think it shows. If that's not true, and you made it up on the spot, you had me fooled. So that's all you need to do to pass the interview as well.
Fabled Goblin: Sweet. Cool. Yeah. That was a really good point about the, oh, you know, Redis is like 10 million, like, we're fine. I should have done some number crunching like, yeah, what's the TPS, how many queries like, how many like, things of like redis do we need? Yeah, thanks a lot for that. Um, the second thing I want to like, one thing I want to ask about is, so you said in the beginning, um, like, I should have asked some like, clarifying questions. I got, like, read to write and about the data, I guess, like I was gonna do that. So I think if I remember correctly, like the interview kind of went like, okay, so I just start, like asking about scope. And then I guess like, after we kind of asked about scope up like, offered, like Amazon's like, the the scale to me, I guess like my rough order was going to be like, okay, like, make sure I got the use case, like, and then get the scale. So I was going to ask eventually, like, do you think it's like helpful to just like, immediately ask about scale, just so they like, know I will ask about scale?
Dialectic Singularity: Well, I think it's good to ask about scale before you start diving into the high level design. Most design interviews, they expect you to ask about scale, and then do some sort of usage calculation before you go into the high level design. And you were really quick to jump into the drawing page. I think that you could have taken a step back, talked about those things in a little bit more detail so that you'd know them. That way you don't have to go back and ask questions two thirds of the way through the system design, because when you're going when you reach to the halfway point, it's kind of expected that you already have all the data that you need. So when you're going back through and ask them questions about usage and stuff, like well, maybe you should have asked that before you design it. Because maybe there was one thing that you didn't expect. And then your design is broken.
Fabled Goblin: Yeah, yeah, for sure. For sure. Uh, okay, cool. Yeah, I'll definitely be more cognizant about asking about, like, scale and numbers and like, considering that more.
Dialectic Singularity: Yes. Sweet. Yeah, that sounds great.
Fabled Goblin: Cool, I'm glad that you liked it, and then, I know we're just about out of time, but generally, when I finish these interviews off, I'm always trying to improve my interviewing skills and make sure that people have good experiences when they interview. Um, do you have any feedback that would help me become a better interviewer for system design?
Dialectic Singularity: Yeah, um, not really. Like that was actually a really good. You know, like, looked at my like, background and stuff. So you get like, specify this to me. And then like, once that happened, you kind of like, I guess, like, asked me, you know, like, what I'm looking for? And I did that. Uh, no, I honestly don't have any feedback.
Fabled Goblin: Okay. Cool. I'm glad that you've had a good experience and that you got value out of this. I had a good time too. I enjoy asking these questions. I have have a small list of system design questions that I ask people depending on what they're looking for.
Dialectic Singularity: Yeah, let's hope I get this in an interview tomorrow.
Fabled Goblin: If you get this exact question in the interview tomorrow, that would be sweet.
Dialectic Singularity: Yeah, that would be very fortuitous.
Fabled Goblin: All right. Well, I'm going to end the interview now. Thank you for taking the time and best of luck tomorrow.
Dialectic Singularity: Yeah, yeah. Thank you. That's great. All right, have a good one.

We know exactly what to do and say to get the company, title, and salary you want.

Interview prep and job hunting are chaos and pain. We can help. Really.