Interview Transcript
Adequate Goose: We can get started. Would you like to tell me a bit more about your background and what you do?
Aerodynamic Tornado: Yeah, sure. Are you able to hear me still?
Adequate Goose: Yes, I am speaking to you.
Aerodynamic Tornado: Okay. Yeah.
Adequate Goose: If you think it's anonymous, but usually I'm trying to keep it as close to the real life interview as possible. So if you feel like sharing that portion, that's okay. Or if you want to go straight to the system design interview, that's okay too. Your call.
Aerodynamic Tornado: Okay. Yeah, we actually connected last week; I did a system design mentor session with you. So I'm just back for a practice interview this time. I work as an SRE and I'm practicing interviews for big companies. I'm targeting Airbnb. I have an interview with them in a few weeks. So yeah, hoping to get some practice and feedback to be prepared.
Adequate Goose: Okay, awesome. I have a good question for you. All right, so let's get started. I remember we spoke last week. So today we will be designing an RSS newsfeed. I will go over the description, then I will type down the key points, and then we can start from there. We are creating a news platform where users can log in, pick their favorite news sources, and then get a customized newsfeed from those sources. As an example, you can take a look at this link, the RSS feed link, to familiarize yourself with what this looks like. When you click on the link, you'll be able to see what an RSS aggregator actually is, and you will see sections such as Top Stories, World, US and so on. When you click on any of the individual URLs, you will see an XML page that is generated. Over here I just collapsed most of the XML headings, so the essential parts are title, description, image, copyright and the different items. Each item is a news article. So keeping this in mind, we want to do the following. When the user logs in to the app for the first time, they can choose ten news sites that they want to subscribe to. This is called the cold start. Once they select those, they should be able to see the news on their feed. They're also able to edit the sites, that is, delete any of the existing sites that they have and add a new one. But the limit is ten. So the challenge for you is to design a system where this is possible. Think about how we'd like to store the user choices, how we'd like to gather and show news from the RSS feeds, and how we might handle tons of users doing this simultaneously. I think you can correlate these portions to Airbnb, the listings that we have, right? Because this would be very similar to that.
Aerodynamic Tornado: Okay, so just trying to understand the question here. So we are trying to design an RSS feed which is similar to the link that you shared, is that correct?
Adequate Goose: The link that I shared is just to give you an example of what the RSS feed looks like, right? That's an example of what RSS feed is and when you are thinking of designing it, we can discuss right. What kind of application are we trying to design? Is it mobile based, browser based and so on.
Aerodynamic Tornado: Okay, so the so the user so there is an RSS feed URL that the user selects and then they are selecting. So this can be like is this going to a point like on a browser or a mobile app or.
Adequate Goose: Tablet? Yeah, sorry, I didn't clarify that. The reason why I shared that link is for you to understand what the structure of the RSS feed is. Each RSS feed has a topic, it has images, it has a title and different items. So think of that as a model that you need to have, right? That's what you're looking for. But you can think of designing something like Instagram, for example, where you have different people's posts. Instead of posts you can think of it as a simple RSS feed. Flipboard used to be an example, if you have used it: you subscribe to news sources, and when you log on to Flipboard you see different kinds of news based on your interests.
Aerodynamic Tornado: So mostly a mobile based application based on taking the example of an Instagram feed where the user logs in and then they choose who to follow. Or in this case they will select or subscribe to ten different types of news topics.
Adequate Goose: Exactly.
Aerodynamic Tornado: And based on those topics the news feed will be generated and the newsfeed should look like should the newsfeed look like something like Instagram? Or it should look like the RSS reader link that you shared, the CNN link.
Adequate Goose: So that we can discuss we can discuss to see what that needs to look like. So that could be discussion point for us.
Aerodynamic Tornado: Okay. And then user can also edit or delete any subscription. But they need to have at least ten subscriptions. Okay?
Adequate Goose: Yeah. So I can clarify your question about what that link should look like. If it's an application, we can't provide an XML file when they open the application, right? Because that is not human-readable. So we want the finished product, where the news articles could just be headlines. The reason why I shared the RSS link is this: when you click on the Top Stories link, for example, the "Copy URL to RSS reader" link, you will see the different attributes that are there. So we will be rendering attributes, right? We will be rendering each of the items as something. Each item has an associated link, a publishing date and a title. So we will be publishing the actual data on the UI and not the okay.
Aerodynamic Tornado: Okay, I think it makes sense now, the question. Okay, then I can gather some more information about the usage of this application. Right. So you mentioned that it's probably going to be a mobile application. How many users, I guess, are we expecting to use this application?
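As a rough illustration of the item attributes mentioned here (title, link, publishing date), the following sketch pulls them out of an RSS document using only the Python standard library. The sample feed, its URLs and the `parse_items` helper are hypothetical and exist only for this example; a real feed would be fetched over HTTP and may carry extra fields.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed, shaped like a standard RSS 2.0 document.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News - Top Stories</title>
    <description>Hypothetical feed used only for this sketch</description>
    <item>
      <title>First headline</title>
      <link>https://news.example.com/a1</link>
      <pubDate>Mon, 06 Mar 2023 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://news.example.com/a2</link>
      <pubDate>Mon, 06 Mar 2023 12:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return one dict per <item>, keeping only the fields the UI would render."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS dates use the RFC 822 format, which email.utils can parse.
            "published": parsedate_to_datetime(pub) if pub else None,
        })
    return items


if __name__ == "__main__":
    for article in parse_items(SAMPLE_RSS):
        print(article["published"], article["title"], article["link"])
```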
Adequate Goose: I think we can start off with maybe 100,000 users for now and then later on as you progress through the system design, we can talk about scalability.
Aerodynamic Tornado: Okay. And when we are designing the news feed, are we ordering the articles in a certain way? Even within the ten subscriptions, could it be that five are the user's favorites, whereas the other five are something the user is only passively interested in? Is there some sort of ordering that we should consider when designing the news feed?
Adequate Goose: Yeah, that's a great point actually. So we want the latest news to come first. Right. So this can also branch to a topic of its own, which is trending. Right. Because if there's a breaking news, possibly all news channels would be posting about the same things. So we could get to that as well as to what is the content that you want to share. But definitely we want to organize it in a way that the latest news is always on top as a refresh.
Aerodynamic Tornado: Okay, and then is the user also going to be able to share these news articles?
Adequate Goose: Yeah, at this point that's out of scope.
Aerodynamic Tornado: Okay. Are we also going to design like how the user is going to log in, pick up the ten subscription or.
Adequate Goose: Just yeah, I think you don't have to worry about the authentication part of it and the login part of it. I think just assume that things are already done for you and you're either registering or either logging in directly. You don't have to worry about the underlying.
Aerodynamic Tornado: So just focus on building the newsfeed from the newsfeed. From the subscriptions. Correct subscriptions. Okay. So I think regarding to latency in how often we want to update the news feed, is there a number or any ballpark that you had in mind?
Adequate Goose: Yeah, so when a person is already scrolling on the newsfeed, we can discuss what kind of scroll we want to have. There are two different kinds. But whenever the user logs in for the first time, they will not see any of the older articles. They will see completely new articles. As for how often it is going to be refreshed, you can assume that it refreshes every three to five seconds.
Aerodynamic Tornado: Three to five seconds. Okay. So the newsfeed refreshes every three to five seconds. But the first time they're logging in, the newsfeed should be refreshed already. Okay. And is there any leeway in downtime that we should be looking for, basically the availability of the application?
Adequate Goose: Yeah, we definitely want to be highly available in this case. So four nines, 99.99%, highly available. Okay. Between consistency and availability, we will prioritize availability.
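To make the availability target concrete, here is a small worked calculation. It assumes the target meant here is four nines (99.99%); the helper name `downtime_budget_minutes` is hypothetical. The point is only to show how an availability percentage translates into a downtime budget.

```python
def downtime_budget_minutes(availability_pct, period_days):
    """Minutes of allowed downtime in a period for a given availability target."""
    total_minutes = period_days * 24 * 60
    return (1 - availability_pct / 100) * total_minutes


if __name__ == "__main__":
    # Assumed target: 99.99% ("four nines").
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        budget = downtime_budget_minutes(99.99, days)
        print(f"99.99% over a {label}: about {budget:.1f} minutes of downtime")
```

Under that assumption the budget is roughly 4.3 minutes per 30-day month and roughly 52.6 minutes per year, which is why automatic failover matters more than manual recovery at this level.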
Aerodynamic Tornado: Okay. And the scalability will be discussed later. Okay.
Adequate Goose: Yeah. As you progress and we build a skeleton out then we can think about scalability and how do we since specifically it's for SRE position, we can think about performance and what can deteriorate and how do we add metrics and checkpoints and all of that.
Aerodynamic Tornado: Okay. I think this is like a good starting point. Is there anything else that I am missing that I should consider?
Adequate Goose: No, I think you got we can move on to functional requirements.
Aerodynamic Tornado: Okay. So functional requirements wise, I think we want the newsfeed to be generated when the user logs in for the first time. And then after the user has spent some time going through the newsfeed, the newsfeed should refresh every three to five seconds. So the newsfeed should auto refresh, and then so if the user is in.
Adequate Goose: The middle of reading an article, we don't want to refresh it at that point and then make the user lose the article. Right, but if the user is maybe just scrolling.
Aerodynamic Tornado: Yeah, I think in Instagram I know that when you're scrolling, you're scrolling down and then you kind of want to refresh your feed. You scroll all the way up and then you scroll down again and then it has that loading button and then it refreshes the feed for you.
Adequate Goose: Exactly. Push the update.
Aerodynamic Tornado: Or do we want the user pull.
Adequate Goose: The updates so you can think about that?
Aerodynamic Tornado: Okay, so let me just rephrase them. The newsfeed refreshes periodically, and then, like we said, using a push or pull method. And then when the user clicks on a news item it should load the title, the images and other details related to the news. So it should load the news details like text, images, URLs, et cetera. Actually, one more thing I forgot to ask. Is the user going to be able to favorite any news? Is that something we should consider in this design?
Adequate Goose: We can build upon that as well.
Aerodynamic Tornado: Okay.
Adequate Goose: That is out of scope.
Aerodynamic Tornado: Okay, so newsfeed generation, newsfeed refresh and then clicking on a news article and then think the non functional requirements is like we want this system to be highly available.
Adequate Goose: Yeah, I think you've covered the non.
Aerodynamic Tornado: Quite okay, so I think I'll just use that. Okay, yeah. So then we can dig into the API design for this application. I'm thinking that given that we already have the user information and their subscription information, so let's say a subscription ID, we need to build a news feed based on that. So the first thing I'm thinking is we need to use the subscription ID and get the news articles related to all the subscriptions. So we probably need something sorry, by.
Adequate Goose: Subscription ID you mean like the individual news site? Like CNN will be one subscription ID, BBC will be a subscription ID. Is that what you mean? What is the subscription ID?
Aerodynamic Tornado: Okay, so the subscription ID I was thinking is like the topics that they are subscribed have subscribed the ten topics they subscribe to. So it can be like whatever, top news, sports, et cetera, et cetera, those kind of topics. And that will probably be like a JSON which will have the subscription ID is what I was thinking.
Adequate Goose: Okay, I think the context of this question is that we won't be subscribing to individual topics, because that could be the next step. For now we would be subscribing to ten news channels, right? If we choose topics then the scope of the question changes drastically, because then we have to get all the news channels across the whole world for a particular topic. If someone chooses sports, then we are not limiting it to any particular news source, we're just choosing sports. So now that becomes a second layer of the question: if someone chooses a topic, what will be the source of that? That expands the question into a very different problem. We can assume that the user can only choose the news site, and whatever that news site is publishing, the user is able to see. There is no personalization at this stage, but we could add personalization in the next stage. Once you have your block diagram or something, then we can start thinking about different directions.
Aerodynamic Tornado: Okay, yeah, thank you for clarifying that. So then in this case that I'm thinking the subscription ID would still be like the topics within that news channel. So like in this example in CNN I see there are the top news sorry, top stories, World, US.
Adequate Goose: Et cetera.
Aerodynamic Tornado: So whatever the user is, the ten topics that they are selecting at the time of registering to this application, it's saving them as something like a JSON which will have these ten items and then based on that we can get the articles from that. So once we have the topics, I guess within for the user, then all we need to do is let's say.
Adequate Goose: Get.
Aerodynamic Tornado: News, get news articles, for a user and for a given subscribed topic, and a timestamp I think, because we want to get the latest news at the top. It's always good to have the latest articles when the user is requesting or when we are generating the feed. So this should get the news articles from the RSS URL. And then once we have that, we need to render the news. So this will return the RSS URL. Actually, before we render, I think we need to order the news. So order the news articles. This will be parsing the URLs and getting the news in timestamp order, and this will return all the news in order. And then once we have the news in order, we can render the news feed, which will take all our news in order. And this will be what the user actually sees on their feed.
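The "order the news articles" step described here could be sketched as a merge of per-source article lists by timestamp, newest first. This is only one possible reading; the function name `order_news_articles`, the `published` field and the `limit` parameter are assumptions made for the example.

```python
import heapq
from datetime import datetime, timezone
from itertools import islice


def order_news_articles(feeds, limit=20):
    """Merge several per-source article lists into one list, newest first.

    `feeds` is a list of lists of dicts; each dict needs a `published` datetime.
    """
    # Sort each source newest-first so heapq.merge can combine them lazily.
    newest_first = [
        sorted(feed, key=lambda a: a["published"], reverse=True) for feed in feeds
    ]
    merged = heapq.merge(*newest_first, key=lambda a: a["published"], reverse=True)
    # Only materialize the first page of the feed.
    return list(islice(merged, limit))


if __name__ == "__main__":
    def at(hour):
        return datetime(2023, 3, 6, hour, tzinfo=timezone.utc)

    source_a = [{"title": "a-morning", "published": at(9)},
                {"title": "a-noon", "published": at(12)}]
    source_b = [{"title": "b-mid-morning", "published": at(10)}]
    for article in order_news_articles([source_a, source_b]):
        print(article["published"].isoformat(), article["title"])
```

Taking only the first `limit` items keeps the work proportional to one page of the feed rather than to every article from all ten sources.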
Adequate Goose: Okay. Render underscore news feed. Since we are associating the feed with each user, I think it'd be good to have the user ID.
Aerodynamic Tornado: User ID? Yes. So user ID and news will be rendered. Does this look and then refresh? And then refresh newsfeed. This will basically be like depending on the news model or sorry, the model that we're using, push or pull. This will basically have the user ID and then it will basically do all this process again. Actually, we don't need another API. I think just whenever the request is made, it will just go through these three APIs and render the news again. Because it's going to be on demand.
Adequate Goose: Correct.
Aerodynamic Tornado: Okay.
Adequate Goose: Yeah. Because we can think of a cron job or something which will be running in the background. Right. Like when a person logs in. Because we actually don't want to waste resources by constantly refreshing because we want to wait for the user's action before we refresh particular user ID.
Aerodynamic Tornado: Yeah. Does this look good so far?
Adequate Goose: Yeah. Could you a little bit about.
Aerodynamic Tornado: When.
Adequate Goose: The user is selecting the oh, yeah.
Aerodynamic Tornado: I missed that part. So when the user selects news, we want to render the news article, I guess, like render news article. And this will yeah, we will still want the user ID and then the news article, I think news article, URL or ID. And this will be like the text, the images and what else is there? Oh, sorry, the media content, the link and the title. So media content, link, title. And those are like the most of the data that I see in the RSS.
Adequate Goose: So wouldn't this, the render news article, look very similar to the get news article? Can we not add the article properties inside the get news article itself? Because for the get news article, for a particular user and a subscribed topic at a time, you are returning the RSS URL. The question that I mean to ask is about the render news article. You're providing the response essentially inside that function, whereas things like these would be the response for a particular function. Right, okay. When you call get news article you're going to pass in the logged-in user ID, a subscribed topic and a timestamp. So this would be the request. Your response is probably going to be that user ID and the news article ID.
Aerodynamic Tornado: Okay, I see what you mean. Okay, yeah. I think that makes more sense. Okay, yeah.
Adequate Goose: So perhaps you can think about how will the user select their news feed so we can discuss that.
Aerodynamic Tornado: So when they are selecting the newsfeed, it's loading these.
Adequate Goose: Items on the news channel. I think it'll just be like select news source, and then there will be a user ID and then a list of news sites that they want to select.
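A minimal sketch of the select call described here, taking a user ID and a list of news sites and enforcing the limit of ten from the problem statement. The class `SubscriptionStore`, its method names and the in-memory dict are hypothetical stand-ins; in the design discussed later this data would live in the relational database.

```python
MAX_SOURCES = 10  # limit stated in the problem description


class SubscriptionStore:
    """In-memory stand-in for the table mapping users to their news sites."""

    def __init__(self):
        self._by_user = {}

    def select_news_sources(self, user_id, site_urls):
        """Replace the user's selection with `site_urls` (at most ten sites)."""
        unique = list(dict.fromkeys(site_urls))  # drop duplicates, keep order
        if not unique:
            raise ValueError("at least one news site is required")
        if len(unique) > MAX_SOURCES:
            raise ValueError(f"at most {MAX_SOURCES} news sites are allowed")
        self._by_user[user_id] = unique
        return unique

    def get_news_sources(self, user_id):
        return list(self._by_user.get(user_id, []))


if __name__ == "__main__":
    store = SubscriptionStore()
    store.select_news_sources("user-1", ["https://news.example.com/rss",
                                         "https://other.example.org/rss"])
    print(store.get_news_sources("user-1"))
```

Editing a selection (deleting one site and adding another) can reuse the same call with the updated list, so the limit is checked in one place.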
Aerodynamic Tornado: Okay, okay, yeah, I see what you mean. Okay.
Adequate Goose: Probably be like the step number one and then the other.
Aerodynamic Tornado: Okay, yeah, I think this makes sense. Okay.
Adequate Goose: We can move on to the block diagram.
Aerodynamic Tornado: Yeah, I'm going to toggle the whiteboard. Let's see. Yeah. So this is the user, like the client app. One second. All right. So then the user will send a request, basically through a load balancer that will balance the traffic across the back end. So this will be our main service. I think we will need one service that is collecting the feed and the other service that is.
Adequate Goose: Or.
Aerodynamic Tornado: Pushing the feed or generating the feed and giving it back to the user. So this will be service that only takes the user request.
Adequate Goose: Okay. You had connected the load balancers to the app server. Right. So that sounds like and each app server will probably have its own APIs that you just shared.
Aerodynamic Tornado: Yeah, let me just go back. Okay. Yeah. So let me copy the APIs again in the whiteboard. Oh, okay. Yeah. So yeah, I'm thinking like the get news articles and order news articles will be like one service. Whereas rendering the news articles and the news feed will be like another service. So this is the news feed generator.
Adequate Goose: Actually, if you were to take one step back to see the broader perspective: all these different calls, get news articles, order news articles, render news, select news, are basically API calls. Right. So at the system design level we probably don't need to go into the details of the individual APIs, because I think you are focused on the system design overall. So we can simply turn the boxes next to the load balancer into a service, which will probably have all the other.
Aerodynamic Tornado: So this will be the RSS feed service and, for simplicity, this is serving our APIs for now. And then basically we are also storing the user information and the subscribed topics in a database. In this case, since we have information mostly about the user ID and subscription, I'm going to go with I think a NoSQL, sorry, a relational database makes more sense because currently we are only storing this information. Something like MySQL or Postgres will be good to begin with. And then the RSS feed service is going to generate a news feed back to the client app. So here I'm thinking that the rendering of the news itself, okay, yeah, sorry, before I go into that, this is a high-level diagram. Is there something that you would want to go into before I go into the point that I was trying to get to?
Adequate Goose: Yeah, so I actually wanted to go a little bit deeper into the database. I think here I want to ask you some SRE kind of questions. Say that the user has selected their news sites and the service is doing the writes to the database, and for whatever reason, in the middle of a particular write, the database server goes down. How would you ensure that it is still highly available, and will that write happen when the user is in the middle of the process?
Aerodynamic Tornado: Yeah, usually in a database, especially a relational database, it's good to have a single source where you write and then replicate that to other databases for availability. So in this case I think it will be good to have a master-follower model, where the feed service writes the data to the master and then the data is replicated to the followers. Another model is to have two masters, with one being the primary and the other a cold standby. In that case, if the primary master fails, the secondary master will automatically take up the role of the primary. Usually in a cold standby or master-follower design the data gets replicated within milliseconds, so that will ensure that the data is available. If the database dies in the middle of a write, then we can have retries from the service itself. Usually when a write to the database fails, there is a failure of acknowledgment back to the service, so the service knows that the data was not persisted to the database. And then, whether it's a master-follower or a master-cold-standby setup, the database usually has a CNAME with multiple IP addresses behind it, and there is a health check done. Maybe we can assume some sort of proxy service in between the service and the databases, and that proxy does health checks of the databases on a per-second cadence. If the database fails, the proxy will remove that database's IP address from the CNAME. So the next time the feed service retries, it will go to the new master and persist the data.
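The retry behaviour described here might look roughly like the sketch below. It assumes a hypothetical `resolve_primary` callable standing in for the CNAME or proxy lookup and a hypothetical `write_fn` standing in for the database driver; neither is a real library API. One assumption worth stating loudly: retrying is only safe if the write is idempotent, because an acknowledgment can be lost after the write has already been applied.

```python
import time


class WriteFailed(Exception):
    """Raised when no attempt was acknowledged by a database."""


def write_with_retry(resolve_primary, write_fn, payload,
                     attempts=3, base_delay=0.05, sleep=time.sleep):
    """Try a write, re-resolving the primary before every attempt.

    resolve_primary: hypothetical lookup returning the current primary endpoint.
    write_fn: hypothetical driver call; raises ConnectionError when unacknowledged.
    """
    last_error = None
    for attempt in range(attempts):
        endpoint = resolve_primary()  # may differ after a failover
        try:
            return write_fn(endpoint, payload)
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff gives the health check time to fail over.
                sleep(base_delay * (2 ** attempt))
    raise WriteFailed(f"write not acknowledged after {attempts} attempts") from last_error


if __name__ == "__main__":
    state = {"primary": "db-a"}

    def resolve():
        return state["primary"]

    def write(endpoint, payload):
        if endpoint == "db-a":
            state["primary"] = "db-b"  # simulate the proxy promoting the standby
            raise ConnectionError("db-a is down")
        return f"ack from {endpoint}"

    print(write_with_retry(resolve, write, {"user_id": 1, "sites": ["example"]}))
```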
Adequate Goose: Okay, all that sounds good. So for some follow up questions from that. Right. So you mentioned about the master database. So would the master database contain the values for, in this case, 100,000 users? Would we have one database for that?
Aerodynamic Tornado: Um, I think 100,000 users is reasonable for a single master to handle. But definitely as we scale, if you're talking about millions of users, that needs to be scaled.
Adequate Goose: Could you give me an example of how we could possibly scale this? Say this news feed becomes really popular and we are quickly at 2 million users, right. How would we scale that?
Aerodynamic Tornado: Yeah, there are a few ways to tackle it. I think the first approach that we can take is horizontal sharding. So basically divide the database: if there's a million users, let's say, have half of them in one database and the other half in the second database. So we can split that way. Or we shard the database based on maybe, um, username. But I think that will lead to unequal distribution of data across the databases. Excuse yeah.
Adequate Goose: Another, sorry, no, go ahead, finish your thought and then I'll ask you a question.
Aerodynamic Tornado: Yeah, the other way I was thinking is like, do like yeah have like a pool of databases and they have sort of a hashing algorithm so that when we are registering the users based on the hash, the user is registered to that database. So basically have like a primary key and a hash. And based on the primary key, the hash will be or the user hash will be registered to that database. So something like that.
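The hash-based placement described here can be sketched as follows. The function name `shard_for_user` and the choice of SHA-256 are assumptions for the example; the transcript does not name a specific hash. The sketch uses plain modulo placement, which is simple but remaps most users whenever the number of shards changes; consistent hashing is the usual alternative when shards are added often.

```python
import hashlib
from collections import Counter


def shard_for_user(user_id, num_shards):
    """Map a user ID to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; Python's built-in hash() is not
    # stable across processes, so it is avoided here.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    counts = Counter(shard_for_user(uid, 4) for uid in range(100_000))
    for shard, count in sorted(counts.items()):
        print(f"shard {shard}: {count} users")
```

Hashing the user ID spreads users far more evenly than splitting by username ranges, which addresses the unequal distribution concern raised earlier in the discussion.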
Adequate Goose: Okay, so by using cryptographic hashing, we can select the index database. Okay, that sounds good. So you mentioned about the choice of the database being MySQL. So in MySQL, it is like a text based data, right. We can't store the images and all of that inside MySQL. I guess we can only add links to that image, but we can't store the image or any article has a video or any other data, we can't store that. So would there be a better way of storing that information for the article?
Aerodynamic Tornado: Yeah. So for storing files, sorry, images and videos, it will be better to have some sort of blob storage, something like S3, where we will actually store the files, and the database will have a column that holds the S3 link to that image or that video. That way we are conserving storage, since we are not storing that content in the database.
Adequate Goose: Yeah, that's good. So you also mentioned that the database will probably have some sort of a control panel where they'll keep referencing if they're currently up or if they're currently available and in case of any of the databases goes down, there's all happening in the background and then automatically the secondary database will be referenced. Right, I'm just having to ensure that I got your point. Okay, could you tell me a little bit about, say that there is breaking news. Right. So one particular news channel gets a lot of attention and say that 70% of the people are now suddenly interested on this one particular news. Right. So how will you handle a sudden influx from a particular news site?
Aerodynamic Tornado: Okay, so this news, the interest in a news will be a periodic spike, right. It's not something that is happening continuously.
Adequate Goose: Yeah, it won't be periodic. It'll just be like a spike that comes in resources available and you want to make sure that what would be.
Aerodynamic Tornado: Yeah, I think in a situation like that it will be helpful to have some sort of caching system here, some sort of cache between the client and the RSS feed service, which will basically get updated based on the topics that are trending. That way the users don't have to go through this service every time, but can directly get the data from the cache. So the one will the client the.
Adequate Goose: Cache, or would this cache be better suited between the RSS feed service and the database? Because what kind of cache are we thinking of? Are we thinking cache-aside? Write-through? What kind of cache are we thinking of?
Aerodynamic Tornado: Sorry, please complete your question.
Adequate Goose: Yeah, so for the cache, is this the right place for us to place the cache, directly next to the client app? Because the load balancer is being skipped, right? Once the client is talking to the cache, the load balancer is bypassed. It is there to handle an influx of traffic, and there might be multiple RSS feed services, not just one. So the load balancer is helping with that. Could there be a better place where the cache could be placed?
Aerodynamic Tornado: Yeah, so the cache can be like in between the load balancer and the RSS feed. So let me move this here. Yeah. So basically the idea behind the cache here is it will already have the images and the videos or the news articles loaded that will be quickly rendered on the client newsfeed. So that will reduce the queries that will we don't have to query the database that often and the cache will be updated based on.
Adequate Goose: Basically it will.
Aerodynamic Tornado: Be like, what is it called? Write-ahead, I think. Basically it will be written to the database first and then the cache will be updated. And then as the user requests increase for the same article, the requests will just be served from the cache. And then for the eviction policy in the cache itself, we can use something like least frequently used or least recently used. Based on that, we can evict news articles which are not trending at that moment.
Adequate Goose: Okay, so assume that the client does the get news article call, right? They click on a particular article and now they want to open it up. So now the get news article call is being initiated. When the client app calls that API, it goes from the load balancer to the cache, so the call is being intercepted by the cache instead of reaching the RSS feed service in this case, right? So wouldn't the cache be better placed between the RSS feed service and the database, so that we wait to see what call is coming through? Based on that, we are able to get the value from the cache. At this point, when the client is requesting something from the UI, we are intercepting before the call is made to the database. So at that point, we don't know what API call is actually being used between.
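A minimal sketch of the behaviour described here: write to the database first, then update the cache, serve repeated reads from the cache, and evict the least recently used article when the cache is full. `LRUCache` and `ArticleStore` are hypothetical names, and a plain dict stands in for the database; a production setup would more likely use a dedicated cache server.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used


class ArticleStore:
    """Dict-backed stand-in for the article database with a cache in front."""

    def __init__(self, cache_capacity):
        self.db = {}
        self.cache = LRUCache(cache_capacity)
        self.db_reads = 0  # counts cache misses that reached the database

    def save_article(self, article_id, article):
        self.db[article_id] = article        # database first
        self.cache.put(article_id, article)  # then the cache

    def get_article(self, article_id):
        cached = self.cache.get(article_id)
        if cached is not None:
            return cached
        self.db_reads += 1
        article = self.db.get(article_id)
        if article is not None:
            self.cache.put(article_id, article)
        return article


if __name__ == "__main__":
    store = ArticleStore(cache_capacity=2)
    for i in range(3):
        store.save_article(f"a{i}", {"title": f"Article {i}"})
    store.get_article("a0")  # evicted earlier, so this goes to the database
    store.get_article("a0")  # now served from the cache
    print("database reads:", store.db_reads)
```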
Aerodynamic Tornado: Okay.
Adequate Goose: Because the load balancer, all it is doing is that it is redirecting the client app to different API servers so that it doesn't die. So then all the load balancer is doing it. It is redirecting traffic. At that point, it doesn't know exactly what API call is being called or whatever, right, it is distributing. So at this point, if we are looking into the cache, you will not know what exactly should be fetched from the cache. Right?
Aerodynamic Tornado: So I'm thinking there are like two aspects here. I guess in the case of cache. So the cache in between the RSS feed and the database will be more about getting the article based on the subscribe topic. Whereas the cache here between the RSS feed and the app is going to be more for the videos and the larger files, basically so that we don't have to pull that every time from the storage. Actually, there should be link here too. I guess this cache is more for then the user selects the news, like how that page will look. And then the cache here can be for when the user is calling the Get News articles API, we are getting the subscription or the news that we want to show the or I guess the list of news that the user will see in the newsfeed. Does that make sense?
Adequate Goose: Yes and no. So for the RSS feed service to reference the blob storage directly, right, the database and S3 are connected to each other. For example, take the response for the get news article; I'm looking at the first API. The database will have the news article ID and the media content. The media content could be a reference to the S3 service. So there is no way that the RSS service could directly talk to S3, because it doesn't have a reference to, hey, what is this media content, what ID does this belong to, at that point. So, yeah, I see that you removed the connection between the RSS service and the storage. Yeah, that's the right way. All right, so could you tell me a little bit more about what kind of metrics you think would be super useful for someone from SRE to understand when something is going wrong here, and at what point you would fix it?
Aerodynamic Tornado: Yeah, so we are aiming for a highly available system and also a good user experience, so the latency on the user side should be minimal. Keeping these two in mind, availability wise, we definitely want to monitor the health of our service, the health of our RSS feed service. So here we are looking at the uptime of the service and the host-level metrics like CPU usage, memory usage, I/O usage, those metrics. Okay, let me cover availability first. Between the service and the database, we want to measure the latency. If over time we are seeing that the queries are taking longer to return from the database, that's something that we want to address using the database sharding options that we discussed. Similarly, between the database and the cache, we also want to monitor the performance of the cache, for example whether the cache is returning results in the time that we expect it to. I'm assuming there is some benchmarking that we will be doing, and then we make sure that the cache is following that benchmark when returning requests to the feed service. For the storage itself, we want to monitor how much of the storage we are utilizing. If we are going to reach something like an 85% or 90% limit within the storage, we should think of storage expansion or maybe compression of the images to free up storage space. Then yeah, I think availability wise, this is what I'm thinking. We also make sure that the load balancers are healthy, and the load balancers will also do health checks on the feed service. Latency wise, when the client is making API calls, we want to monitor the mean time to response: are the users experiencing latency in that?
And then, yeah, the news feed itself, like how fast are we generating the news or how fast the news feed is getting updated. Those are some client-side metrics that we can collect and then send back to the service so that we can analyze them and improve the client application itself. And then, so I'm thinking...
Adequate Goose: You already mentioned the disaster recovery aspect, right? You already mentioned that.
Aerodynamic Tornado: Okay, yeah, I think that's all I had.
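[Editor's note: the storage and cache checks described in this answer could be sketched roughly as below. This is only an illustration under stated assumptions. The function names, the "ok"/"warn"/"critical" labels, and the use of a simple mean for the cache benchmark are hypothetical choices; only the 85% and 90% storage figures come from the conversation.]

```python
from statistics import mean


def storage_status(used_bytes: int, total_bytes: int,
                   warn_ratio: float = 0.85,
                   critical_ratio: float = 0.90) -> str:
    """Classify storage utilization against the 85% / 90% limits mentioned above."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    ratio = used_bytes / total_bytes
    if ratio >= critical_ratio:
        return "critical"  # time to expand storage or compress images
    if ratio >= warn_ratio:
        return "warn"
    return "ok"


def cache_meets_benchmark(latencies_ms: list[float], benchmark_ms: float) -> bool:
    """Return True when the mean cache latency stays within the benchmark.

    The benchmark value is an assumption; it would come from whatever
    load testing the team runs beforehand.
    """
    if not latencies_ms:
        return True  # no samples yet, nothing to flag
    return mean(latencies_ms) <= benchmark_ms
```

[A real setup would more likely express these as rules in a monitoring system than as application code; the sketch only shows the comparisons involved.]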
Adequate Goose: Okay, that sounds good. I think we covered quite a bit today in the interview. I can provide some feedback right now. So I think this time you got the flow of how the interview should be. You were able to ask questions about what exactly needs to be built, you were good at taking the hints that the interviewer was sharing with you, you were able to steer yourself in the direction that is required, and you were able to go through the flow quite well. The improvement that I could see is that you could describe a little bit more about the entity relationships, how these different databases are actually correlated. It doesn't have to be very elaborate, but maybe add an example of, hey, we will have an article database, and so on, because we spoke a lot about the availability of the database, but we did not really speak about what the database should actually contain and how the different databases are connected with each other. So it would be good to address that. Also, you did a good job with the diagram where you showed the client app and the load balancers. I think caching is something which you can read a little bit more about, because there are different kinds of caching. What you described right now is cache aside, where we are doing simultaneous reads and writes to the cache and the database. But yeah, I think just understanding where all the cache could be placed would be super helpful here. And for the database, MySQL works when it is like that. But typically in applications like Instagram or Airbnb, Cassandra is usually the choice of database. When I was looking at how Airbnb stores its things, Cassandra is the most popular choice. So you can look into why Cassandra, and how it is better than MySQL for this particular thing. I think the same question correlates with Airbnb, in that they also have this, right? 
You choose a location, and say you want only a house, and whatever filters you put, and then you see all those different homes available, and as people are booking the homes, that feed gets refreshed. Right. So you can correlate it to that. And what else? You did really well with the SRE topics. I think your day-to-day job probably makes you exercise that aspect quite a bit. So you did a good job in capturing high availability, latency, performance, throughput and resource utilization. I think one thing that I was looking for is how you will calculate the thresholds, right? That could be a nice-to-have to address, because that's a very important part of your job. Right?
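[Editor's note: for readers following up on the caching advice, here is a minimal sketch of the cache-aside pattern as it is usually defined: the application checks the cache first, falls back to the database on a miss, and then populates the cache. Writing to the cache and the database together is more commonly called write-through. The class name, the dictionary-backed "database", and the invalidate-on-write choice are all assumptions made for illustration.]

```python
class CacheAsideStore:
    """Toy cache-aside wrapper around a dictionary standing in for the database."""

    def __init__(self, database: dict):
        self.database = database
        self.cache = {}
        self.db_reads = 0  # counts how often a read had to hit the database

    def get(self, key):
        # 1. Try the cache first.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, read from the database and populate the cache.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write to the database, then invalidate the cached copy so the
        # next read reloads fresh data.
        self.database[key] = value
        self.cache.pop(key, None)
```

[Invalidating on write keeps the sketch simple; a production cache would also need expiry, since RSS sources change independently of user writes.]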
Aerodynamic Tornado: Sorry, what was the second thing you said? How to calculate the threshold? Yeah.
Adequate Goose: So how would you decide what the threshold is, right? You did say some numbers, such that if the page doesn't load within 2 seconds, then we will add it over there and then we will alert, right? So that alerting piece was something that would have been very useful to add. Okay, yeah, I think you did quite well. But I would focus on a few of these things. I think the interviewer is there to help you. So just listen very keenly to the hints that they are providing, because they want to have a discussion with you. So if you could lead some of the topics rather than asking me questions. For example, when I asked you about how you will keep it available and all of that, you led that whole conversation. Right. I wasn't interfering in that. So if you could bring that to the rest of the interview without asking clarifying questions. You asked clarifying questions at the beginning, but, for example, while you were drawing this block diagram, you could have led a little bit more instead of looking for affirmation from my side.
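[Editor's note: one possible way to answer the "how would you calculate the thresholds" question is to derive the alert threshold from observed baseline latency instead of picking a number by hand. The sketch below is an assumption, not something proposed in the interview: it takes the 99th percentile of a baseline window, adds headroom, and caps the result at the 2-second page-load figure mentioned above. The helper names, the 1.2 headroom factor, and the nearest-rank percentile method are all illustrative.]

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # ceiling of pct/100 * n
    return ordered[max(rank, 1) - 1]


def derive_threshold_ms(baseline_ms: list[float], pct: int = 99,
                        headroom: float = 1.2,
                        ceiling_ms: float = 2000.0) -> float:
    """Baseline percentile plus headroom, never above the hard ceiling."""
    return min(ceiling_ms, percentile(baseline_ms, pct) * headroom)


def should_alert(recent_ms: list[float], threshold_ms: float, pct: int = 95) -> bool:
    """Alert when the recent window's percentile exceeds the threshold."""
    return percentile(recent_ms, pct) > threshold_ms
```

[Comparing a percentile of the recent window, and not a single request, keeps one slow page load from paging someone.]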
Aerodynamic Tornado: Okay.
Adequate Goose: Just like for the SRE topics, you led the complete discussion. You took it everywhere. Right. So I would assume that for senior, that's what they would be expecting at different stages.
Aerodynamic Tornado: Okay, so just to be clear then, so you asked me questions about the database. What would happen if the app is trying to write to the database and it fails? So are you saying that I should cover or talk about database replication and all that stuff while I am or as I'm designing the high level design?
Adequate Goose: No, I think that you can speak about it when asked, just like you did today. Okay. It can also get very overwhelming for the interviewer if you're providing a lot of information up front. The way you spoke and drew the diagram was good. Also, one thing that you could add is that between the RSS feed service and the database, you can have a pub/sub, a publisher-subscriber, which stores all the requests. So in case something goes wrong, you'll know exactly where the user left off. So then you can retry.
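[Editor's note: the retry idea above can be sketched with an in-memory queue standing in for the pub/sub layer. This is a toy under stated assumptions: the `WriteQueue` class, the `max_attempts` limit, and the dead-letter list are hypothetical, and a real system would use a durable message broker and not a Python deque.]

```python
from collections import deque
from typing import Callable


class WriteQueue:
    """Buffers write requests so that failed database writes can be retried."""

    def __init__(self, max_attempts: int = 3):
        self._pending = deque()  # (message, attempts so far)
        self.dead_letters = []   # messages that exhausted their retries
        self.max_attempts = max_attempts

    def publish(self, message: dict) -> None:
        self._pending.append((message, 0))

    def drain(self, write: Callable[[dict], None]) -> int:
        """Deliver every pending message; return how many were written."""
        delivered = 0
        while self._pending:
            message, attempts = self._pending.popleft()
            try:
                write(message)
                delivered += 1
            except Exception:
                attempts += 1
                if attempts < self.max_attempts:
                    # Put the request back so it is not lost.
                    self._pending.append((message, attempts))
                else:
                    self.dead_letters.append(message)
        return delivered
```

[Because the request stays in the queue until the write succeeds, a failed database write does not lose the user's change.]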
Aerodynamic Tornado: Right, okay. So sorry, just to follow up on your point where you said that for a senior I should have led more of the conversation. Is there like an example you can give about that where I didn't meet the mark?
Adequate Goose: Yeah. So let's toggle back to the plain text. Right. I think in the block diagram you were leading the way, except for the caching part, which I think you got the hang of once I was hinting you in that direction. You got that, but besides that, you led the way. So for example, in the API design aspect of it, I think I was providing a lot more hints. For example, the first thing that the person does is log in. Right? So for the login, you could have explained it in such a way that, okay, the first API that we will have is user login. Right. So this takes the user ID and password or phone number. And then the response of this could just be a boolean. And then the second thing would be that they select the news sources. And then the third step would be, once they select the news sources, they will get the news articles. So just the flow of how that API flows through, and then in that okay. Okay.
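[Editor's note: the three-step API flow described above (log in, select sources, get the articles) might be sketched as follows. Everything here is a simplified assumption: the class and method names are hypothetical, data lives in dictionaries, and passwords are compared in plain text purely to keep the example short. The limit of ten sources comes from the problem statement.]

```python
MAX_SOURCES = 10  # limit given in the problem statement


class NewsFeedApi:
    """In-memory stand-in for the three calls: login, select sources, get feed."""

    def __init__(self, credentials: dict, articles_by_source: dict):
        self._credentials = credentials            # user_id -> password (toy only)
        self._articles_by_source = articles_by_source
        self._subscriptions = {}                   # user_id -> list of sources

    def login(self, user_id: str, password: str) -> bool:
        # Step 1: the response is just a boolean, as suggested above.
        return user_id in self._credentials and self._credentials[user_id] == password

    def select_sources(self, user_id: str, sources: list) -> None:
        # Step 2: store the user's choices, enforcing the ten-source limit.
        unique = list(dict.fromkeys(sources))  # drop duplicates, keep order
        if len(unique) > MAX_SOURCES:
            raise ValueError(f"at most {MAX_SOURCES} sources allowed")
        self._subscriptions[user_id] = unique

    def get_feed(self, user_id: str) -> list:
        # Step 3: return the articles from every subscribed source.
        feed = []
        for source in self._subscriptions.get(user_id, []):
            feed.extend(self._articles_by_source.get(source, []))
        return feed
```

[Walking through the calls in this order during the interview makes the request and response of each API explicit before moving on to the block diagram.]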
Aerodynamic Tornado: Makes sense. Yeah. Okay. And I think overall, I probably went slower than I should have. Did you think that there was, like, one area I should have spent less time and maybe moved on to the next topic or the next point?
Adequate Goose: Yeah. So I think with the clarifying questions, right, if you were to time box that. Of all the different sections, I think you spent the most time collecting the functional and non-functional requirements and questions at the beginning. As you saw, even while we progressed, there were questions we kept discussing. You were able to get your clarifying questions out of the way while getting to the next steps, especially the block diagram. I think it unblocked a lot of requirements right there. So I think time box it to maybe, like, three minutes if possible, because usually we always go over time in the interview. So while practicing, if you can time box the clarifying questions to three minutes, and then assume that in the actual interview it will go to five minutes or so, that would be good. Because the main focus you want to have is on your specialty, which is getting to the diagram quickly and getting to answering questions quickly, because you want to steer the interviewer to your strongest point. And your strongest point, from what I observed today, was that the points flowed to you effortlessly when you spoke about high availability, latency and, you know, everything. You sounded very confident. So you want to steer the interviewer in that direction quicker so you have more time to spend on it.
Aerodynamic Tornado: Okay. Makes sense. Makes sense. Yeah. I think this was a good session, and I really appreciate your feedback. That is really helpful, and I'll keep that in mind and keep practicing.
Adequate Goose: Yeah. I will share this feedback in writing as well so that you can reference it back. And then for the next interview, go ahead and add those. Even practice it yourself a few times.
Aerodynamic Tornado: Yeah, I'll do that. Thank you so much.
Adequate Goose: Yeah, most welcome. Good luck for your interview.
Aerodynamic Tornado: Thank you so much. Have a good night.
Adequate Goose: You too. Bye.