A System Design interview with a Meta/Facebook engineer

Watch someone solve the Facebook Timeline Service problem in an interview with a Meta/Facebook engineer and see the feedback their interviewer left them. Explore this problem and others in our library of interview replays.

Interview Summary

Problem type

Facebook Timeline Service

Interview question

Design the backend for Facebook timeline service.

Interview Feedback

Feedback about Epic Ibex (the interviewee)

Advance this person to the next round?
Yes
How were their technical skills?
How was their problem solving ability?
What about their communication ability?
Overall: TC did almost everything (requirement gathering, API design, estimations, high level design, trade-off analysis) perfectly.
Strengths:
+ TC gathered the functional requirements well:
  - TC clarified if they need to provide functionality of making friends.
  - TC clarified if they need to provide functionality of looking at a specific user's posts.
  - TC clarified that they would design for the different surfaces/devices.
+ TC gathered the non-functional requirements well:
  - TC talked about the trade-off between availability and consistency.
  - TC talked about durability.
  - TC talked about the latency.
+ TC did the QPS estimations.
+ TC did the API design well.
+ TC came up with the right trade-offs between the push and pull model.
+ TC was able to complete the overall high level design.
+ TC came up with the high level components.
+ TC did well in database schema design.
Improvement Areas:
- TC could have done the storage estimation and API discussion prior to the high level discussion.
- TC could have done more event-driven design.
General Read:
1. CQRS architecture.
2. Event Driven Architecture.

Feedback about Red Maelstrom (the interviewer)

Would you want to work with this person?
Yes
How excited would you be to work with them?
How good were the questions?
How helpful was your interviewer in guiding you to the solution(s)?
Thanks for very actionable and useful feedback. Very helpful for my upcoming interviews.

Interview Transcript

Red Maelstrom: So we'll do the classic back end for a Facebook timeline kind of service. So basically, a user can post things and the different users can see the posts from the other users whom they follow.
Epic Ibex: So just the timeline is what we are concerned with. Right, right. As a Facebook user there are two types of user, one set of users. As a user I could post, and then people that I'm friends with end up seeing those posts on their timeline. Right, okay. Do we want to care about following or friending people, or do we assume those services?
Red Maelstrom: Yeah, that's a good question. You can assume such a service already exists.
Epic Ibex: Okay. They're all the same users. But there are two pieces to this timeline that I'm thinking of. One is posting, like creating a new post, and the other one is a passive user that opens their home page timeline and then sees a bunch of stuff. Do we care about interaction with a given post, like comments or likes, things like that?
Red Maelstrom: That's a good question. Maybe as a follow up, once we are done with the basic functionality we can think about it.
Epic Ibex: Okay, yeah, that sounds good. So I'll start noting down some of the requirements. A user could create a post and all their friends see that on their timeline. There could be a delay, but yeah, we want to keep it as real time as possible. So do posts include just the text, or do we want to handle multimedia also?
Red Maelstrom: That's again a good question. Yeah, same thing. Start with the text, and maybe once we are done with the basic thing we can discuss how to introduce multimedia.
Epic Ibex: Okay. So I'm assuming people are coming from different platforms like desktop, mobile and things like that. So we don't have to design for a specific thing. What else is on the timeline? Do we care about how posts are ranked?
Is it like chronological, or do we try to...
Red Maelstrom: You can assume there is a black box service called Ranking which will rank the posts for you.
Epic Ibex: Okay, so if I'm a user with this, I could create a post, and if I'm the other user, I would open my timeline and I expect to see something. Okay. Do you want to see anything other than this? These seem like...
Red Maelstrom: I think these are a good set to start with. These are a good set of requirements to start with.
Epic Ibex: So number of users, how many are we talking about here? Like daily active users? Yeah.
Red Maelstrom: So daily active users are in terms of like 50 million active users.
Epic Ibex: 50 million active users. So I'm assuming there's like one post on average per daily active user per day. Is that okay?
Red Maelstrom: 2.5 or something? Yeah, 2.5.
Epic Ibex: Okay. Let's round to two just to have a number. Then is there any limit on how big a post could be? Can I put one MB of text into my post? Hello? Can you hear me? Hey, can you hear me?
Red Maelstrom: Hey, I can hear you. Sorry, I lost my connection.
Epic Ibex: Yeah, no worries. So I added a couple of questions in the meanwhile. So for the post size, is there like a system limit that we want to enforce?
Red Maelstrom: Yeah, that's a good question. Again.
Epic Ibex: Yeah.
Red Maelstrom: One KB. Let's assume that it should be less than one KB.
Epic Ibex: Okay, so post size is like max one KB of text. For simplicity, I'm assuming this is ASCII text, just English, so I don't have to worry about Unicode and the number of bytes per character. How many friends would a user have on average, so that I know?
Red Maelstrom: Yeah, maybe 200.
Epic Ibex: 200 friends per user. So there would definitely be some delay between me as a user posting something and then my friends seeing it. Do we want to provide any sort of rough SLA for that?
Red Maelstrom: No, we will try to be as real time as possible.
We don't have any hard SLA.
Epic Ibex: Okay. I'll just say as quick as we could. Reasonably quick. Okay, I see. Kind of one more question. So I know how many users, I know how many friends one might have. Do posts expire? So I create a post and it appears on someone's timeline. Can they keep scrolling to years and years back? Do we keep all the posts forever?
Red Maelstrom: That's a good question. Maybe we would like to keep it for the past three years.
Epic Ibex: Okay, got it. Is there anything that's important that you want to see in this particular system? Like any guarantees?
Red Maelstrom: Yeah, so basically we would like to focus more on availability.
Epic Ibex: Okay. Yeah, definitely. As a user, it makes sense that I get to see something rather than seeing very accurate things. So consistency is less important for me. I see. Okay. So just to get a rough idea of how much data we are storing. So we said two posts per user, 50 million daily active users. So that's like 100 million posts a day, times, we said, one KB, right? So that's like 100 GB worth of data a day, like post storage, and it's all text for now. So that's what we are looking at. And for this sort of a system I would imagine the traffic is sort of spiky. What I mean is there are a lot fewer people browsing, say, in the middle of the day while at work, or at 2:00 a.m. or so. Most of the traffic would either come during commute hours or during the evening when people get back home. Even though we have so many users, I'm thinking the traffic would be very spiky on the creation side. So there's like 100 million posts that are getting created, right? So if it was all even traffic distribution, that is like 1,150 posts per second, but I'm assuming ten x in spiky times. Sorry, go ahead.
Red Maelstrom: Yeah, rather than ten x, let's assume three x, so the spike is like three x.
Epic Ibex: Okay, so it would be something like 3.5K, rounded up, posts per second at the spike.
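As a rough check of the numbers quoted above, the arithmetic can be sketched in a few lines. The inputs are the figures agreed in the conversation; treating one KB as 1,000 bytes and rounding 2.5 posts down to 2 are simplifying assumptions.

```python
# Back-of-envelope write-path estimate using the figures from the discussion.
DAILY_ACTIVE_USERS = 50_000_000
POSTS_PER_USER_PER_DAY = 2      # rounded from the interviewer's 2.5
MAX_POST_BYTES = 1_000          # "one KB", assumed to mean 1,000 bytes
PEAK_FACTOR = 3                 # spike multiplier suggested by the interviewer
SECONDS_PER_DAY = 24 * 60 * 60

posts_per_day = DAILY_ACTIVE_USERS * POSTS_PER_USER_PER_DAY
storage_per_day_gb = posts_per_day * MAX_POST_BYTES / 1e9
avg_posts_per_sec = posts_per_day / SECONDS_PER_DAY
peak_posts_per_sec = avg_posts_per_sec * PEAK_FACTOR

print(f"posts/day:        {posts_per_day:,}")
print(f"storage/day (GB): {storage_per_day_gb:.0f}")
print(f"avg posts/sec:    {avg_posts_per_sec:.0f}")
print(f"peak posts/sec:   {peak_posts_per_sec:.0f}")
```

The even-distribution rate comes out near 1,160 posts per second, so the spoken "1,150" and "3.5K at three x" are reasonable roundings.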
And this is like, as a user I would post two posts but I would probably see like 100 posts. So this is a very read heavy system, because most of the time people are reading a lot more than creating new posts. So we definitely want to probably cache a lot of these posts to serve them, especially given that consistency is not that important for us. Okay, so this is what I'm thinking. Spiky, we know, sort of on the read path, and we did say spiky on the post path, right? So on the read path it would look something like... Do we want to support people going to someone's profile and then seeing just their posts? Or are we only caring about creating our timeline?
Red Maelstrom: That's important. Yeah, just this timeline.
Epic Ibex: Okay, so we only care about just reading the timeline. So for now, people can't query a random user's post history. So we are only caring about the timeline, and this would be n number of posts that need to be read. So it is some sort of a paginated API that people keep calling, and we need to get the timeline quickly. Okay, I'll start with this and then see if I need to add something more. We definitely want to think about metrics that we want to track, but I'll get to that a little later. Let's see. So thinking about the write path and the read path, I'm thinking there should be something like a post service, so this would own all the new post creation and referencing any posts that we would need. And since we said it is very read heavy, we want to separate out the timeline, because consistency is not very important for us, which means we could probably pre compute a lot of timelines beforehand so that we could quickly serve the timeline and give the user a better experience. I could create a post and it might go sit for a few seconds or a minute, I wouldn't worry, but when I open my app or my account I want to see something on my timeline quickly. So this is how it would look. Right, so from my end user, they are basically doing some sort of a post call.
I don't see anything special with this. This could be just some REST call where you are creating a post, and when you are reading the timeline you basically make a read call and then get your timeline one page at a time for scrolling through. So that means they could come through some sort of a gateway, and since I don't see anything other than REST calls here, I would probably offload a lot of heavy lifting to it. I would probably use something like an application gateway, like L7. I don't need any sort of network, like L4, type of a thing, which means it could look at my HTTP headers, could do a traffic shaping kind of a thing, and logs and metrics could come for free.
Red Maelstrom: What do you mean by traffic shaping? Okay, understood.
Epic Ibex: Okay, yeah, so it could know the user's identity also, looking at the headers, and then could do intelligent things, like decisions could be offloaded to it. So this is how it looks. Right, so the first flow is creating a post. Obviously we are generating like 100 GB of text a day, which is quite a lot that we want to persist, so we need some sort of a database, but yeah, a single database would run out quickly, so it needs to be sharded somehow. Right now we are not worrying about querying posts by a particular user, so it could be any sort of a database. We don't have a table wide scan type of use case for now. Even in the future, if you query by user, it is very limited to a user level query. Right. Since availability is more important and we don't care too much about consistency, it doesn't have to be relational, but I'm wondering if that would make our life easier. So I don't have any strong preference for the DB. It could be some sort of a NoSQL store or even relational. Just to start with, I'll probably stick to a sharded relational database to see if there are more use cases in the future. But it's a very weak requirement. So even if it's a relational database, I would probably have to shard it with something.
Maybe all of a single user's posts go to one shard, but yeah, that could technically create... One reason I'm a little worried is, if a user's average could be two posts per day, so a user is creating continuously, would that cause an issue? Probably not. We might have some sort of a quota too. So yeah, I would probably go with a relational database. It is not a very strong requirement, but I'll go with it for now to probably make my life easier with additional requirements. But I would definitely be sharding by user ID. So the way it would work would be, since I said I would go with REST, it would look something like: I would get a body of the post and it would return some sort of post ID. I don't know if that would be really required. I mean, it would create a new entry in the database with this post data. I'm not putting that anywhere right now. Yeah, I'll see what needs to be done, because for the timeline we were thinking of pre populating, like pre computing, these posts, right? So currently the database is basically acting as a persistent store for my post history, and maybe I might be looking up post IDs through my post service. So the post service, on a post call, creating a post, goes and creates an entry in the database and the user is done. And after that...
Red Maelstrom: How does the post API look? What would be the input parameters in it?
Epic Ibex: Yeah, so the body would be... it could be pretty simple, right? So a user creates a post that is not targeted towards any single person. There's no receiver. So they're just creating their own post. They're already authenticated. So all we need is probably just the body with the text. There might be some sort of metadata that we want to track, like which client device they use, things like that.
But the most important thing is just the body.
Red Maelstrom: Understood.
Epic Ibex: We could probably timestamp it on the service side too, just to keep the service from being gameable. Like the post date could all be derived on the API side, on the service side. So the bare minimum is just the body. We might get some sort of metadata too, just for our metrics, things like that. Like on Twitter it says tweeted by iOS, device location, things like that. But that's not mandatory apart from the body. So it's probably the body. We make an entry. On the read side, we definitely don't want to go find all the posts created by my friends at runtime, because it would be a cluster wide query and you need to aggregate it somehow. So that's very complicated. But we also said it doesn't need to be consistent. I mean, my friends don't have to see my post for the next minute. That's okay with us. So this service could also make an entry into some sort of a queue. This is basically like an asynchronous job queue. It would basically say, hey, I created a post with this post ID. It could just be that. In that case we would talk to the post service for reading the post. Do we allow users to go edit their post, or do we not?
Red Maelstrom: We don't. Don't worry about anything for this.
Epic Ibex: Okay. For now I want to keep it simple. So all it is doing is it makes an entry into some sort of asynchronous queue. Like we said, availability is more important, consistency is not so important. We could do some sort of ordering even on the timeline service or on the client side using the timestamp. But what I mean is, if two of my friends post, if the second one appears first on my timeline and is quickly followed by the first one who posted, it's probably not a huge deal, right? Because posts are their own entities, they are not like comment threads.
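The write path described so far (accept only a body, stamp the time on the server, persist the post, then drop a small message on an asynchronous queue) could be sketched roughly as follows. This is an in-memory illustration only: `PostService`, `create_post`, the dictionary standing in for the sharded database, and the `author_id` and `created_at` fields on the queue message are all hypothetical names chosen for the sketch.

```python
import itertools
import queue
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author_id: int
    body: str
    created_at: float  # stamped by the service, never taken from the client


class PostService:
    MAX_BODY_BYTES = 1_000  # the "one KB" limit from the requirements

    def __init__(self, fanout_queue):
        self._posts = {}                 # stand-in for the sharded post DB
        self._ids = itertools.count(1)   # stand-in for real ID generation
        self._queue = fanout_queue

    def create_post(self, author_id, body):
        if len(body.encode("utf-8")) > self.MAX_BODY_BYTES:
            raise ValueError("post body exceeds the 1 KB limit")
        post = Post(next(self._ids), author_id, body, time.time())
        self._posts[post.post_id] = post  # persist first
        # Then enqueue a small message; workers do not need the body.
        self._queue.put({"post_id": post.post_id,
                         "author_id": author_id,
                         "created_at": post.created_at})
        return post.post_id

    def get_posts(self, post_ids):
        """Batch read, skipping IDs that are unknown."""
        return [self._posts[p] for p in post_ids if p in self._posts]
```

A caller would construct it with `PostService(queue.Queue())` and receive the new post ID back from `create_post`.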
So for the queue, we only need a guarantee of maybe guaranteed delivery, at least once. But no ordering guarantee is required. Which means the requirements on our queue are not very strict. So we have a bunch of workers on the other side. What they do is they look at a post. I could put a body here. So the body could be something like, I have a post ID of this, because they don't care what the content is. It could just be that and the author ID, something like that, who posted it, because I want to populate this post in my friends' timelines. Right. So I think we would need at least this amount of information in my queue, and we try to keep this queue as current as possible, which means if there's a huge backlog, we could independently scale on the worker side and then try to pull these posts quickly. But that probably has effects on the other side. So we need to know who is friends with whom, right? So we need something like a friend service. So the query would be like this: I give someone's name and then I ask them. Yeah. So for pre computing the timeline I could do it in two ways. It could be a pull model or it could be a push model. So going towards the push model, which is, I create a post, it fans out into all my friends' timelines. That's one way of doing it. The other way would be a pull model, where for every active user I would go ask for posts from every one of my friends. Something like that. Let's see. So do we expect to have orders of magnitude more users on our platform compared to our daily active users?
Red Maelstrom: No, there won't be any such case, but there can be a case that some user might have a lot of friends or a lot of people they are connected to.
Epic Ibex: Right? Yeah.
Red Maelstrom: Than the average.
Epic Ibex: Yeah, definitely. Right. Like celebrities.
Like the other one I was thinking about was, if we have 50 million daily active users but our user base is like 500 million users, I probably care about creating timelines for the 50 million users quite quickly, and the other 450 million users could have timelines, but very stale timelines that we try to populate when they log in. Makes sense?
Red Maelstrom: Yeah, that's a very good optimization.
Epic Ibex: Yeah, we could go with that. Which means the post service needs to support sort of a user ID based batch call. Okay, yeah, I will avoid this queue and then I'll try to go with the pull model because that might be a bit more efficient, like, by my intuition. So yeah, what we would do is we have a bunch of workers that would periodically... maybe they are working on a partition of users. Like if you partition our users into different buckets, they're working on their own bucket, and we always prioritize daily active users, not all users. Right. So in that case they would iterate over a bunch of daily active users. They ask the friend service for the list of people that they are friends with. Right. So for every user they would get a number of users and we make some sort of a batch query to the post service. We get all of their posts. The workers would merge those and create a newly created timeline and then push it to a timeline service. So before doing that it probably has to ask what was the last... so if you consider the timeline as sort of like a continuous log, you'd ask what was the last point? And then you query the post service saying, these are all my friends, give me all the posts created after this point in time.
Red Maelstrom: Understood. Understood. So there would be some frequency at which these workers query the friend service and the post service.
Epic Ibex: Yeah, if I go with the pull model that's how I'm envisioning it. The workers would have two sets of users.
Two sets of users, two sets of pools. One is like a low latency one which is only looking at daily active users. The other one is iterating very slowly, looking at all the users. So we do create some sort of a timeline even for inactive users, but we always focus on daily active users. Right, so that's one way of doing it. This is sort of like a pull model. So the worker is doing most of the heavy lifting, where they go over a set of users, ask the friend service for their friends, ask the timeline... or maybe, since they are doing this work, they could have that checkpoint saying where the previous timeline was. So they have some sort of a checkpoint DB. All it would be is a timestamp for a given user ID. If they don't find it, they could ask the timeline service for that information, probably. So for every user they would go do a batch call, a single batch call probably, to the post service. But that would fan out to the entire set of database shards. Right? Yeah, that could be problematic too. So I do want to optimize for our daily active users, but I also don't want to have cluster wide queries on the DB. Right, okay. I'll go back to the push model. For simplicity. It does feel like the pull model would be more efficient, but it would be a lot more challenging. At least if we think about this as an MVP it would be a lot more challenging to do that, and the database would start becoming a big bottleneck. Yeah. So going back: the post service creates a persistent entry, puts just a post ID and the user ID on the queue, and the worker starts pulling it. And for every entry it pulls, it asks the friend service who this user is friends with. It gets, in one call, a list of users. Some users could have quite a lot of friends, which is possible. We will handle it a little later. So for now we're assuming it is manageable. So we ask for the set of friends. So we duplicate this post ID.
We basically create tuples saying this end user gets to see this post ID, and then it tries to build this and then basically report back to the timeline service. And the way the timeline service should look is, we are optimizing it on the read path, right? Like it is a read heavy system. We want the user to see something as soon as they log in. We don't want query latency or anything. And a timeline is nothing but a log file. It has a bunch of entries that users go through. So that means my database schema is basically, you can think of it like a queue per user, it's basically like a log per user or a queue per user, where every user ID has a list of post IDs.
Red Maelstrom: So you will create this for only the active users. Right, so the people who are active.
Epic Ibex: No, I do want to create it for every user. So I wanted to do sort of a pull model. The reason being, even if a user is not active, we don't want to give them a bad experience that would make them not visit our platform. So we want to show them something, but we probably want to see if we can maybe store less stuff on their timeline or do some sort of optimization, sacrificing their timeline. Right, but for now, no, everyone gets their timeline because we are doing a push sort of timeline building strategy. So a non active user might be friends with someone who created a post, and that author would have both active and inactive friends. Which means the worker would do the same work and build both active and non active users' timelines. But the timeline service could be intelligent to reduce its storage on its side. So for now we are assuming, let's say, it's all chronological posts. We'll add ranking a little later, probably into the timeline service.
But for now the worker would do this sort of a fan out, a given post fanned out to all their friends, and then report to the timeline service, and the entry would be... it's basically asking the timeline service to sort of do an append to a list type of thing. Right? I say, for this user ID one, append these ten posts to their timeline, something like that.
Red Maelstrom: Okay, so the user posts something, and the post service will write into the database and will also write into the queue.
Epic Ibex: Yes.
Red Maelstrom: And from the queue the worker finds out which user posted what, and what the worker does is query the friend service.
Epic Ibex: Yeah. So the message would look something like post ID and the author, right? So it has to do a query to the friend service asking who this particular author is friends with, because this post should be duplicated or added to all their friends' timelines. Right? So for a given author's post, that query would result in a number of friend IDs. So the worker would basically report back to the timeline service in some sort of a batch call saying... the append will be like an add, so for user ID one add this set of posts. It might have one or more posts depending on how often we run. If the worker is running once an hour, it's more likely that every user will have more than one post. But if we are running quickly, like if we are running in real time pulling from the queue, it's most likely going to be, for the author, each of their friends would probably see this one post in the call made to the timeline service. But it could be a batch call with multiple user append operations being reported. So we don't end up doing millions of calls. Right, so it makes a call to the timeline service. Even if we have a failure, we are reporting post IDs. Right? So the timeline service could sort of do this deduplication.
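The push-model fan-out with a retry-safe batch append might look roughly like the sketch below. It is not the candidate's implementation: `TimelineService`, `append_batch`, `fan_out`, the `friends_of` callback (standing in for the black-box friend service) and the `created_at` field on the queue message are hypothetical. The point it illustrates is that, with at-least-once delivery, keying each timeline by post ID makes a repeated append harmless.

```python
from collections import defaultdict


class TimelineService:
    def __init__(self):
        # user_id -> {post_id: timestamp}; keying by post_id is what makes
        # a retried append a no-op.
        self._timelines = defaultdict(dict)

    def append_batch(self, entries):
        """Append (user_id, post_id, timestamp) tuples; safe to retry.

        Returns how many entries were actually new.
        """
        added = 0
        for user_id, post_id, timestamp in entries:
            timeline = self._timelines[user_id]
            if post_id not in timeline:
                timeline[post_id] = timestamp
                added += 1
        return added

    def post_ids(self, user_id):
        """Post IDs for one user, newest first."""
        timeline = self._timelines.get(user_id, {})
        return sorted(timeline, key=timeline.get, reverse=True)


def fan_out(message, friends_of, timeline_service):
    """Worker step: one queue message becomes one batch append call."""
    entries = [
        (friend_id, message["post_id"], message["created_at"])
        for friend_id in friends_of(message["author_id"])
    ]
    return timeline_service.append_batch(entries)
```

If the worker crashes after a partial update and the message is redelivered, calling `fan_out` again only adds the entries that were missing.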
Let's say we made a call to the timeline service with, like, a post getting fanned out to 100 users, and the call failed or some sort of an error happened on the timeline service. It updated 50 users but not the other 50. The worker will retry the call, but the timeline service would realize when it is trying to update that, oh, I already have it. Or it could add to the queue, but the client could be intelligent and not show a duplicate post item, something like that.
Red Maelstrom: Got it. So how does your API for the timeline service look, the API from the user's point of view, how does your data look in the friend service, and how does your data look in your post DB?
Epic Ibex: Yeah, so I'll go one by one. Friend service, right, so this is a black box right now. So the calls are going to be, who is this author friends with? If we have some sort of a recommendation, we also probably start doing, who is friends with my friends, things like that. So it's all relationship based queries. So the entire thing is like a graph. So we could use some sort of a graph DB, or at least the schema should be some sort of a graph. Like there are very well known, off the shelf, well tested graph DBs. So we should probably use something like that, where the queries about relationships are very efficient. We're not doing lookups and then trying to build this relationship from a tabular database. So I'll probably lean towards something like a graph database, which would give us the flexibility of having second level, third level relationship sort of queries. Yeah. So it would be some sort of a graph database. On the post DB it is very tabular. So every entry is basically a post ID, the content, and some sort of metadata that we probably need to keep. Right, so it's very tabular, there's nothing abnormal about it.
Yeah, post ID, and then we also need the author, something like that.
Red Maelstrom: Do you also need a timestamp?
Epic Ibex: Yeah, I meant like the metadata has all sorts...
Red Maelstrom: Okay, in metadata.
Epic Ibex: Okay, yeah, the timestamp is important. So yeah, I would also have a whole bunch of metadata in here, like saying where the post was made from, things like that. So this is very tabular data. Right. So it could be a relational database. But as I said earlier, we don't need consistency guarantees. So if that ends up being a scaling bottleneck, NoSQL stores would work too, pretty easily. So if you go with some sort of a NoSQL store that is like a document store, you can view this as a document also, like a JSON document. Not very nested, at least for now, with all these fields. And on the timeline service, let's see. We want it to be pre computed. We want to add new posts to it, but probably drop old posts from it. Right. And it should be able to answer paginated queries. Like the user would likely be scrolling down a few days, which means the API should support saying, give me the timeline starting from probably this timestamp. And we could have a list of post IDs. It need not be the posts themselves. We don't have to duplicate the data. So it would be like a list of posts where we append on one end, then remove from the other end. Do we need to remove? Maybe for the storage. We probably want to remove old ones and keep adding the new ones. Yeah, so that's how it should look. Let me write the schema here. So it would definitely be a GET call, something like get timeline. We know who the user is. You can't query any other user's timeline, so we don't have to have that as a URL
parameter. I'll write it on the slide over here. So it would be a GET call on an endpoint, something like a timeline, and we need to have an optional parameter saying last timestamp, or some sort of a timestamp, or "from" would probably be more readable, right? So we would have an optional parameter that says give me from this timestamp.
Red Maelstrom: Understood.
Epic Ibex: Which would point to... it could be a timestamp or it could be some sort of an index that the timeline tells us. Like whenever it gives a list of post IDs, each one could have an abstract ID. Maybe a timestamp makes a lot more sense, because the client can do some sort of ordering if the timeline is not doing strict ordering. So the timeline then is storing post ID and timestamp too, which means it should be able to quickly look up. If you assume this is a list, it needs to quickly look up from a given timestamp. Right. This is what I would go with. So we could do this in multiple ways. So if we end up using some sort of a relational database, the schema would be, there'll be a user ID column and then a column for the timestamp and the post ID, and it would be indexed on user ID. So every time we query, it's in the context of a user ID. We say, for this user ID, give me from this timestamp. So you could use a relational database that way. But we don't need availability... not availability, the consistency guarantees. Even if we lose the timeline, we could probably rebuild it quickly with some time delay. There are data stores that support this list sort of semantic. Redis is one of them that I'm aware of, so we could probably use that too. So if you go with Redis there'll be millions of lists keyed by user ID, and it also has a sorted list, so that makes it even easier, right?
So if you go with something like Redis it's very fast, and most of it could be kept in memory because the data is very small, just post ID and timestamp, right? We could keep a lot of things in memory. The query would be, look up this list by the user ID that is reading the timeline, and for the sorted list we would say start from this key, and we would tell Redis to sort using this timestamp.
Red Maelstrom: Got it.
Epic Ibex: Either way, but I would lean towards Redis mainly because it's way cheaper, it's way faster to run reads on, and we have the flexibility that even if we lose it, we could rebuild it rather quickly. Redis also has a way of persisting its data, so it would only lose data for a few seconds depending on how quickly we are persisting the data. So yeah, we would get better trade offs with Redis, so I would probably lean towards that. If we are using Redis, the other advantage is... initially I said workers need to know a checkpoint, like some sort of a DB to know the checkpoint for each user, but that was pull based. Yeah, never mind. So yeah, workers are dumb. They get a post, they fan it out to multiple users, push it to the timeline service and forget about it. Yeah, so this is how my GET call would look, this is how the post would look. On the client side, the flow while creating would be pretty straightforward. You open whatever part of the app to create a post, you type it up, you press send, you make a call to the post service, you forget about it. Next you go to your timeline. If it is a fresh install of the app, or you just open your app, it will make a GET call and say from the current time, or it won't give any time, since I said it's optional. It would just say get timeline and that's it, right? Yeah. Different clients would have different memory requirements, so it could probably have a max, saying return me only ten more. That seems important. So it would say, from here, return me max this much.
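A minimal in-memory sketch of the read path described here: one time-ordered list per user holding only (timestamp, post ID) pairs, queried with the optional `from` and `max` parameters. A Redis sorted set scored by timestamp would play this role in the design discussed; the pure-Python version below only illustrates the semantics. `TimelineStore`, `get_timeline`, and the choice to treat `from` as an exclusive upper bound for scrolling to older posts are assumptions, not something the conversation pins down.

```python
import bisect
from collections import defaultdict


class TimelineStore:
    def __init__(self):
        # user_id -> list of (timestamp, post_id), kept sorted oldest first
        self._entries = defaultdict(list)

    def add(self, user_id, post_id, timestamp):
        entry = (timestamp, post_id)
        entries = self._entries[user_id]
        i = bisect.bisect_left(entries, entry)
        if i == len(entries) or entries[i] != entry:  # ignore exact repeats
            entries.insert(i, entry)

    def get_timeline(self, user_id, from_ts=None, max_items=10):
        """Sketch of GET /timeline?from=<ts>&max=<n>.

        Returns up to max_items (post_id, timestamp) pairs, newest first,
        strictly older than from_ts. With no from_ts, starts at the newest.
        """
        entries = self._entries.get(user_id, [])
        if from_ts is None:
            end = len(entries)
        else:
            # (from_ts,) sorts before any (from_ts, post_id) pair, so this
            # finds the first entry at or after from_ts.
            end = bisect.bisect_left(entries, (from_ts,))
        page = entries[max(0, end - max_items):end]
        return [(post_id, ts) for ts, post_id in reversed(page)]
```

A client would page by passing the timestamp of the last item it received as the next `from_ts`.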
We're not supporting any sort of filter right now, so we would just avoid that. Yeah, so the GET call would look like that; both parameters are optional. You open your app, we make a call to the timeline. Let's say I get ten posts. The client would only get post IDs, right, so it needs to get the post data. It would do a bunch of GET calls to the post service. It could be a batch call too: especially for mobile clients, you don't want to do like 100 GET calls. So it could give a list of post IDs and then receive a list of content. It's a trade-off, basically, of how much memory you want to use on the device versus how many calls you want to make. We could make a batch GET call, basically, so that the timeline service wouldn't carry the content at all. It only knows post IDs and timestamps.
Red Maelstrom: Cool. So let's pause here. I understand the overall idea. Let's pause for the feedback. For a 45-minute interview, this question might have been a little too wide. You did right by talking about the actual functionality rather than talking about distributed systems and scalability. I think the important part is to cover the functionality and then, if time permits, talk about scalability, fault tolerance, and those things. It's also the onus of the interviewer to make sure that he asks those questions if those things are important. So I don't think you would be penalized for not covering scalability, at least for this question in a 45-minute interview. So don't worry about that. But yeah, I really like your functional requirement gathering. I really like the scoping out of whether being friends with a user is part of the design or not. I like your non-functional requirement gathering. I think one extra thing which you could have done is the storage estimation. So you did estimate the traffic, like how much the traffic would be.
But along with that, estimating the storage could also be useful, because then you can create a storage pattern. Like the most recent data would be in memory, a week's worth of data would be in something like DynamoDB or SQL, and then archival data could be in S3 or something, because you won't be using archival data that much. Right, so that architecture could have fit if you analyze the storage: the storage requirement is huge, but then you come up with the observation that only recent posts would be viewed frequently compared to the older posts. So that architecture could fit in, right? Yeah, another suggestion from my side would be, once you are done with the traffic and storage estimations, do try to talk about the APIs first, like high-level APIs, before jumping into the high-level architecture, because then it's easier for you also. Right, so these are the APIs, these are the services which support these APIs. Basically it also brings a structure to the thought process, because you discuss the contracts of your service or your system and then you go and build those contracts by coming up with a high-level architecture, right? Yeah, I think that could be one thing which you can think about. And I'm focusing on what could be improved. I have listed all the positives in the feedback anyway, so I am just focusing on what could have been done better. So that was about the storage estimation. Are you aware of the CQRS architecture, command query responsibility segregation?
Epic Ibex: I know it at a very surface level. Everything is like a stream, right? Like you act on the updates that are coming in, right.
Red Maelstrom: Maybe you can just give it a read. I think that pattern fits here perfectly. That will help you.
I think what you did is very close to CQRS, but if you can just go through some blog or something, you might be able to use similar wording, or you might be able to articulate your solutions better in terms of timeline or some other examples where streaming of data is important. So yeah, my suggestion would be maybe to give it a read. Another small suggestion is that you had multiple choices, right? So either your post service could write into an async queue, or your database can introduce a change log which can be written into queues. The second option makes it more event driven, right, less coupled with your post service. So your post service doesn't need to be aware of the queue or anything. It just takes one update and writes it into the queue, and then that event can trigger the rest of your flow. So it makes your system more event driven. I'm not saying that is a better solution, but that is one of the options you can think about.
Epic Ibex: Yeah, I like that. But for relational databases, I'm not very familiar with whether there exists anything that actually follows the change log and then writes to the queue. Is there something like that? I've never worked with that on AWS.
Red Maelstrom: I think there should be, even in the case of a relational DB. Okay. Those are the only improvement things, I think. The rest of the things you did really well. I like the way you did the API design. You discussed what the important parameters are and what the good-to-have parameters are. Right. You also showed attention to detail by asking whether the user can go back and edit their post. You showed attention to detail throughout your design, right, even during the requirement gathering. I haven't seen any candidate cover these requirements to this extent. You covered pretty much everything which I can think of. So I think for that you should give yourself a pat on the back.
You also came up with high-level components for different high-level operations, like timeline service, user service, and everything. So you talked in terms of services and not in terms of a monolithic block which does everything. So yeah, I think you did pretty much everything well. Apart from that, based on this interview, it would have been a definite hire call.
Epic Ibex: Okay, thanks.
Red Maelstrom: Yeah, you were saying something, sorry to cut you off in between.
Epic Ibex: No, I mean you answered my question.
Red Maelstrom: Okay.
Epic Ibex: Yeah. Next week I have an interview coming up. There is a social network called Nextdoor, I'm not sure if you know it, so I have an interview with them. I'm guessing they will ask questions similar to this one. So thanks for asking this question.
Red Maelstrom: Got it. And which position are you applying for? IC, manager, team lead?
Epic Ibex: So I'm interviewing for a senior engineer role.
Red Maelstrom: Okay, senior engineer, got it.
Epic Ibex: Cool.
Red Maelstrom: I think my only observation is that startups do like people who talk a lot about the business model and things, because you need to wear a lot of hats in startups. So maybe you can spend a couple of minutes on what the important metrics are for us to track, or the growth, or something like how we can hack the growth, those kinds of things. I'm not sure you can talk about these things for every question, but if it comes naturally to you, do talk about those things. Do talk about it the way you did. You said that for a POC I can do this, and then, as a follow-up, I can do the full-fledged solution. Well, not POC. The word you used was MVP. I think you're already thinking in startup terms, like getting things out of the door and then going and expanding them into a more efficient and more scalable solution.
But if you can also talk about business metrics and growth and those things, I think startups do like those questions or that thought process.
Epic Ibex: Yeah, I wanted to apply to Facebook, but I'm not sure if the hiring is going on right now. I would try to apply there, but I don't have an interview at Facebook right now.
Red Maelstrom: They are still hiring for critical roles. I do see they are still hiring for critical roles. But I think the better time to apply would be a couple of months down the line, because there is still some conversation about a second layoff.
Epic Ibex: I was affected by the recent Google layoff, so I need a job and I'm applying everywhere.
Red Maelstrom: Understood, I'm really sorry to hear about that.
Epic Ibex: Hopefully I find something better. So there is that silver lining.
Red Maelstrom: Are you also facing the visa crunch or something? A visa timeline crunch or something.
Epic Ibex: Yeah, I do have it. So that's why I'm not being very selective. I interviewed with a Series B startup which seems very interesting, a very data-intensive product. I almost have an offer, so hopefully that works out. I only have one half-hour meeting with their VP. After that, they said, it's the offer stage, so hopefully I get there. But I'm not being very selective. Who knows how things go, so I'm interviewing everywhere.
Red Maelstrom: I think Uber is hiring extensively across the globe. So if you have any connection at Uber, maybe that's something you can explore.
Epic Ibex: Okay. Yeah, I've not looked at Uber. I'll try to see if I know anyone there. Okay, cool. Awesome.
Red Maelstrom: I will write all those things in the feedback also.
Epic Ibex: Yeah, that would be great. Thank you.
Red Maelstrom: Yeah, bye bye.
